Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Panic on truncating non-ASCII content #167

Open
uasi opened this issue Jan 17, 2025 · 7 comments · May be fixed by #171
Open

[Bug] Panic on truncating non-ASCII content #167

uasi opened this issue Jan 17, 2025 · 7 comments · May be fixed by #171
Labels
t: bug Something isn't working

Comments

@uasi
Copy link

uasi commented Jan 17, 2025

Describe the bug

comfy-table panics when rendering a dynamically-arranged non-ASCII content in a narrow cell.

Steps to reproduce

Complete setup: https://gitlab.com/uasi/20250117-comfy-table-panic

// Rust 1.84.0, comfy-table 7.1.3

fn main() {
    let mut table = comfy_table::Table::new();
    table
        .set_header(vec!["Header1"])
        .set_width(20)
        .add_row(vec!["あいうえおかきくけこさしすせそたちつてと"])
        .set_content_arrangement(comfy_table::ContentArrangement::Dynamic);

    for row in table.row_iter_mut() {
        row.max_height(1); // 2 -> also panics, 3 -> ok
    }

    println!("{table}");
}

Logs (if applicable)

% RUST_BACKTRACE=1 cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.20s
     Running `target/debug/comfy-table-panic`
thread 'main' panicked at /Users/uasi/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/string.rs:1472:13:
assertion failed: self.is_char_boundary(new_len)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:76:14
   2: core::panicking::panic
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:148:5
   3: alloc::string::String::truncate
             at /Users/uasi/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/string.rs:1472:13
   4: comfy_table::utils::formatting::content_format::format_row
             at /Users/uasi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/comfy-table-7.1.3/src/utils/formatting/content_format.rs:126:25
   5: comfy_table::utils::formatting::content_format::format_content
             at /Users/uasi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/comfy-table-7.1.3/src/utils/formatting/content_format.rs:55:28
   6: comfy_table::utils::build_table
             at /Users/uasi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/comfy-table-7.1.3/src/utils/mod.rs:52:19
   7: comfy_table::table::Table::lines
             at /Users/uasi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/comfy-table-7.1.3/src/table.rs:95:9
   8: <comfy_table::table::Table as core::fmt::Display>::fmt
             at /Users/uasi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/comfy-table-7.1.3/src/table.rs:47:25
   9: core::fmt::rt::Argument::fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/fmt/rt.rs:177:76
  10: core::fmt::write
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/fmt/mod.rs:1189:21
  11: std::io::Write::write_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/io/mod.rs:1884:15
  12: <&std::io::stdio::Stdout as std::io::Write>::write_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/io/stdio.rs:792:9
  13: <std::io::stdio::Stdout as std::io::Write>::write_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/io/stdio.rs:766:9
  14: std::io::stdio::print_to
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/io/stdio.rs:1122:21
  15: std::io::stdio::_print
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/io/stdio.rs:1232:5
  16: comfy_table_panic::main
             at ./src/main.rs:15:5
  17: core::ops::function::FnOnce::call_once
             at /Users/uasi/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Operating system

macOS 15.2

Comfy-table version

v7.1.3

Additional context

Related issue #53

himalaya is affected by this bug pimalaya/himalaya#497

@uasi uasi added the t: bug Something isn't working label Jan 17, 2025
@uasi
Copy link
Author

uasi commented Jan 17, 2025

Here's the issue: We shouldn't truncate a string based on its visual width, as it doesn't correspond to the string's length in bytes.

last_line.truncate(last_line.width() - surplus);

@uasi
Copy link
Author

uasi commented Jan 17, 2025

Changing it to last_line.truncate(last_line.len() - surplus) fixes the bug, although the table alignment is imperfect. We might need to truncate one additional character and put a spacer after the truncation indicator, depending on the width of the truncated string.

Image
+------------------+
| Header1          |
+==================+
| あいうえおかき... |
+------------------+

Update: We can't reliably calculate the surplus's length either. I believe the correct solution would look something like this.

let mut truncate_at = 0;
let mut acc_width = 0;
let mut needs_space = false;

for (grapheme, index) in last_line.grapheme_indices(true) {
    if acc_width + grapheme.width() > truncate_width {
        if acc_width < truncate_width {
            needs_space = true;
        }
        break;
    }

    acc_width += grapheme.width();
    truncate_at = index;
}

last_line.truncate(truncate_at);
last_line.push_str(truncate_indicator);

if needs_space {
    last_line.push(' ');
}

@soywod
Copy link

soywod commented Jan 18, 2025

I found back the code I used to have before Himalaya relied on comfy-table, and it is similar to what @uasi proposed:

https://github.com/pimalaya/himalaya/blob/a700f358fb380ce3c775ff24bf77ac79344217e9/src/ui/table/table.rs#L221-L244

@Nukesor
Copy link
Owner

Nukesor commented Jan 18, 2025

Awesome, thanks for the detailed report, repro and proposed fix :)

I'll properly check on it soon!

@uasi
Copy link
Author

uasi commented Jan 18, 2025

When processing text character by character, it's better to use the s.graphemes(true) method from the unicode-segmentation crate rather than s.chars(). This is because what appears as a single character to users (a grapheme cluster in Unicode terminology) may actually consist of multiple chars (Unicode codepoints), and summing up the widths of chars can differ from the displayed text width.

use unicode_segmentation::UnicodeSegmentation;
use unicode_width::{UnicodeWidthChar, UnicodeWidthStr};

fn main() {
    let s = "🙂‍↕️"; // U+1F642 U+200D U+2195 U+FE0F head shaking vertically
    println!("{:?}", s.chars().map(|c| c.width().unwrap()).collect::<Vec<usize>>()); // => [2, 0, 1, 0]
    println!("{:?}", s.graphemes(true).map(|g| g.width()).collect::<Vec<usize>>()); // => [2]
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0eb3403d13d06e61a9a4e544c2a792a0

@Nukesor
Copy link
Owner

Nukesor commented Feb 2, 2025

@uasi Could you take a look at both #171 and #170 :)

I think they should adresses both issues #167 and #168.

@Nukesor
Copy link
Owner

Nukesor commented Feb 2, 2025

Nevermind. There's still an off-by-one error somewhere in the algorithm in #171. Gotta debug tomorrow.

Feel free to take a stab at it if you finde the time and mindset :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t: bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants