Suggested spelling correction seems off-base #72553

Closed
dtolnay opened this issue May 24, 2020 · 3 comments · Fixed by #118381
Labels
A-diagnostics Area: Messages for errors, warnings, and lints
A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`
C-bug Category: This is a bug.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@dtolnay
Member

dtolnay commented May 24, 2020

In https://www.reddit.com/r/rust/comments/gpw2ra/how_is_the_rust_compiler_able_to_tell_the_visible/ I noticed this surprising spelling suggestion:

#![feature(non_ascii_idents)]

fn main() {
    let _ = 读文;
}

error[E0425]: cannot find value `读文` in this scope
   --> src/main.rs:45:13
    |
45  |     let _ = 读文;
    |             ^^^^ help: a tuple variant with a similar name exists: `Ok`

To me 读文 and Ok don't seem like they would be similar enough to meet the threshold for showing such a suggestion. Can we calibrate this better for short idents?

For comparison, even kO doesn't assume you mean Ok.

error[E0425]: cannot find value `kO` in this scope
  --> src/main.rs:45:13
   |
45 |     let _ = kO;
   |             ^^ not found in this scope

rustc 1.45.0-nightly (8970e8b 2020-05-23)

Mentioning @estebank who worked on suggestions most recently in #65421.

@dtolnay dtolnay added A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels May 24, 2020
@JohnTitor JohnTitor added the A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix` label May 25, 2020
@Enselic
Member

Enselic commented Nov 16, 2023

I debugged it a bit. Ok is selected as the candidate because its edit distance to 读文 is just 2.

The code to fix is likely in or around fn find_best_match_for_name_impl. I tried some quick tricks but got ICEs and failing tests and gave up. One thing I didn't try that could perhaps work is to consider edit distance between characters of different alphabets/logogram sets as infinite, rather than 1, which is the case right now.
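To illustrate why the distance comes out as 2, here is a minimal sketch of a plain Levenshtein distance over `char`s, where every substitution costs 1 regardless of script. The function name `levenshtein` is hypothetical and this is not the compiler's own implementation, which may differ in details.

```rust
/// Plain Levenshtein distance over Unicode scalar values (`char`s).
/// Illustrative sketch only; not the implementation used by rustc.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] is the distance between the first i chars of `a`
    // and the first j chars of `b` (initially i = 0).
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, &ca) in a.iter().enumerate() {
        // Turning i + 1 chars of `a` into the empty string takes i + 1 deletions.
        let mut curr = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            // A substitution costs 1 for any two different chars,
            // no matter which alphabet or logogram set they belong to.
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

With this sketch, `levenshtein("读文", "Ok")` is 2 (two substitutions), and `levenshtein("kO", "Ok")` is also 2, so the raw distance alone does not distinguish the two cases.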

@dtolnay
Member Author

dtolnay commented Nov 16, 2023

Thank you for investigating!

Independent of what we do with different logogram sets (your suggestion sounds plausible), an edit distance of 2 for a string of length 2 should not meet the threshold for showing a suggestion, even within a single logogram set.

@Enselic
Member

Enselic commented Nov 17, 2023

I think an edit distance of 1 for a string of length 1 is reasonable (a lot of existing UI tests rely on it), but maybe an edit distance of 2 for a string of length 2 is indeed not reasonable.

As you point out, with regular chars, the edit distance is still 2 but Ok is not suggested. So maybe there is a simpler fix to be made for 读文 than looking at what alphabets/logogram sets characters belong to.
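One way the two cases could diverge, sketched under an assumption not confirmed in this thread: suppose the maximum accepted distance is derived from the lookup's length as `max(len, 3) / 3`. Measuring that length in UTF-8 bytes instead of `char`s would then give 读文 a looser threshold than kO. All function names below are hypothetical.

```rust
/// Hypothetical helper: the largest edit distance still accepted for a
/// lookup of the given length. The formula max(len, 3) / 3 is an assumption
/// made for illustration.
fn max_distance(len: usize) -> usize {
    len.max(3) / 3
}

/// Threshold when the lookup's length is measured in UTF-8 bytes.
fn threshold_by_bytes(lookup: &str) -> usize {
    max_distance(lookup.len())
}

/// Threshold when the lookup's length is measured in `char`s.
fn threshold_by_chars(lookup: &str) -> usize {
    max_distance(lookup.chars().count())
}
```

Under that assumption, `threshold_by_bytes("读文")` is 2 because each of the two characters takes 3 bytes in UTF-8, so a candidate at distance 2 such as Ok is accepted. `threshold_by_chars("读文")` is 1, the same as for kO under either measure, so a distance of 2 would be rejected.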

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Nov 27, 2023
rustc_span: Use correct edit distance start length for suggestions

Otherwise the suggestions can be off-base for non-ASCII identifiers. For example suggesting that `Ok` is a name similar to `读文`.

Closes rust-lang#72553.
compiler-errors added a commit to compiler-errors/rust that referenced this issue Nov 28, 2023
rustc_span: Use correct edit distance start length for suggestions

Otherwise the suggestions can be off-base for non-ASCII identifiers. For example suggesting that `Ok` is a name similar to `读文`.

Closes rust-lang#72553.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Nov 28, 2023
Rollup merge of rust-lang#118381 - Enselic:edit-dist-len, r=WaffleLapkin

rustc_span: Use correct edit distance start length for suggestions

Otherwise the suggestions can be off-base for non-ASCII identifiers. For example suggesting that `Ok` is a name similar to `读文`.

Closes rust-lang#72553.