-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uneeded escapes with multiline string character escapes #752
Comments
I'm fine with adding toml/crates/toml_edit/src/encode.rs Lines 349 to 366 in c766756
I believe |
On I missed that. I think you're right. That's a bit odd though, I could have sworn Macs used CR for newlines. I've implemented the fix here, but ran into a few issues. One issue is with the valid/spec/string-2.toml test:
The relevant toml file is:
Previously, this library would load str3 as something like:
Note that With my patch:
I think this second version is more correct, as it properly parses the CRLF. However, this made me realize that this library seems to be converting CRLFs into LFs on-load, which is why the test fails; It previously did not transform "\r\n" into "\n". Implementing better CRLF handling, I ran into another issue with the
Updating the |
Also, looks like we carried over this behavior from Before moving forward on this, we should look into what went into the decisions back then. |
Looks like it was carried over from the rewrite from rustc-serialize, which was present from essentially the start of the crate, with no reasoning. Guessing as to why, here's the earliest version of toml's abnf that I could find. It appears that |
I think I'm willing to give it a shot. I do know some users are very touchy around output changes, so I'm going to mark this as a breaking change. I'll also likely reach out to potentially affected people, like Cargo. |
Great, thanks. Also, should I file the bug from earlier as a separate issue? const TOML: &str = "str2 = \"\"\"\nRoses are red\nViolets are blue\"\"\"\nstr3 = \"\"\"\nRoses are red\r\nViolets are blue\"\"\"";
fn main() {
let data: toml::Table = toml::from_str(TOML).unwrap();
let actual_str3 = data.get("str3").unwrap().as_str().unwrap();
let expected_str3 = "Roses are red\r\nViolets are blue";
assert!(actual_str3 == expected_str3, "{actual_str3:?} != {expected_str3:?}");
}
|
The issue being that we convert |
Hello, I'm trying to write and load a string with newlines and tabs with toml, and I'm seeing some strange behavior with CR and HT, though this might be the intended behavior.
Here's a small reproduction:
I would expect (or prefer) the output to be:
Instead, the output is:
CR and HT are escaped.
According to the toml spec, for multiline strings:
It looks like this library is escaping some characters that don't necessarily need to be escaped. However, I request that at least CR be written unescaped, as the library is currently splitting CRLFs. This makes interaction with any line ending normalizers, like git, very messy.
I would also prefer tabs to be written unescaped as well. I am using tabs to format the interior content of a multiline string, and the formatting is lost (at least visually) when passed through this library.
The text was updated successfully, but these errors were encountered: