-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start CDATA section only after uppercase <![CDATA[
#781
Conversation
(Review in whitespace changes ignored mode)
failures (6): syntax::cdata::lowercase::ns_reader::async_tokio syntax::cdata::lowercase::ns_reader::borrowed syntax::cdata::lowercase::ns_reader::buffered syntax::cdata::lowercase::reader::async_tokio syntax::cdata::lowercase::reader::borrowed syntax::cdata::lowercase::reader::buffered
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #781 +/- ##
==========================================
- Coverage 61.81% 60.16% -1.66%
==========================================
Files 41 41
Lines 16798 16138 -660
==========================================
- Hits 10384 9709 -675
- Misses 6414 6429 +15
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
debug_assert!(buf.ends_with(b"]]")); | ||
Ok(Event::CData(BytesCData::wrap( | ||
// Cut of `![CDATA[` and `]]` from start and end | ||
&buf[8..len - 2], | ||
self.decoder(), | ||
))) | ||
} | ||
// XML requires uppercase only, but we will check that on validation stage: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason to keep this difference in behavior intact in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTML allows using mixed case for DOCTYPE
and handling of this was intentionally added to quick-xml in 0.23.0 and related #230. In one of previous PRs you said that you prefer to keep ability to parse HTML documents, so I leaved this ability.
CDATA sections, however, are always started from uppercase variant, event in HTML. Corresponding links a couple of lines below in the comment.
I planned to work on validation after merging #766 (because it changes possible content of Text
events and we will know something about content, and may do not validate that).
After validation PR will be merged, the user will be able to validate each event manually after reading it if it requires strict parser and skip validation if that is not required. The API would be:
match reader.read_event() {
Text(e) => {
// Get iterator over all validation problems and stop at first
// You also may skip problems that you want to allow
if e.validate_iter().any() {
return Err(...);
}
// Shortcut for reporting the first problem as a `Result`
e.validate()?;
}
// do the same for other events
}
Implement more strict check for the start of CDATA section. Incorrect case is a syntax error. In that sense this PR is differs from #780 where attributes in the end tag is validation error. So we should not be case tolerant in this case.