Allow override of default escape/unescape behavior in more situations #739
Codecov Report
Attention: Patch coverage is
@@            Coverage Diff             @@
##           master     #739      +/-   ##
==========================================
+ Coverage   61.24%   61.65%   +0.40%
==========================================
  Files          39       39
  Lines       16277    16644     +367
==========================================
+ Hits         9969    10262     +293
- Misses       6308     6382      +74
I squashed all commits, refactored our code a bit, and moved the tests to doctests. I didn't include
/// Escapes an `&str` with custom replacements. This method does not have any predefined escapes.
///
/// # Example
///
/// ```
/// # use quick_xml::escape::escape_with;
/// # use pretty_assertions::assert_eq;
/// let custom_resolver = |ch: u8| match ch {
///     b'\"' => Some("&"),
///     b'}' => Some("&lcurl;"),
///     b'{' => Some("&rcurl;"),
///     _ => None,
/// };
///
/// let raw = "<&\"'>";
/// let unchanged = escape_with(raw, |_| None);
///
/// let custom = r##"f'A {weird} f-string that says "Hi!"'"##;
/// let changed = r##"f'A &rcurl;weird&lcurl; f-string that says &Hi!&'"##;
///
/// assert_eq!(escape_with(custom, custom_resolver), changed);
/// assert_eq!(raw, unchanged);
/// ```
pub fn escape_with<'input, 'entity, F>(raw: &'input str, mut escape_chars: F) -> Cow<'input, str>
where
    // the lifetime of the output comes from a capture or is `'static`
    F: FnMut(u8) -> Option<&'entity str>,
{
    let bytes = raw.as_bytes();
    let mut escaped = None;
    let mut pos = 0;
    for (i, b) in bytes.iter().enumerate() {
        if let Some(replacement) = escape_chars(*b) {
            if escaped.is_none() {
                escaped = Some(Vec::with_capacity(raw.len()));
            }
            let escaped = escaped.as_mut().expect("initialized");
            escaped.extend_from_slice(&bytes[pos..i]);
            escaped.extend_from_slice(replacement.as_bytes());
            pos = i + 1;
        }
    }
    if let Some(mut escaped) = escaped {
        if let Some(raw) = bytes.get(pos..) {
            escaped.extend_from_slice(raw);
        }
        // WARNING: Can be unsafe because we operate on bytes of UTF-8 input and
        // could break character
        Cow::Owned(String::from_utf8(escaped).unwrap())
    } else {
        Cow::Borrowed(raw)
    }
}
Edit: actually, the problem with this
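The warning in `escape_with` above is about the callback seeing individual bytes: if it matches only part of a multi-byte UTF-8 character, the output may no longer be valid UTF-8 and the final `unwrap()` can panic. A minimal sketch of one way to sidestep that is shown below, assuming a hypothetical char-based variant named `escape_chars_with`. That name and the `'static` replacement lifetime are illustrative assumptions, not part of this PR or of quick-xml.

```rust
use std::borrow::Cow;

/// Hypothetical char-based variant of `escape_with` (an assumption for
/// illustration, not quick-xml API). The callback receives whole `char`s,
/// so a replacement can never split a multi-byte character.
fn escape_chars_with<'a, F>(raw: &'a str, mut replace: F) -> Cow<'a, str>
where
    F: FnMut(char) -> Option<&'static str>,
{
    let mut escaped: Option<String> = None;
    let mut pos = 0; // byte offset just past the last replaced character
    for (i, ch) in raw.char_indices() {
        if let Some(replacement) = replace(ch) {
            // Allocate lazily, only once the first replacement is needed.
            let out = escaped.get_or_insert_with(|| String::with_capacity(raw.len()));
            out.push_str(&raw[pos..i]); // copy the untouched run
            out.push_str(replacement);
            pos = i + ch.len_utf8(); // skip the whole character
        }
    }
    match escaped {
        Some(mut out) => {
            out.push_str(&raw[pos..]); // copy the tail
            Cow::Owned(out)
        }
        // Nothing was replaced: borrow the input unchanged.
        None => Cow::Borrowed(raw),
    }
}
```

The trade-off in this sketch is that iterating over `char_indices` decodes UTF-8, which is likely slower than the byte loop used in the PR.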
src/escape.rs
        } else if let Some(value) = named_entity(pat) {
            unescaped.push_str(value);
I am also considering removing this `if` completely. Then `unescape_with` would never resolve predefined entities, and the caller would instead need to call the default-processing functions, which we would make public. Is that something that we want to do?
I think it would be poor UX to force the user to declare all predefined entities. These are built into every other XML parser so I don't see a reason to diverge.
Actually, I think I'm fine with this suggestion, if it means that you pass in a "default resolver" function.
Yes, I mean making `named_entity` public (as `resolve_predefined_entity` + `resolve_xml_entity` and `resolve_html5_entity`) so the user can call it after processing their own entities.
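The arrangement proposed in this thread could look roughly like the sketch below: the caller resolves its own entities first and then falls back to the predefined ones. To keep the sketch self-contained, `resolve_predefined_entity` here is a local stand-in written for illustration. Its signature is an assumption and may differ from the function quick-xml actually exposes. `resolve_entity` and the two custom entities are hypothetical.

```rust
/// Local stand-in for the predefined-entity resolver discussed above.
/// Assumption: it maps an entity name (without `&` and `;`) to its text.
fn resolve_predefined_entity(entity: &str) -> Option<&'static str> {
    match entity {
        "lt" => Some("<"),
        "gt" => Some(">"),
        "amp" => Some("&"),
        "apos" => Some("'"),
        "quot" => Some("\""),
        _ => None,
    }
}

/// Hypothetical caller-side resolver: custom entities take precedence,
/// and everything else is delegated to the predefined resolver.
fn resolve_entity(entity: &str) -> Option<&'static str> {
    match entity {
        "nbsp" => Some("\u{a0}"), // non-breaking space
        "copy" => Some("\u{a9}"), // copyright sign
        _ => resolve_predefined_entity(entity),
    }
}
```

A function like `resolve_entity` is the kind of callback a caller would hand to `unescape_with`. Unknown names return `None`, leaving the decision about how to report them to the caller.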
I have no idea why such a complex module structure was used
cf27e0d to ecb6edf
- `quick_xml::escape::resolve_predefined_entity`
- `quick_xml::escape::resolve_xml_entity`
- `quick_xml::escape::resolve_html5_entity`
Since tafia#739, a custom entity resolution function takes precedence over the standard one, so users can implement resolution of HTML entities themselves.
Addresses #734, adds tests to verify that no functionality was lost.