Fix handling of backslashes and Unicode escape sequences in CSS content #398

Open
liulinboyi wants to merge 1 commit into main

@liulinboyi liulinboyi commented Feb 9, 2025

This PR addresses an issue in the handling of backslashes and Unicode escape sequences in CSS content. The current implementation may incorrectly escape backslashes, which can break valid CSS syntax, especially when dealing with Unicode escape sequences like \00a0.

Changes:

Improved Handling of Backslashes:
- The code now distinguishes between backslashes used to escape specific characters (like " or \) and those that begin a Unicode escape sequence.
- If a backslash is followed by a valid Unicode escape sequence (e.g., \00a0), it is preserved as-is.
- If a backslash is followed by any other character, the backslash itself is escaped to \\.

Added Validation for Unicode Escape Sequences:
- A helper function is_valid_unicode_escape is introduced to check whether the characters following a backslash form a valid Unicode escape sequence.
- Only valid sequences are preserved; a backslash that does not start a valid sequence is treated as regular text and escaped.
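
The description names is_valid_unicode_escape, but the snippet under "Code Changes" shows only handle_backslash. As a rough sketch of what such a check might look like, assuming the CSS rule that an escape is a backslash followed by one to six hex digits: the signature and the unicode_escape_len helper are assumptions for illustration, not code from this PR.

```rust
/// Hypothetical helper (not from the PR): number of hex digits, capped at
/// six, at the start of `rest`, where `rest` is the text right after a
/// backslash.
fn unicode_escape_len(rest: &str) -> usize {
    rest.bytes()
        .take(6)
        .take_while(|b| b.is_ascii_hexdigit())
        .count()
}

/// Assumed shape of the check: at least one hex digit must follow the
/// backslash for the sequence to count as a Unicode escape.
fn is_valid_unicode_escape(rest: &str) -> bool {
    unicode_escape_len(rest) > 0
}
```

Returning the length as well as the boolean would let a caller copy the whole escape in one step instead of re-scanning it.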
Code Changes:

fn handle_backslash(s: &str, i: usize) -> Option<&'static str> {
    if i + 1 < s.len() {
        match s.as_bytes()[i + 1] {
            b'0'..=b'9' | b'a'..=b'f' | b'A'..=b'F' => {
                // If the character following the backslash is part of a Unicode escape sequence, preserve the entire sequence
                let mut j = i + 1;
                // A CSS escape allows at most six hex digits after the backslash
                while j < s.len() && s.as_bytes()[j].is_ascii_hexdigit() && j - i <= 6 {
                    j += 1;
                }
                if j - i > 1 {
                    // Preserve the entire Unicode escape sequence
                    Some("\\")
                } else {
                    // If it is not a valid Unicode escape sequence, escape the backslash itself
                    Some("\\\\")
                }
            }
            _ => {
                // If the character following the backslash is any other character, escape the backslash itself
                Some("\\\\")
            }
        }
    } else {
        // If the backslash is the last character, escape the backslash itself
        Some("\\\\")
    }
}
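
To illustrate how handle_backslash might be called, here is a sketch of a surrounding escaping loop. The escape_css_content function is hypothetical and not part of this PR. handle_backslash is reproduced in a condensed form so the block compiles on its own, returning a plain &'static str because the function above never returns None.

```rust
/// Condensed form of `handle_backslash`: keep the backslash when a hex digit
/// follows (the start of a Unicode escape), otherwise double it.
fn handle_backslash(s: &str, i: usize) -> &'static str {
    match s.as_bytes().get(i + 1) {
        Some(b) if b.is_ascii_hexdigit() => "\\",
        _ => "\\\\",
    }
}

/// Hypothetical caller (an assumption, not from the PR): escapes double
/// quotes and backslashes that do not start a Unicode escape.
fn escape_css_content(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for (i, ch) in s.char_indices() {
        match ch {
            '\\' => out.push_str(handle_backslash(s, i)),
            '"' => out.push_str("\\\""),
            _ => out.push(ch),
        }
    }
    out
}
```

With this sketch, the input \00a0 passes through unchanged, while a backslash followed by a non-hex character such as z, or a trailing backslash, is doubled.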

@liulinboyi liulinboyi force-pushed the fix/serializer-escape branch from b9a6b7c to e1336a5 Compare February 10, 2025 12:23