Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"RemovingTag" event isn't fired. #58

Closed
Alpalhao opened this issue Feb 9, 2016 · 5 comments
Closed

"RemovingTag" event isn't fired. #58

Alpalhao opened this issue Feb 9, 2016 · 5 comments

Comments

@Alpalhao
Copy link

Alpalhao commented Feb 9, 2016

Hi,
After updating to HtmlSanitizer with AngleSharp, we detect the follow behaviour:

When sanitizing a text only containing a element "script" or "style", the RemovingTag event isn't fired.

Here's a test case:

        [Test]
        public void RemoveEventForNotAllowedTag_ScriptTag()
        {
            RemoveReason? actual = null;
            var s = new HtmlSanitizer();
            s.RemovingTag += (sender, args) =>
            {
                actual = args.Reason;
            };
            s.Sanitize("<script>alert('Hello world!')</script>");
            Assert.That(actual, Is.EqualTo(RemoveReason.NotAllowedTag));
        }

        [Test]
        public void RemoveEventForNotAllowedTag_StyleTag()
        {
            RemoveReason? actual = null;
            var s = new HtmlSanitizer();
            s.RemovingTag += (sender, args) =>
            {
                actual = args.Reason;
            };
            s.Sanitize("<style> body {background-color:lightgrey;}</style>");
            Assert.That(actual, Is.EqualTo(RemoveReason.NotAllowedTag));
        }

Adding another tag to the text, works fine.


        [Test]
        public void RemoveEventForNotAllowedTag_ScriptTagAndSpan()
        {
            RemoveReason? actual = null;
            var s = new HtmlSanitizer();
            s.RemovingTag += (sender, args) =>
            {
                actual = args.Reason;
            };
            s.Sanitize("<span>Hi</span><script>alert('Hello world!')</script>");
            Assert.That(actual, Is.EqualTo(RemoveReason.NotAllowedTag));
        }

Thanks

@mganss
Copy link
Owner

mganss commented Feb 10, 2016

This happens because <style> and <script> in isolation are parsed into the head portion of the HTML document. But we are operating only on the body portion. As a workaround you can wrap the string to be sanitized in <body>...</body>, forcing the elements into the body.

This also means you cannot currently sanitize complete HTML documents, only (body) fragments. I know this is not an ideal situation, but I have no idea how to fix this right now without making the sanitizer overly complex and thus error prone. Suggestions are very welcome 😄

@304NotModified
Copy link
Contributor

Was this also in v2?

As a workaround you can wrap the string to be sanitized in ..., forcing the elements into the body.

You can do that also in this lib?

@mganss
Copy link
Owner

mganss commented Feb 11, 2016

This was not in v2 because CsQuery is not as sophisticated a parser as AngleSharp.

I've added wrapping the input in <body>...</body> so the workaround will no longer be necessary in the next version (though doesn't do any harm either).

@mganss mganss closed this as completed Feb 11, 2016
@Alpalhao
Copy link
Author

Thanks Michael.

Any idea when this fix will be released?

@mganss
Copy link
Owner

mganss commented Feb 17, 2016

Released as 3.1.93.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants