HTMLFilter
trans edited this page Sep 13, 2010 · 2 revisions
Comparing HTMLFilter to Loofah
- HTMLFilter is a Ruby port of lib_filter.php (v1.15) by Cal Henderson
- It is pure Ruby with no dependencies
- It also includes a CSSFilter class for “sanitizing” stylesheets.
- HTMLFilter is Regexp-based
HTMLFilter’s initializer accepts a set of options that specify how it will sanitize HTML. By default it is very restrictive, as it seems to have been designed to sanitize blog comments. These are its default options:
DEFAULT = {
  'allowed' => {
    'a'   => ['href', 'target'],
    'b'   => [],
    'i'   => [],
    'img' => ['src', 'width', 'height', 'alt']
  },
  'no_close'                => ['img', 'br', 'hr'],
  'always_close'            => ['a', 'b'],
  'protocol_attributes'     => ['src', 'href'],
  'allowed_protocols'       => ['http', 'ftp', 'mailto'],
  'remove_blanks'           => ['a', 'b'],
  'strip_comments'          => true,
  'always_make_tags'        => true,
  'allow_numbered_entities' => true,
  'allowed_entities'        => ['amp', 'gt', 'lt', 'quot']
}
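To give a feel for what a Regexp-based whitelist driven by options like 'allowed', 'protocol_attributes' and 'allowed_protocols' might do, here is a minimal, self-contained sketch. It is not HTMLFilter's actual code: the method name `sketch_filter` and the constants are hypothetical, it only handles double-quoted attributes, and it ignores the other options (entities, comments, tag balancing).

```ruby
# Hypothetical illustration only; this is NOT HTMLFilter's implementation.
# The values mirror three of the default options listed above.
ALLOWED = {
  'a'   => ['href', 'target'],
  'b'   => [],
  'i'   => [],
  'img' => ['src', 'width', 'height', 'alt']
}.freeze
PROTOCOL_ATTRIBUTES = ['src', 'href'].freeze
ALLOWED_PROTOCOLS   = ['http', 'ftp', 'mailto'].freeze

# Drops tags that are not whitelisted, and drops attributes that are either
# not whitelisted for their tag or carry a URL with a disallowed scheme.
# In this sketch, URLs without an explicit scheme are dropped as well.
def sketch_filter(html)
  html.gsub(/<(\/?)(\w+)([^>]*)>/) do
    closing = $1
    name    = $2.downcase
    rest    = $3

    next '' unless ALLOWED.key?(name)         # unknown tag: remove it
    next "</#{name}>" unless closing.empty?   # closing tag: keep it bare

    kept = rest.scan(/(\w+)\s*=\s*"([^"]*)"/).select do |attr, value|
      attr = attr.downcase
      next false unless ALLOWED[name].include?(attr)
      next true unless PROTOCOL_ATTRIBUTES.include?(attr)
      scheme = value[/\A([a-z][a-z0-9+.\-]*):/i, 1]
      scheme && ALLOWED_PROTOCOLS.include?(scheme.downcase)
    end

    attr_text = kept.map { |attr, value| %( #{attr.downcase}="#{value}") }.join
    "<#{name}#{attr_text}>"
  end
end

puts sketch_filter('<a href="http://example.com" onclick="x()">go</a>')
```

The tag text itself is left in place when a tag is removed, so `<script>alert(1)</script>` becomes the plain text `alert(1)`. A real filter has to deal with many more cases than this sketch does.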
HTMLFilter has one method, #filter, to which the HTML document or fragment is passed.
htmlfilter = HTMLFilter.new
htmlfilter.filter(html)
In benchmarks, HTMLFilter is about 2-3x slower than Loofah when dealing with HTML documents and fragments, and about twice as fast when dealing with small text snippets.
HeadToHeadHTMLFilter

Large document, 98282 bytes (x100)
                             total   single       rel
Loofah::Helpers.sanitize    23.085   (0.230853)   -
HTMLFilter sanitize         55.526   (0.555257)   2.41x

Small fragment, 3178 bytes (x1000)
                             total   single       rel
Loofah::Helpers.sanitize     7.066   (0.007066)   -
HTMLFilter sanitize         20.479   (0.020479)   2.90x

Text snippet, 58 bytes (x10000)
                             total   single       rel
Loofah::Helpers.sanitize     5.671   (0.000567)   -
HTMLFilter sanitize          2.756   (0.000276)   0.49x
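The "rel" column appears to be the HTMLFilter total divided by the Loofah total for the same input. The short sketch below reproduces it from the totals in the table above; the constant `TOTALS` and the helper `relative_time` are names made up for this illustration, and the benchmark harness itself is not shown on this page.

```ruby
# Totals in seconds, copied from the benchmark table above.
TOTALS = {
  'Large document' => { loofah: 23.085, htmlfilter: 55.526 },
  'Small fragment' => { loofah: 7.066,  htmlfilter: 20.479 },
  'Text snippet'   => { loofah: 5.671,  htmlfilter: 2.756 }
}.freeze

# Ratio of the candidate's time to the baseline's time, rounded to two
# decimals. Values above 1.0 mean the candidate is slower than the baseline.
def relative_time(baseline, candidate)
  (candidate / baseline).round(2)
end

TOTALS.each do |label, t|
  puts format('%-16s %.2fx', label, relative_time(t[:loofah], t[:htmlfilter]))
end
```

A value below 1.0, as in the text snippet case, means HTMLFilter finished in less time than Loofah.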