Describe the bug
I was considering using Flexmark as an HTML => text/plain engine for Apache James.
(We currently rely on a homegrown Jsoup-based parser.)
I threw our test suite at flexmark-html2md-converter and triggered an OutOfMemoryError after 18 seconds with the following code:

    @Test
    public void boom() {
        // 400 nested blockquotes, each level holding 800 short paragraphs
        String html = ("<blockquote>" + "<p>a</p>".repeat(800)).repeat(400)
            + "</blockquote>".repeat(400);
        String plainText = FlexmarkHtmlConverter.builder()
            .build()
            .convert(html);
    }

This throws an OutOfMemoryError.
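This is because:
- the input size is O(N) with the blockquote nesting level
- the output size is O(N²) with the blockquote nesting level (for each paragraph, the N enclosing blockquote prefixes are applied)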
Same code with different parameters:

    String html = ("<blockquote>".repeat(420) + "a<br/>".repeat(400 * 420))
        + "</blockquote>".repeat(420);

This generates about 1 MB of input and 142 MB of output.
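To make the arithmetic behind these numbers explicit, here is a rough model of the blow-up. It is only a sketch: it does not call flexmark, the class and method names are made up for illustration, and it assumes each rendered line carries one two-character `> ` marker per nesting level (which matches the observed sizes, but I have not checked the converter's code).

```java
/** Back-of-the-envelope model of the blockquote blow-up; not flexmark code. */
final class BlockquoteAmplification {
    private static final String OPEN = "<blockquote>";
    private static final String CLOSE = "</blockquote>";
    private static final String LINE = "a<br/>";
    // Assumption: every rendered line is prefixed with one "> " per nesting level.
    private static final String PREFIX = "> ";

    /** Input length for `depth` nested blockquotes wrapping `lines` one-character lines. */
    static long inputLength(int depth, long lines) {
        return (long) depth * (OPEN.length() + CLOSE.length()) + lines * LINE.length();
    }

    /** Lower bound on the output length: `depth` prefixes plus one payload character per line. */
    static long outputLowerBound(int depth, long lines) {
        return lines * ((long) depth * PREFIX.length() + 1);
    }
}
```

With a depth of 420 and 400 * 420 lines, the model gives 1,018,500 input characters and a lower bound of 141,288,000 output characters, in line with the 1 MB and 142 MB measured above.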
These sizes are well within the range I encounter in emails.
Is there a way to cap the allocated memory (i.e. the size of the output) and simply throw when the cap is exceeded, as a defense mechanism?
This would protect me from DoS attacks through unbounded memory allocation, and it is a condition for adoption.
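To illustrate the kind of limit I have in mind, here is a minimal sketch of an output sink that throws once a character budget is exceeded. `BoundedAppendable` is a hypothetical name and not part of flexmark. It would only bound memory if the converter streamed its output into such a sink without buffering the whole result internally, which I have not verified.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Appendable wrapper that fails fast once more than maxChars characters are written. */
final class BoundedAppendable implements Appendable {
    private final Appendable delegate;
    private final long maxChars;
    private long written;

    BoundedAppendable(Appendable delegate, long maxChars) {
        this.delegate = delegate;
        this.maxChars = maxChars;
    }

    @Override
    public BoundedAppendable append(CharSequence csq) {
        CharSequence s = (csq == null) ? "null" : csq;
        reserve(s.length()); // check the budget before anything is written
        try {
            delegate.append(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    @Override
    public BoundedAppendable append(CharSequence csq, int start, int end) {
        CharSequence s = (csq == null) ? "null" : csq;
        return append(s.subSequence(start, end));
    }

    @Override
    public BoundedAppendable append(char c) {
        return append(String.valueOf(c));
    }

    /** Throws if writing n more characters would exceed the budget. */
    private void reserve(int n) {
        if (written + n > maxChars) {
            throw new IllegalStateException(
                "output limit of " + maxChars + " characters exceeded");
        }
        written += n;
    }
}
```

Wrapping a `StringBuilder` with a budget of, say, a few megabytes would turn the OutOfMemoryError into an exception the caller can handle per message.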
A similar amplification also exists with lists. For example:

    @Test
    public void boom() {
        // 420 nested lists, each level holding 400 short items
        String html = ("<ul>" + "<li>a</li>".repeat(400)).repeat(420)
            + "</ul>".repeat(420);
        System.out.println(html.length() + " bytes");
        String plainText = FlexmarkHtmlConverter.builder()
            .build()
            .convert(html);
        System.out.println(plainText.length() + " bytes");
    }

prints:

    1683780 bytes
    71064000 bytes
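Since both cases depend on nesting depth, one stopgap on the caller side could be to reject deeply nested input before conversion. The following is a rough sketch under that assumption. `NestingDepthGuard` is a hypothetical helper, it uses a naive regular expression rather than a real HTML parser, and it only looks at `blockquote`, `ul` and `ol` tags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Naive pre-flight check on blockquote/list nesting depth; not a real HTML parser. */
final class NestingDepthGuard {
    // Matches opening and closing <blockquote>, <ul> and <ol> tags, case-insensitively.
    private static final Pattern TAG =
        Pattern.compile("<(/?)(blockquote|ul|ol)\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Returns the deepest nesting level of the tracked tags found in the input. */
    static int maxDepth(String html) {
        int depth = 0;
        int max = 0;
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            if (m.group(1).isEmpty()) {   // opening tag
                depth++;
                max = Math.max(max, depth);
            } else if (depth > 0) {       // closing tag; ignore unbalanced closes
                depth--;
            }
        }
        return max;
    }

    /** Throws if the input nests the tracked tags deeper than `limit`. */
    static void check(String html, int limit) {
        int depth = maxDepth(html);
        if (depth > limit) {
            throw new IllegalArgumentException(
                "nesting depth " + depth + " exceeds limit " + limit);
        }
    }
}
```

This does not bound the output size by itself, so it would complement an output limit in the converter rather than replace it.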