fixed memory leak when layer offset lower than mask offset #15102
Could you file a bug and attach a buggy PDF? I'd like to understand what exactly the issue is.
width,
chunkHeight
);
const maskX = Math.max(0, layerOffsetX - maskOffsetX);
This PR should ideally get a test case (a PDF file, as mentioned above), but if that's hard, it should at the very least get a comment here on why this is necessary, to prevent it from being removed accidentally during, e.g., refactoring. I think the line from your PR description, "In some PDF files the mask offset is larger than the layer offset. In this case first two parameters of getImageData become negative numbers and produce a memory leak.", is already quite a good and descriptive one.
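For illustration, the clamped offset together with the kind of explanatory comment requested here could look roughly like the sketch below. The helper name `maskOrigin`, its signature, and the `maskY` line are assumptions made for this example; only the `maskX` expression appears in the diff under review.

```javascript
// Hypothetical helper for illustration only; not part of the pdf.js API.
function maskOrigin(layerOffsetX, layerOffsetY, maskOffsetX, maskOffsetY) {
  // In some PDF files the mask offset is larger than the layer offset.
  // Without clamping, the first two arguments passed to getImageData would
  // be negative, which was reported to cause a memory leak in Node.js.
  const maskX = Math.max(0, layerOffsetX - maskOffsetX);
  // Assumed to mirror maskX; the reviewed diff only shows the X coordinate.
  const maskY = Math.max(0, layerOffsetY - maskOffsetY);
  return { maskX, maskY };
}
```

With a layer offset of (2, 3) and a mask offset of (10, 1), the unclamped X difference would be -8; the sketch returns 0 for X and 2 for Y.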
It seems that this is Node.js-specific, which probably makes the need for a test-case slightly less important?
Also, the example provided below is 14.5 MB in size (for a one-page document) so it's probably not a great reduced test-case unfortunately.
Yep, according to the getImageData documentation (https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/getImageData), it's OK to have negative sx and sy, and the corresponding pixels in the extracted data should be transparent black.
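The documented behaviour can be modelled without a real canvas. The sketch below is a simplified stand-in, not the browser or node-canvas implementation: `readRegion` is a hypothetical function that copies a rectangle out of a raw RGBA buffer and leaves every pixel outside the source as transparent black. It assumes `sw` and `sh` are positive integers.

```javascript
// Simplified model of getImageData semantics over a raw RGBA buffer.
// Hypothetical helper for illustration; assumes sw and sh are positive.
function readRegion(src, srcWidth, srcHeight, sx, sy, sw, sh) {
  // A new typed array is zero-filled, i.e. every pixel is transparent black.
  const out = new Uint8ClampedArray(sw * sh * 4);
  for (let y = 0; y < sh; y++) {
    const srcY = sy + y;
    if (srcY < 0 || srcY >= srcHeight) continue; // row outside the source
    for (let x = 0; x < sw; x++) {
      const srcX = sx + x;
      if (srcX < 0 || srcX >= srcWidth) continue; // column outside the source
      const from = (srcY * srcWidth + srcX) * 4;
      const to = (y * sw + x) * 4;
      for (let c = 0; c < 4; c++) out[to + c] = src[from + c];
    }
  }
  return out;
}

// A 2x2 opaque white source, read with a negative origin.
const white = new Uint8ClampedArray(2 * 2 * 4).fill(255);
const region = readRegion(white, 2, 2, -1, -1, 2, 2);
```

Here only the bottom-right pixel of `region` overlaps the source, so it is opaque white while the other three pixels stay transparent black.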
Should we perhaps view this as a bug in the node-canvas package then, or do we want to consider accepting this work-around here?
Yes, sure, here is the working example: https://github.com/esvit/pdf.js-memory-leak-example
@calixteman, @timvandermeij Assuming that it passes all tests, should we consider accepting this patch (it may perhaps lead to slightly less memory usage in affected cases, even in browsers)?
I don't directly see a problem with merging this. I can't really imagine the negative numbers being relied on in any real-world PDF files because we don't render pixels outside of the canvas. The only potential problem I see is that an exception is raised if zero is used for In short, I'm not really sure. If this is safe and isn't expected to impact real-world PDFs, I'm OK with landing it, but I'm also completely fine with regarding this as a
Closing since this PR is almost a year old now and I'm still not entirely convinced we should fix this here. I'm leaning towards seeing this as a bug in the
@timvandermeij @GGULBAE and I have a PDF which causes the same error in node-canvas; see Automattic/node-canvas#2314. My problem is that I need to process every PDF I get, and this PDF causes the process to crash. Acrobat and co. simply draw that PDF without any issue. So for me it would be important if PDF.js could just ignore invalid stuff like this and not call the underlying APIs
Note that as of PR #18029, which landed three days ago, the code modified in this PR no longer exists.
In some PDF files, the mask offset is larger than the layer offset. In this case the first two parameters of the function getImageData become negative numbers and produce such a memory leak: