fixed memory leak when layer offset lower than mask offset #15102

Closed
wants to merge 1 commit into from


@esvit esvit commented Jun 25, 2022

In some PDF files, the mask offset is larger than the layer offset. In this case the first two parameters of the function getImageData become negative numbers, which produces a memory leak like this:

<--- Last few GCs --->

[3181:0x7f7aa3b00000]     7098 ms: Mark-sweep (reduce) 86.9 (130.2) -> 86.8 (90.7) MB, 109.2 / 0.0 ms  (average mu = 0.384, current mu = 0.000) external memory pressure GC in old space requested
[3181:0x7f7aa3b00000]     7170 ms: Mark-sweep (reduce) 86.8 (90.7) -> 86.7 (90.4) MB, 72.4 / 0.0 ms  (average mu = 0.261, current mu = 0.000) external memory pressure GC in old space requested


<--- JS stacktrace --->

FATAL ERROR: v8::ArrayBuffer::New Allocation failed - process out of memory
 1: 0x10a56f735 node::Abort() (.cold.1) [/usr/local/bin/node]
 2: 0x109151619 node::Abort() [/usr/local/bin/node]
 3: 0x10915178f node::OnFatalError(char const*, char const*) [/usr/local/bin/node]
 4: 0x1092d38c7 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 5: 0x1092d3863 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 6: 0x1092d35eb v8::internal::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*) [/usr/local/bin/node]
 7: 0x1092fb705 v8::ArrayBuffer::New(v8::Isolate*, unsigned long) [/usr/local/bin/node]
 8: 0x10e20b212 Context2d::GetImageData(Nan::FunctionCallbackInfo<v8::Value> const&) [node_modules/canvas/build/Release/canvas.node]
 9: 0x10e1f7d27 Nan::imp::FunctionCallbackWrapper(v8::FunctionCallbackInfo<v8::Value> const&) [node_modules/canvas/build/Release/canvas.node]
10: 0x10933c789 v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/usr/local/bin/node]
11: 0x10933c256 v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/usr/local/bin/node]
12: 0x10933b9cf v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
13: 0x109bf23b9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit [/usr/local/bin/node]
14: 0x10eed67a1
15: 0x109b8140e Builtins_InterpreterEntryTrampoline [/usr/local/bin/node]
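
To make the arithmetic concrete, here is a minimal sketch with made-up offsets. The variable roles follow the patched line in this PR (`layerOffsetX - maskOffsetX`, clamped with `Math.max`); the helper name `maskOrigin` and the numbers are hypothetical and only illustrate how the first argument can go negative.

```javascript
// Hypothetical helper (not part of pdf.js): computes the origin that would
// be passed as the first (or second) argument of getImageData.
function maskOrigin(layerOffset, maskOffset, clamp) {
  const origin = layerOffset - maskOffset;
  // With clamping enabled, a negative origin is raised to 0,
  // mirroring Math.max(0, layerOffsetX - maskOffsetX) from the patch.
  return clamp ? Math.max(0, origin) : origin;
}

// Made-up values where the mask offset is larger than the layer offset.
const unclamped = maskOrigin(10, 25, false); // negative origin
const clamped = maskOrigin(10, 25, true); // clamped origin
console.log(unclamped, clamped);
```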

@calixteman
Contributor

Could you file a bug and attach a buggy PDF? I'd like to understand what exactly the issue is.

width,
chunkHeight
);
const maskX = Math.max(0, layerOffsetX - maskOffsetX);
Contributor

@timvandermeij timvandermeij Jun 25, 2022

This PR should ideally get a test case (a PDF file, as mentioned above), but if that's hard, it needs at the very least a comment here explaining why this is necessary, to prevent it from being removed accidentally during, e.g., refactoring. I think the line from your PR description, "In some PDF files the mask offset is larger than the layer offset. In this case first two parameters of getImageData become negative numbers and produce a memory leak.", is already quite a good and descriptive one.

Collaborator

@Snuffleupagus Snuffleupagus Jun 25, 2022

It seems that this is Node.js-specific, which probably makes the need for a test-case slightly less important?
Also, the example provided below is 14.5 MB in size (for a one-page document) so it's probably not a great reduced test-case unfortunately.

Contributor

Yep, according to the getImageData doc:
https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/getImageData
it's OK to have negative sx and sy, and the corresponding pixels in the extracted data should be transparent black.
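
As a rough illustration of that documented behavior, the sketch below emulates the read on a plain RGBA buffer. It is not the browser's or node-canvas's implementation; `readRegion` and the tiny 2x1 "canvas" are hypothetical names and data chosen only to show that out-of-bounds pixels come back as transparent black.

```javascript
// Illustrative emulation only. `src` is a row-major RGBA buffer of
// size srcW x srcH; (sx, sy, sw, sh) is the requested rectangle.
function readRegion(src, srcW, srcH, sx, sy, sw, sh) {
  // A new typed array is zero-filled, i.e. transparent black everywhere.
  const out = new Uint8ClampedArray(sw * sh * 4);
  for (let y = 0; y < sh; y++) {
    const srcY = sy + y;
    if (srcY < 0 || srcY >= srcH) continue; // row lies outside the canvas
    for (let x = 0; x < sw; x++) {
      const srcX = sx + x;
      if (srcX < 0 || srcX >= srcW) continue; // pixel lies outside the canvas
      const from = (srcY * srcW + srcX) * 4;
      const to = (y * sw + x) * 4;
      out.set(src.subarray(from, from + 4), to);
    }
  }
  return out;
}

// A 2x1 canvas filled with opaque red.
const canvasPixels = new Uint8ClampedArray([255, 0, 0, 255, 255, 0, 0, 255]);
// Read a 2x1 region that starts one pixel to the left of the canvas (sx = -1).
const region = readRegion(canvasPixels, 2, 1, -1, 0, 2, 1);
console.log(Array.from(region));
```

In this emulation the first returned pixel is transparent black and the second is the canvas pixel at (0, 0), so a negative sx on its own is not an error.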

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we perhaps view this as a bug in the node-canvas package then, or do we want to consider accepting this work-around here?

esvit pushed a commit to esvit/pdf.js-memory-leak-example that referenced this pull request Jun 25, 2022
@esvit
Author

esvit commented Jun 25, 2022

> Could you file a bug and attach a buggy PDF? I'd like to understand what exactly the issue is.

Yes, sure, here is a working example: https://github.com/esvit/pdf.js-memory-leak-example

@Snuffleupagus
Collaborator

@calixteman, @timvandermeij Assuming that it passes all tests, should we consider accepting this patch (it may perhaps lead to slightly less memory usage in affected cases, even in browsers)?
Or, should we regard this as a bug in the node-canvas package instead and decline the patch?

@timvandermeij
Contributor

I don't directly see a problem with merging this. I can't really imagine the negative numbers being relied on in any real-world PDF files, because we don't render pixels outside of the canvas. The only potential problem I see is that an exception is raised if zero is used for sw or sh, according to https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/getImageData, so while it won't crash due to OOM, won't this still raise an exception?

In short, I'm not really sure. If this is safe and isn't expected to impact real-world PDFs, I'm OK with landing it, but I'm also completely fine with regarding this as a node-canvas bug.
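
One possible way to address the zero-size concern above is to clip the requested rectangle to the canvas and skip the call entirely when nothing is left. This is only a sketch under assumptions: `clipRect` is a hypothetical helper that does not exist in pdf.js or node-canvas, and a real caller would also have to account for the clipped region being smaller than the one originally requested.

```javascript
// Hypothetical helper: intersects the requested rectangle with the canvas
// bounds. Returns null when the intersection is empty, so the caller can
// skip getImageData instead of passing a zero width or height.
function clipRect(sx, sy, sw, sh, canvasW, canvasH) {
  const x0 = Math.max(0, sx);
  const y0 = Math.max(0, sy);
  const x1 = Math.min(canvasW, sx + sw);
  const y1 = Math.min(canvasH, sy + sh);
  if (x1 <= x0 || y1 <= y0) {
    return null; // nothing of the rectangle lies on the canvas
  }
  return { sx: x0, sy: y0, sw: x1 - x0, sh: y1 - y0 };
}

// Made-up numbers: a rectangle that starts 15 pixels left of a 100x100 canvas.
const partial = clipRect(-15, 0, 20, 10, 100, 100);
// A rectangle that lies entirely to the left of the canvas.
const empty = clipRect(-30, 0, 20, 10, 100, 100);
console.log(partial, empty);
```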

@timvandermeij
Contributor

Closing since this PR is almost a year old now and I'm still not entirely convinced we should fix this here. I'm leaning towards seeing this as a bug in the node-canvas package, so filing it upstream seems better.

@lanwin

lanwin commented May 3, 2024

@timvandermeij @GGULBAE and I have a PDF which causes the same error in node-canvas; see Automattic/node-canvas#2314

My problem is that I need to process every PDF I get, and this PDF causes the process to crash. Acrobat and co. simply draw that PDF without any issue.

So for me it would be important if PDF.js could just ignore invalid stuff like this and not call the underlying APIs.

@Snuffleupagus
Collaborator

Snuffleupagus commented May 3, 2024

Note that as of PR #18029, which landed three days ago, the code modified in this PR no longer exists.
