Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to further reduce re-parsing for globally cached images (PR 11912, 16108 follow-up) #17428

Merged
merged 3 commits into from
Dec 22, 2023

Conversation

Snuffleupagus
Copy link
Collaborator

In PR #11912 we started caching images that occur on multiple pages globally, which improved performance a lot in many PDF documents.
However, one slightly annoying limitation of the implementation is the need to re-parse the image once the global-caching threshold has been reached. Previously this was difficult to avoid, since large image-resources will cause cleanup to run on the main-thread after rendering has finished. In PR #16108 we started delaying this cleanup a little bit, to improve performance if a user e.g. zooms and/or rotates the document immediately after rendering completes.

Taking those two PRs together, we now have a situation where it's much more likely that the main-thread has "globally used" images cached at the page-level. Hence we can instead attempt to copy a locally cached image into the global object-cache on the main-thread and thus reduce unnecessary re-parsing of large/complex global images, which significantly reduces the rendering time in many cases.

For the PDF document in issue #11878, the rendering time of the second page changes as follows (on my computer):

  • With the master-branch it takes >600 ms to render.
  • With this patch that goes down to ~50 ms, which is one order of magnitude faster.

(Note that all other pages are, as expected, completely unaffected by these changes.)

This new main-thread copying is limited to "large" global images, since:

  • Re-parsing of small images, on the worker-thread, is usually fast enough to not be an issue.
  • With the delayed cleanup after rendering, it's still not guaranteed that an image is available in a page-level cache on the main-thread.
  • This forces the worker-thread to wait for the main-thread, which is a pattern that you always want to avoid unless absolutely necessary.

@Snuffleupagus Snuffleupagus force-pushed the cacheGlobally-CopyImage branch 2 times, most recently from d0bf33c to e2d7e15 Compare December 15, 2023 09:55
@mozilla mozilla deleted a comment from moz-tools-bot Dec 15, 2023
@mozilla mozilla deleted a comment from moz-tools-bot Dec 15, 2023
@mozilla mozilla deleted a comment from moz-tools-bot Dec 15, 2023
@mozilla mozilla deleted a comment from moz-tools-bot Dec 15, 2023
@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/5b4fe913e37c482/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/6c7e30a1cdbed19/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/5b4fe913e37c482/output.txt

Total script time: 23.91 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 12
  different first/second rendering: 1

Image differences available at: http://54.241.84.105:8877/5b4fe913e37c482/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/6c7e30a1cdbed19/output.txt

Total script time: 37.63 mins

  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 24

Image differences available at: http://54.193.163.58:8877/6c7e30a1cdbed19/reftest-analyzer.html#web=eq.log

src/display/api.js Outdated Show resolved Hide resolved
@calixteman
Copy link
Contributor

The pdf in #11878 is probably a bit unusual, I mean I'm not sure (it's just an opinion) that there are so many pdfs in the wild having a common big image on several pages.
Do you know if there are some improvements in other context (e.g. when rotating ? zooming ?...) ? or with more usual pdfs ?
Do you know if there are some possibles regressions (possibly due to the added sync point in the worker) ?

This (obviously) only includes "resolved" data, and will be used in an upcoming patch.
Currently this is done in the API, but moving it into the worker-thread will simplify upcoming changes.
…1912, 16108 follow-up)

In PR 11912 we started caching images that occur on multiple pages globally, which improved performance a lot in many PDF documents.
However, one slightly annoying limitation of the implementation is the need to re-parse the image once the global-caching threshold has been reached. Previously this was difficult to avoid, since large image-resources will cause cleanup to run on the main-thread after rendering has finished. In PR 16108 we started delaying this cleanup a little bit, to improve performance if a user e.g. zooms and/or rotates the document immediately after rendering completes.

Taking those two PRs together, we now have a situation where it's much more likely that the main-thread has "globally used" images cached at the page-level. Hence we can instead attempt to *copy* a locally cached image into the global object-cache on the main-thread and thus reduce unnecessary re-parsing of large/complex global images, which significantly reduces the rendering time in many cases.

For the PDF document in issue 11878, the rendering time of *the second page* changes as follows (on my computer):
 - With the `master`-branch it takes >600 ms to render.
 - With this patch that goes down to ~50 ms, which is one order of magnitude faster.

(Note that all other pages are, as expected, completely unaffected by these changes.)

This new main-thread copying is limited to "large" global images, since:
 - Re-parsing of small images, on the worker-thread, is usually fast enough to not be an issue.
 - With the delayed cleanup after rendering, it's still not guaranteed that an image is available in a page-level cache on the main-thread.
 - This forces the worker-thread to wait for the main-thread, which is a pattern that you always want to avoid unless absolutely necessary.
@Snuffleupagus Snuffleupagus force-pushed the cacheGlobally-CopyImage branch from e2d7e15 to 9f02cc3 Compare December 21, 2023 20:26
@Snuffleupagus
Copy link
Collaborator Author

The pdf in #11878 is probably a bit unusual, I mean I'm not sure (it's just an opinion) that there are so many pdfs in the wild having a common big image on several pages.

While it's difficult for me to say how common this is, we've seen a number of cases (in both GitHub and Bugzilla) where the same large image is used as the background on all pages; note the issues/bugs listed at the end of #11912 (comment) and those are the ones I knew about three years ago.

Hence I'd say that a global image-cache definitely helps, to both reduce CPU and memory usage, and this patch simply improves things a little bit further.

Do you know if there are some improvements in other context (e.g. when rotating ? zooming ?...) ? or with more usual pdfs ?

This patch won't really affect those cases, as far as I can tell.

Do you know if there are some possibles regressions (possibly due to the added sync point in the worker) ?

There shouldn't be, since only images that we've previously seen at least once on another page will trigger this new code-path.

Copy link
Contributor

@calixteman calixteman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you.

@calixteman
Copy link
Contributor

Just a side note, wdyt about using Map objects for #objs and similar ?

@Snuffleupagus
Copy link
Collaborator Author

Just a side note, wdyt about using Map objects for #objs and similar ?

I've also been thinking about that, but as far as I know Map (and Set) is still slower than an Object; see https://bugzilla.mozilla.org/show_bug.cgi?id=1799154

@Snuffleupagus
Copy link
Collaborator Author

/botio test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/1dcd8cb0b32524f/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/ea8609c246f872f/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/1dcd8cb0b32524f/output.txt

Total script time: 2.64 mins

  • Unit tests: FAILED
  • Integration Tests: FAILED
  • Regression tests: FAILED

Image differences available at: http://54.193.163.58:8877/1dcd8cb0b32524f/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/ea8609c246f872f/output.txt

Total script time: 23.90 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 14
  different first/second rendering: 2

Image differences available at: http://54.241.84.105:8877/ea8609c246f872f/reftest-analyzer.html#web=eq.log

@calixteman
Copy link
Contributor

/botio-windows test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/8685011bf36e9e3/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/8685011bf36e9e3/output.txt

Total script time: 37.64 mins

  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 27

Image differences available at: http://54.193.163.58:8877/8685011bf36e9e3/reftest-analyzer.html#web=eq.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants