Commit
Send fetch requests for all page dict lookups in parallel
- When adding page dict candidates to the lookup tree, also initiate fetching them from xref, so that if they are not yet loaded at all, the XHR will be sent
  - Only at the top level: assume that if there is a /Pages tree, it is sensibly structured and the number of requests won't be too bad
- We can then await the cached Promise without serialising the requests
- This is a significant performance improvement for load-on-demand (i.e. with auto-fetch turned off) when a PDF has a large number of pages in the top-level /Pages collection and those pages are spread through the file, so every candidate needs to be fetched separately
- PDFs with many pages, where each page is a big image and all the pages are at the top level, are quite a common output of digitisation programmes
- I would have liked to do something like "if it's the top-level collection and page count = number of kids, then just fetch that page without traversing the tree", but unfortunately I agree with the comments on #8088 that there is no good general solution that allows for /Pages nodes with empty /Kids arrays
- The other alternative for fixing this use case is to simply not validate the last page at all, so pages can be loaded on demand. But that validation was added for good reasons, and this would also result in a bad experience if you didn't read the document from the front
- Or assume under certain conditions that the top-level /Pages contains only pages (see https://github.com/mozilla/pdf.js/compare/master...richard-smith-preservica:pdf.js:rcs/assume-all-pages-in-top-level-when-likely?expand=1), but that allows particular edge-case 'bad' PDFs to render incorrectly

eslint

Review - Fix new promise side of fetch; local cache variable; validation on when to prefetch
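
The difference between awaiting each page dict lookup in turn and starting every lookup first can be sketched roughly as below. This is a simplified illustration, not the code in this commit: `FakeXRef`, `startPrefetch`, `loadSequentially` and `loadWithPrefetch` are hypothetical names, and the network round trip is simulated with `setTimeout`.

```javascript
// Hypothetical stand-in for an xref that loads objects on demand.
class FakeXRef {
  constructor() {
    this.cache = new Map(); // ref -> Promise for the loaded object
    this.requestsSent = 0; // how many simulated requests have been started
  }

  // Return the cached Promise if a fetch for this ref was already started;
  // otherwise start a new simulated request and cache its Promise.
  fetchAsync(ref) {
    let promise = this.cache.get(ref);
    if (!promise) {
      this.requestsSent++;
      promise = new Promise(resolve => {
        setTimeout(() => resolve({ ref, type: "Page" }), 10);
      });
      this.cache.set(ref, promise);
    }
    return promise;
  }
}

// Without prefetching: a request is only started once the previous one has
// resolved, so N candidates cost roughly N round trips end to end.
async function loadSequentially(xref, kidRefs) {
  const pages = [];
  for (const ref of kidRefs) {
    pages.push(await xref.fetchAsync(ref));
  }
  return pages;
}

// Start every request up front. Results are not consumed here; failures are
// expected to surface later, when the cached Promise is awaited.
function startPrefetch(xref, kidRefs) {
  for (const ref of kidRefs) {
    xref.fetchAsync(ref).catch(() => {});
  }
}

// With prefetching: the same in-order loop now awaits Promises that are
// already in flight, so the round trips overlap.
async function loadWithPrefetch(xref, kidRefs) {
  startPrefetch(xref, kidRefs);
  return loadSequentially(xref, kidRefs);
}
```

The traversal order is unchanged in `loadWithPrefetch`; only the moment at which each request is started moves earlier, which is why the later `await` hits the cache instead of triggering a new round trip.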