Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve error types (e.g. InvalidPDFException, MissingPDFException) #11658

Closed
xelan opened this issue Mar 3, 2020 · 0 comments · Fixed by #11834
Closed

Preserve error types (e.g. InvalidPDFException, MissingPDFException) #11658

xelan opened this issue Mar 3, 2020 · 0 comments · Fixed by #11834
Labels

Comments

@xelan
Copy link
Contributor

xelan commented Mar 3, 2020

Configuration:

  • Web browser and its version: Chrome 80.0.3987.122 64-bit
  • Operating system and its version: Windows 10 1909
  • PDF.js version: 2.4.384 (build: 30e0f02)
  • Is a browser extension: no

Steps to reproduce the problem:

  1. Set a non-existing (404) PDF URL for the viewer
  2. Open pdf.js
  3. Open DevTools Inspector
  4. Find uncaught error of generic error/exception type Uncaught (in promise) Error: Missing PDF file.

What is the expected behavior?
4. Find uncaught error of specific error/exception type Uncaught (in promise) MissingPDFException: Missing PDF file.

By preserving the exception type, more fine-grained error handling could be performed via client-side logic (e.g. redirect to a search page if a PDF is not found, or to a ticket system in case of invalid PDF files).

Apparently, the error message translation in app.js always throws a generic Error instead of re-throwing with the same exception class of the previous exception.

IMHO, changing the exception class should be BC compatible, as the custom exceptions such as MissingPDFException extend BaseException, which in turn uses the generic Error as prototype.

Thank you very much!

xelan added a commit to xelan/pdf.js that referenced this issue Apr 22, 2020
By preserving the exception type, more fine-grained error handling can be performed via client-side logic (e.g. redirect to a search page if a PDF is not found, or to a ticket system in case of invalid PDF files).

The original exception type and its properties (e.g. status code) are preserved by cloning the error object and assigning the translated message. Note that only a shallow clone is performed for simplicity.

Fixes mozilla#11658
xelan added a commit to xelan/pdf.js that referenced this issue Apr 22, 2020
By preserving the exception type, more fine-grained error handling can be performed via client-side logic (e.g. redirect to a search page if a PDF is not found, or to a ticket system in case of invalid PDF files).

The original exception type and its properties (e.g. status code) are preserved by cloning the error object and assigning the translated message. Note that only a shallow clone is performed for simplicity.

Fixes mozilla#11658
xelan added a commit to xelan/pdf.js that referenced this issue Apr 22, 2020
By preserving the exception type, more fine-grained error handling can be performed via client-side logic (e.g. redirect to a search page if a PDF is not found, or to a ticket system in case of invalid PDF files).

The original exception is re-thrown with the translated error message.

Fixes mozilla#11658
xelan added a commit to xelan/pdf.js that referenced this issue Apr 28, 2020
By preserving the exception type, more fine-grained error handling can be performed via client-side logic (e.g. redirect to a search page if a PDF is not found, or to a ticket system in case of invalid PDF files).

The original exception is now re-thrown.

Fixes mozilla#11658
EugenMayer added a commit to KontextWork/pdf.js that referenced this issue Sep 24, 2020
* Update the `FontLoader.bind` method to avoid explicitly returning `undefined`

The only reason for the `return undefined;` lines was to appease the ESLint `consistent-return` rule, but that's not actually necessary if you make use of the fact that the method is `async` and that we can thus await the Promise rather than returning it.

* Fix typo

* Slightly improve the `BaseViewer.{firstPagePromise, onePageRendered, pagesPromise}` functionality

There's a couple of issues with this functionality:
 - The respective `PromiseCapability` instances are not being reset, in `BaseViewer._resetView`, when the document is closed which is inconsistent with all other state.
 - While the default viewer depends on these promises, and they thus ought to be considered part of e.g. the `PDFViewer` API-surface, they're not really defined in a particularily user-visible way (being that they're attached to the `BaseViewer` instance *inline* in `BaseViewer.setDocument`).
 - There's some internal `BaseViewer` state, e.g. `BaseViewer._pageViewsReady`, which is tracked manually and could instead be tracked indirectly via the relevant `PromiseCapability`, thus reducing the need to track state *twice* since that's always best to avoid.

*Please note:* In the existing implementation, these promises are not defined *until* the `BaseViewer.setDocument` method has been called.
While it would've been simple to lift that restriction in this patch, I'm purposely choosing *not* to do so since this ensures that any Promise handlers added inside of `BaseViewer.setDocument` are always invoked *before* any external ones (and keeping that behaviour seems generally reasonable).

* Initialize the `textLayerFactory` once in `BaseViewer.setDocument`, rather than repeating it for every page

For reasons that I don't even pretend to understand, the `textLayerFactory` property is determined for *every single* page in the PDF document.
Given that the `TextLayerMode` should be consistent for *all* pages in a document, we obviously could/should define `textLayerFactory` just once instead.

* Prevent lookup errors in `PartialEvaluator.hasBlendModes` from breaking all parsing/rendering of a page (issue 11678)

The PDF document in question is *corrupt*, since it contains an XObject with a truncated dictionary and where the stream contents start without a "stream" operator.

* Limit the number of warning messages printed by any one `Lexer.getHexString` invocation

*This patch fixes something that's annoyed me every now and then over the years, when debugging/fixing corrupt PDF documents.*

For corrupt PDF documents where `Lexer.getHexString` encounters invalid characters, there's very rarely just a handful of them. In practice it's not uncommon for there to be many hundreds, or even many thousands, invalid hex characters found.
Not only is the resulting console warning spam utterly useless in these cases, there's often enough of it that performance may even suffer; hence this patch which limits the amount of messages that any one `Lexer.getHexString` invocation may print.

* Magnifier positioning in reftest analyzer

When reftest analyzer shows magnified pixels, there is a seemingly random offset between the mouse position and the magnified position. The reason for this is that reftest analyzer assumes all images have 800 * 1000 pixels but actually the test images have varying sizes.

* Remove transition effects from toolbar buttons/fields

While Firefox originally used transition effects for browser UI toolbar buttons, that was removed years ago in https://bugzilla.mozilla.org/show_bug.cgi?id=1393057

Since the PDF.js viewer toolbar transitions were likely based on the Firefox ones, it seems reasonable that these transition effects are removed from PDF.js as well. Besides removing a bunch of CSS, this also makes the toolbar feel ever so slightly more "snappy" without these delays on mouse interaction.

(In order to make it more feasible to modernize/improve the viewer UI, trying to clean-up/simplify existing rules iteratively seems like the most reasonble way to make any progress here w.r.t. being able to test/review things.)

* Move `IsLittleEndianCached` and `IsEvalSupportedCached` to `src/shared/util.js`

Rather than duplicating the lookup and caching in multiple files, it seems easier to simply move all of this functionality into `src/shared/util.js` instead.
This will also help avoid a bunch of ESLint errors once the `no-shadow` rule is eventually enabled.

* Rename the `isSpace` helper function to `isWhiteSpace`

Trying to enable the ESLint rule `no-shadow`, against the `master` branch, would result in a fair number of errors in the `Glyph` class in `src/core/fonts.js`.
Since the glyphs are exposed through the API, we can't very well change the `isSpace` property on `Glyph` instances. Thus the best approach seems, at least to me, to simply rename the `isSpace` helper function to `isWhiteSpace` which shouldn't cause any issues given that it's only used in the `src/core/` folder.

* Remove variable shadowing from the JavaScript files in the `web/` folder

*This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible.*

Once all the code has been fixed, we'll be able to eventually enable the ESLint `no-shadow` rule; see https://eslint.org/docs/rules/no-shadow

* Bump acorn from 6.4.0 to 6.4.1

Bumps [acorn](https://github.com/acornjs/acorn) from 6.4.0 to 6.4.1.
- [Release notes](https://github.com/acornjs/acorn/releases)
- [Commits](https://github.com/acornjs/acorn/compare/6.4.0...6.4.1)

Signed-off-by: dependabot[bot] <[email protected]>

* Don't accidentally accept invalid glyphNames which *appear* to follow the Cdd{d}/cdd{d} format in `PartialEvaluator._buildSimpleFontToUnicode` (issue 11697)

The /Differences array of the problematic font contains a `/c.1` entry, which is consequently detected as a *possible* Cdd{d}/cdd{d} glyphName by the existing heuristics.
Because of how the base 10 conversion is implemented, which is necessary for the base 16 special case, the parsed charCode becomes `0.1` thus causing `String.fromCodePoint` to throw since that obviously isn't a valid code point.

To fix the referenced issue, and to hopefully prevent similar ones in the future, the patch adds *additional* validation of the charCode found by the heuristics.

* Ensure that the JavaScript-warning is always displayed, in the viewer, regardless of browser printing support

The viewer doesn't currently support executing of any JavaScript, as found in some PDF documents, for security reasons. (Although there's a small hack to at least provide basic support for automatic printing on document load, without running scripts.)

However, in the event that the browser doesn't support printing we're not run *any* of this code. In particular that means that we're also not displaying the "Warning: JavaScript is not supported" message, which seems strange since JavaScript found in a PDF document can really contain *anything* (and not only printing instructions).

It thus seem reasonable, as far as I'm concerned, to always display the JavaScript-warning *even* if printing happens to be unsupported.

* Move the `webViewerOpenFile` function, and the "openfile" eventBus listener, since they only matter in GENERIC builds of the default viewer

This is similar to the existing implementation of the `webViewerFileInputChange` function and its associated "fileinputchange" event.

* Update `npm` packages

* Update l10n files

* Enable the `prefer-exponentiation-operator` ESLint rule

Please see https://eslint.org/docs/rules/prefer-exponentiation-operator for additional information.

* [api-minor] Change the pageIndex, on `PDFPageProxy` instances, to a private property

This property has never been documented and/or *intentionally* exposed through the API, instead the `PDFPageProxy.pageNumber` property is the documented/intended API to use here.
Hence pageIndex is changed to a "private" property on `PDFPageProxy` instances, and internal API functionality is also updated to *consistently* use `this._pageIndex` rather than a mix of formats.

* Bump versions in `pdfjs.config`

* Update the getting started page of the website for the new release

* Remove the unused `wideChars` property on `Font` instances

This property was added in PR 1599 (almost eight years ago), but has been unused ever since PR 3674 (six and a half years ago).

* Update the `eslint-plugin-no-unsanitized` package to the latest version

* Don't let Travis run `npm update` during setup

Letting Travis update npm packages can lead to unexpected and completely unrelated failures for any PR and/or merge, see e.g. 11719, if there's ever backwards *incompatible* changes when a package is updated.

The *exact* package versions are specified in `package-lock.json`, and they should thus be used when running tests. Note that the bots won't update npm packages, and neither should Travis as far as I'm concerned.

* Always prefer the PDF.js JPEG decoder for very large images, in order to reduce peak memory usage (issue 11694)

When JPEG images are decoded by the browser, on the main-thread, there's a handful of short-lived copies of the image data; see https://github.com/mozilla/pdf.js/blob/c3f4690bde8137d80c74203b1ad91476fc2ca160/src/display/api.js#L2364-L2408
That code thus becomes quite problematic for very big JPEG images, since it increases peak memory usage a lot during decoding. In the referenced issue there's a couple of JPEG images whose dimensions are `10006 x 7088` (i.e. ~68 mega-pixels), which causes the *peak* memory usage to increase by close to `1 GB` (i.e. one giga-byte) in my testing.

By letting the PDF.js JPEG decoder, rather than the browser, handle very large images the *peak* memory usage is considerably reduced and the allocated memory also seem to be reclaimed faster.

*Please note:* This will lead to movement in some existing `eq` tests.

* Remove variable shadowing from the JavaScript files in the `src/display/` folder

*This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible.*

Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow

* Only build the necessary `web/` files during the `gulp default_preferences` task

By explicitly specifying only the required `web/` files, the runtime of the gulp task is reduced by approximately 30 percent.

* Add a bit more validation in the `ViewHistory` constructor

 - Ensure that `database.files` actually contains an Array, rather than some arbitrary data.

 - Only try to lookup an existing entry when the `database` existed on load, since there's obviously nothing to find when `database.files = []` was set (this case is very common in the MOZCENTRAL build since `sessionStorage` is being used there).

* Re-factor `PDFViewerApplication.load` such that `{PDFViewer, PDFThumbnailViewer}.setDocument` happens slightly earlier

The `BaseViewer.setDocument` method in particular is necessary for rendering to start when the viewer loads, hence you obviously want that to happen as soon as possible and without any unnecessary delays.
Unfortunately *some* API calls need to be done before that, note existing comments, however the `ViewHistory` initialization (and subsequent fetching of data) in particular can be moved slightly without any adverse effects.

As part of testing I've used logging with `performance.now()` inserted in various parts of this code, and there's *obviously* no discernible changes between `master` and this patch for e.g. rendering starting in the viewer.

*Note:* The vast majority of this patch is simple indentation changes, which were forced by Prettier (and done automatically with `gulp lint --fix`).

* Remove unnecessary checks from the `PDFDocumentProperties` constructor

Given that none of the relevant options are marked as optional in the code/JSDocs, and that the `PDFDocumentProperties` class is specific to the default viewer (and not exposed as part of the viewer components), there's no good reason as far as I can tell for these checks.

* Remove old API methods which were previously converted to throwing (PR 11219 follow-up)

These methods were deprecated already in PDF.js version `2.1.266`, see PRs 10246 and 10369, and were converted to throw `Error`s upon invocation in PDF.js version `2.4.456`, see PR 11219.
Hence it ought to be possible to remove these methods now.

* Exclude the `setPDFNetworkStreamFactory` function from the built API docs

Please note that the `setPDFNetworkStreamFactory` functionality isn't exposed in the public API, i.e. not listed among the exports in the `src/pdf.js` file, and that even if it were it wouldn't really be useful considering that none of the `PDFNetworkStream`/`PDFFetchStream`/`PDFNodeStream` classes are exported either.

* Add `passive: false` to the `wheel` event listener used in `PDFPresentationMode` (issue 11735, PR 10765 follow-up)

* Remove variable shadowing from the JavaScript files in the `src/core/` folder

*This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible.*

Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow

* Remove variable shadowing from the JavaScript files in the `test/unit/` folder

*This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible.*

Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow

* The first glyph in CFF CIDFonts must be named 0 instead of ".notdef"

Fixes #11718 in which the `ff` ligature glyph is at index zero in a CFF font. Beacuse this is a CIDFont, glyph names are CIDs, which are integers. Thus the string `".notdef"` is not correct. The rest of the charset data is already parsed correctly as integers when the boolean argument `cid` is true.

* Ensure that automatic printing still works when the viewer and/or its pages are hidden (bug 1618621, bug 1618955)

Please note that this patch, on its own, won't magically fix all of these printing bugs without [bug 1618553](https://bugzilla.mozilla.org/show_bug.cgi?id=1618553) also being fixed.
(However I don't foresee that being too difficult, famous last words :-), but it will as suggested require a platform API that we can notify when the viewer is ready.)

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1618621
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1618955
Fixes 8208

* Rename `BaseViewer._setDocumentViewerElement` to `BaseViewer._viewerElement`

It was pointed out the the old name felt confusing, so let's just rename the getter since it's an internal property anyway.

* Enable the ESLint `no-shadow` rule

This rule is *not* currently enabled in mozilla-central, but it appears commented out[1] in the ESLint definition file; see https://searchfox.org/mozilla-central/rev/c80fa7258c935223fe319c5345b58eae85d4c6ae/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#238-239

Unfortunately this rule is, for fairly obvious reasons, impossible to `--fix` automatically (even partially) and each case thus required careful manual analysis.
Hence this ESLint rule is, by some margin, probably the most difficult one that we've enabled thus far. However, using this rule does seem like a good idea in general since allowing variable shadowing could lead to subtle (and difficult to find) bugs or at the very least confusing code.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-shadow

---
[1] Most likely, a very large number of lint errors have prevented this rule from being enabled thus far.

* Whitelist closure related cases to address the remaining `no-shadow` linting errors

Given the way that "classes" were previously implemented in PDF.js, using regular functions and closures, there's a fair number of false positives when the `no-shadow` ESLint rule was enabled.

Note that while *some* of these `eslint-disable` statements can be removed if/when the relevant code is converted to proper `class`es, we'll probably never be able to get rid of all of them given our naming/coding conventions (however I don't really see this being a problem).

* Remove a spurious `console.log` from the `ChromiumBrowser` function in `test/webbrowser.js` file

This looks entirely like something which was left-over from debugging, and that line hasn't been touched since PR 4515, especially considering that the corresponding branch in `FirefoxBrowser` doesn't print anything.

* Add `passive: false` when removing wheel listeners

Code of listening `wheel` event uses `{passive: false}`,
while this argument will be treated as `true` before Firefox 49,
accordin to https://developer.mozilla.org/en-US/docs/Web/API/EventTarget/addEventListener#Browser_compatibility .

This commit adds it when removing wheel listeners,
so that such listeners can be really removed.

* Remove the unused `sizes` and `encoding` properties on `Font` instances

The `sizes` property doesn't appear to have been used ever since the code was first split into main/worker-threads, which is so many years ago that I wasn't able to easily find exactly in which PR/commit it became unused.

The `encoding` property is always assigned the `properties.baseEncoding` value, however the `PartialEvaluator` doesn't actually compute/set that value any more. Again it was difficult to determine when it became unused, but it's been that way for years.

* [api-minor] Remove the `getGlobalEventBus` viewer functionality (PR 11631 follow-up)

The correct/intended way of working with the "viewer components" is by providing an `EventBus` instance upon initialization, and the `getGlobalEventBus` was only added for backwards compatibility.
Note, for example, that using `getGlobalEventBus` doesn't really work at all well with a use-case where there's *multiple* `PDFViewer` instances on a one page, since it may then be difficult/impossible to tell which viewer a particular event originated from.

All of the "viewer components" examples have been previously updated, such that there's no longer any code/examples which relies on the now removed `getGlobalEventBus` functionality.

* [api-minor] Remove the `eventBusDispatchToDOM` option/preference, and thus the general ability to dispatch "viewer components" events to the DOM

This functionality was only added to the default viewer for backwards compatibility and to support the various PDF viewer tests in mozilla-central, with the intention to eventually remove it completely.
While the different mozilla-central tests cannot be *easily* converted from DOM events, it's however possible to limit that functionality to only MOZCENTRAL builds *and* when tests are running.

Rather than depending of the re-dispatching of internal events to the DOM, the default viewer can instead be used in e.g. the following way:
```javascript
document.addEventListener("webviewerloaded", function() {
  PDFViewerApplication.initializedPromise.then(function() {
    // The viewer has now been initialized, and its properties can be accessed.

    PDFViewerApplication.eventBus.on("pagerendered", function(event) {
      console.log("Has rendered page number: " + event.pageNumber);
    });
  });
});
```

* Replace the RTL images with CSS transforms of the standard images (issue 11766)

This avoids unnecessary duplication of many images, thus reducing the size of PDF.js image resources slightly.

Note that since the images should only be flipped horizontally, this required specifying the horizontal/vertical scaling separately for the hiDPI-images.

* Fail early, in modern `GENERIC` builds, if certain required browser functionality is missing (issue 11762)

With two kind of builds now being produced, with/without translation/polyfills, it's unfortunately somewhat easy for users to accidentally pick the wrong one.

In the case where a user would attempt to use a modern build of PDF.js in an older browser, such as e.g. IE11, the failure would be immediate when the code is loaded (given the use of unsupported ECMAScript features).
However in some browsers/environments, in particular Node.js, a modern PDF.js build may load correctly and thus *appear* to function, only to fail for e.g. certain API calls. To hopefully lessen the support burden, and to try and improve things overall, this patch adds checks to ensure that a modern build of PDF.js cannot be used in browsers/environments which lack native support for critical functionality (such as e.g. `ReadableStream`). Hence we'll fail early, with an error message telling users to pick an ES5-compatible build instead.

To ensure that we actually test things better especially w.r.t. usage of the PDF.js library in Node.js environments, the `gulp npm-test` task as used by Node.js/Travis was changed (back) to test an ES5-compatible build.
(Since the bots still test the code as-is, without transpilation/polyfills, this shouldn't really be a problem as far as I can tell.)
As part of these changes there's now both `gulp lib` and `gulp lib-es5` build targets, similar to e.g. the generic builds, which thanks to some re-factoring only required adding a small amount of code.

*Please note:* While it's probably too early to tell if this will be a widespread issue, it's possible that this is the sort of patch that *may* warrant being `git cherry-pick`ed onto the current beta version (v2.4.456).

* Update `npm` packages

* Update the `mkdirp` package, since its major version was increased

* Update l10n files

* Ensure that `Font.charToGlyph` won't fail because `String.fromCodePoint` is given an invalid code point (issue 11768)

*Please note:* This patch on its own is *not* sufficient to address the underlying problem in the referenced issue, hence why no test-case is included since the *actual* bug still needs to be fixed.

As can be seen in the specification, https://tc39.es/ecma262/#sec-string.fromcodepoint, `String.fromCodePoint` will throw a RangeError for invalid code points.

In the event that a CMap, in a composite font, contains invalid data and/or we fail to parse it correctly, it's thus possible that the glyph mapping that we build end up with entires that cause `String.fromCodePoint` to throw and thus `Font.charToGlyph` to break.
If that happens, as is the case in issue 11768, significant portions of a page/document may fail to render which seems very unfortunate.

While this patch doesn't fix the underlying problem, it's hopefully deemed useful not only for the referenced issue but also to prevent similar bugs in the future.

* Update the "gulp jsdoc" task to account for API changes in the `mkdirp` package (PR 11772 follow-up)

I completely overlooked the fact that we had *one* occurrence of an asynchronous `mkdirp` call in the gulpfile, which thus breaks since the package now uses Promises rather than a callback function; sorry about that!

* Move the initialization of "page labels" out of `PDFViewerApplication.load`

Over time, with more and more API-functionality added, the `PDFViewerApplication.load` method has become quite large and complex. In an attempt to improve the current situation somewhat, this patch moves the fetching and initialization of "page labels" out into its own (private) helper method instead.

* Move the initialization of "metadata" out of `PDFViewerApplication.load`

Over time, with more and more API-functionality added, the `PDFViewerApplication.load` method has become quite large and complex. In an attempt to improve the current situation somewhat, this patch moves the fetching and initialization of "metadata" out into its own (private) helper method instead.

* Move the initialization of "auto print" out of `PDFViewerApplication.load`

Over time, with more and more API-functionality added, the `PDFViewerApplication.load` method has become quite large and complex. In an attempt to improve the current situation somewhat, this patch moves the fetching and initialization of "auto print" out into its own (private) helper method instead.

* A couple of small improvements of the `PDFViewerApplication.{_initializeMetadata, _initializePdfHistory}` methods

 - Use template strings when printing document/viewer information in `_initializeMetadata`, since the old format feels overly verbose.
   Also, get the WebGL state from the `BaseViewer` instance[1] rather than the `AppOptions`. Since the `AppOptions` value could theoretically have been changed (by the user) after the viewer components were initialized, it seems much more useful to print the *actual* value that'll be used during rendering.

 - Change `_initializePdfHistory` to actually do the "is embedded"-check first, in accordance with the comment and given that the "disableHistory" option usually shouldn't be set.

---
[1] Admittedly reaching into the `BaseViewer` instance and just grabbing the value perhaps isn't a great approach overall, but given that the WebGL-backend isn't even on by default this probably doesn't matter too much.

* Change `Font.exportData` to use an explicit white-list of exportable properties

This patch addresses an existing, and very long standing, TODO in the code such that it's no longer possible to send arbitrary/unnecessary font properties to the main-thread.
Furthermore, by having a white-list it's also very easy to see *exactly* which font properties are being exported.

Please note that in its current form, the list of exported properties contains *every* possible enumerable property that may exist in a `Font` instance.
In practice no single font will contain *all* of these properties, and e.g. embedded/non-embedded/Type3 fonts will all differ slightly with respect to what properties are being defined. Hence why only explicitly set properties are included in the exported data, to avoid half of them being `undefined`, which however should not be a problem for any existing consumer (since they'd already need to handle those cases).

Since a fair number of these font properties are completely *internal* functionality, and doesn't make any sense to expose on the main-thread and/or in the API, follow-up patch(es) will be required to trim down the list. (I purposely included all properties here for brevity and future documentation purposes.)

* [api-minor] Prevent `Font.exportData` from exporting internal/unused properties

A number of *internal* font properties, which only make sense on the worker-thread, were previously exported. Some of these properties could also contain potentially large Arrays/Objects, which thus unnecessarily increases memory usage since we're forced to copy these to the main-thread and also store them there.

This patch stops exporting the following font properties:

 - "_shadowWidth": An internal property, which was never intended to be exported.

 - "charsCache": An internal cache, which was never intended to be exported and doesn't make any sense on the main-thread. Furthermore, by the time `Font.exportData` is called it's usually `undefined` or a mostly empty Object as well.

 - "cidEncoding": An internal property used with (some) composite fonts.
   As can be seen in the `PartialEvaluator.translateFont` method, `cidEncoding` will only be assigned a value when the font dictionary has an "Encoding" entry which is a `Name` (and not in the `Stream` case, since those obviously cannot be cloned).
   All-in-all this property doesn't really make sense on the main-thread and/or in the API, and note also that the resulting `cMap` property is (partially) available already.

 - "fallbackToUnicode": An internal map, part of the heuristics used to improve text-selection in (some) badly generated PDF documents with simple fonts. This was never intended to be exposed on the main-thread and/or in the API.

 - "glyphCache": An internal cache, which was never intended to be exported and which doesn't make any sense on the main-thread. Furthermore, by the time `Font.exportData` is called it's usually a mostly empty Object as well.

 - "isOpenType": An internal property, used only during font parsing on the worker-thread. In the *very* unlikely event that an API consumer actually needs that information, then `fontType` should be a (generally) much better property to use.

Finally, in the (hopefully) unlikely event that any of these properties become necessary on the main-thread, re-adding them to the white-list is easy to do.

* Use "standard" shadowing in the `Font.spaceWidth` method

With `Font.exportData` now only exporting white-listed properties, there should no longer be any reason to not use standard shadowing in the `Font.spaceWidth` method.
Furthermore, considering the amount of other changes to the code-base over the years it's not even clear to me that the special-case was necessary any more (regardless of the preceding patches).

* Ensure that all `Font` instances have the `vertical` property set to a boolean

Given that the `vertical` property is always accessed on the main-thread, ensuring that the property is explicitly defined seems like the correct thing to do since it also avoids boolean casting elsewhere in the code-base.

* Change the signature of `TranslatedFont`, and convert it to a proper class

In preparation for the next patch, this changes the signature of `TranslatedFont` to take an object rather than individual parameters. This also, in my opinion, makes the call-sites easier to read since it essentially provides a small bit of documentation of the arguments.

Finally, since it was necessary to touch `TranslatedFont` anyway it seemed like a good idea to also convert it to a proper `class`.

* Improve detection of binary/ASCII `eexec` encrypted Type1 font programs in `Type1Parser` (issue 11740)

The PDF document, in the referenced issue, actually contains ASCII-encoded Type1 data which we currently *incorrectly* identify as binary.

According to the specification, see https://www-cdf.fnal.gov/offline/PostScript/T1_SPEC.PDF#[{%22num%22%3A203%2C%22gen%22%3A0}%2C{%22name%22%3A%22XYZ%22}%2C87%2C452%2Cnull], the current checks are insufficient to decide between binary/ASCII encoded Type1 font programs.

* Create the glyph mapping correctly for composite Type1, i.e. CIDFontType0, fonts (issue 11740)

This updates `Type1Font.getGlyphMapping` with a code-path "borrowed" from `CFFFont.getGlyphMapping`.

* Make the `decryptAscii` helper function, in `src/core/type1_parser.js`, slightly more efficient

By slicing the Uint8Array directly, rather than using the prototype and a `call` invocation, the runtime of `decryptAscii` is decreased slightly (~30% based on quick logging).
The `decryptAscii` function is still less efficient than `decrypt`, however ASCII encoded Type1 font programs are sufficiently rare that it probably doesn't matter much (we've only seen *two* examples, issue 4630 and 11740).

* [api-minor] Change `Font.exportData` to, by default, stop exporting properties which are completely unused on the main-thread and/or in the API (PR 11773 follow-up)

For years now, the `Font.exportData` method has (because of its previous implementation) been exporting many properties despite them being completely unused on the main-thread and/or in the API.
This is unfortunate, since among those properties there's a number of potentially very large data-structures, containing e.g. Arrays and Objects, which thus have to be first structured cloned and then stored on the main-thread.

With the changes in this patch, we'll thus by default save memory for *every* `Font` instance created (there can be a lot in longer documents). The memory savings obviously depends a lot on the actual font data, but some approximate figures are: For non-embedded fonts it can save a couple of kilobytes, for simple embedded fonts a handful of kilobytes, and for composite fonts the size of this auxiliary can even be larger than the actual font program itself.

All-in-all, there's no good reason to keep exporting these properties by default when they're unused. However, since we cannot be sure that every property is unused in custom implementations of the PDF.js library, this patch adds a new `getDocument` option (named `fontExtraProperties`) that still allows access to the following properties:

 - "cMap": An internal data structure, only used with composite fonts and never really intended to be exposed on the main-thread and/or in the API.
   Note also that the `CMap`/`IdentityCMap` classes are a lot more complex than simple Objects, but only their "internal" properties survive the structured cloning used to send data to the main-thread. Given that CMaps can often be *very* large, not exporting them can also save a fair bit of memory.

 - "defaultEncoding": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful.

 - "differences": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful.

 - "isSymbolicFont": An internal property, used during font parsing and building of the glyph mapping on the worker-thread.

  - "seacMap": An internal map, only potentially used with *some* Type1/CFF fonts and never intended to be exposed in the API. The existing `Font.{charToGlyph, charToGlyphs}` functionality already takes this data into account when handling text.

 - "toFontChar": The glyph map, necessary for mapping characters to glyphs in the font, which is built upon the various encoding information contained in the font dictionary and/or font program. This is not directly used on the main-thread and/or in the API.

 - "toUnicode": The unicode map, necessary for text-extraction to work correctly, which is built upon the ToUnicode/CMap information contained in the font dictionary, but not directly used on the main-thread and/or in the API.

 - "vmetrics": An array of width data used with fonts which are composite *and* vertical, but not directly used on the main-thread and/or in the API.

 - "widths": An array of width data used with most fonts, but not directly used on the main-thread and/or in the API.

* Add a heuristic to scale even single-char text, when the horizontal/vertical scaling differs significantly (issue 11713)

At this point in time, compared to when the "ignore single-char" code was added, we *should* generally be doing a much better job of combining text into as few chunks as possible.
However, there's still bad cases where we're not able to combine text as much as one would like, which is why I'm *not* proposing to simply measure/scale all text. Instead this patch will to only measure/scale single-char text in cases where the horizontal/vertical scale is off significantly, since that's were you'd expect bad text-selection behaviour otherwise.

Note that most of the movement caused by this patch is with Type3 fonts, which is a somewhat special font type and one where our current text-selection behaviour is probably the least good.

* Add a new `pdfjs.enablePermissions` preference, off by default, to allow the PDF documents to disable copying in the viewer (bug 792816)

*Please note:* Most of the necessary API work was done in PR 10033, and the only remaining thing to do here was to implement it in the viewer.

The new preference should thus allow e.g. enterprise users to disable copying in the viewer, for PDF documents whose permissions specify that.

In order to simplify things the "copy"-permission was implemented using CSS, as suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=792816#c55, which should hopefully suffice.[1]
The advantage of this approach, as opposed to e.g. disabling the `textLayer` completely, is first of all that it ensures that searching still works correctly even in copy-protected documents. Secondly this also greatly simplifies the overall implementation, since it doesn't require a lot of code for something that's disabled by default.

---
[1] As the discussion in the bug shows, this kind of copy-protection is not very strong and is also generally easy to remove/circumvent in various ways. Hence a simple solution, targeting "regular"-users rather than "power"-users is hopefully deemed acceptable here.

* Don't bundle the fallback `grab`/`grabbing` cursor images when running `gulp mozcentral`

These cursor images are only necessary as a fallback for older browsers, hence there's no reason to keep shipping them in Firefox as far as I can tell; see https://developer.mozilla.org/en-US/docs/Web/CSS/cursor#Browser_compatibility

* Fail early, in modern `GENERIC` builds, if certain required browser functionality is missing (PR 11771 follow-up)

With two kind of builds now being produced, with/without translation/polyfills, it's unfortunately somewhat easy for users to accidentally pick the wrong one.

In the case where a user would attempt to use a modern build of PDF.js in an older browser, such as e.g. IE11, the failure would be immediate when the code is loaded (given the use of unsupported ECMAScript features).
However in some browsers/environments, a modern PDF.js build may load correctly and thus *appear* to function, only to fail for e.g. certain API calls. To hopefully lessen the support burden, and to try and improve things overall, this patch adds additional checks to ensure that a modern build of PDF.js cannot be used in browsers/environments which lack native support for `Promise.allSettled`.[1] Hence we'll fail early, with an error message telling users to pick an ES5-compatible build instead.

*Please note:* While it's probably too early to tell if this will be a widespread issue, it's possible that this is the sort of patch that *may* warrant being `git cherry-pick`ed onto the current beta version (v2.4.456).

---
[1] This was a fairly recent addition to the web platform, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled#Browser_compatibility

* Remove any mention of Gitpod from the README (issue 11732)

 - Since the Gitpod contributing work-flow is, from the PDF.js project perspective, completely unsupported we don't want to mention it in such a highly visible way as the README file.

 - Since Gitpod is a commercial service, we probably want to avoid *potentially* being seen as advertising and/or endorsing it by mentioning it (somewhat prominently) in the PDF.js README file.

 - By leaving the actual Gitpod files in the repository, for now, this should thus avoid outright breaking things for any existing users.

* Update Prettier to version 2.0

Please note that these changes were done automatically, using `gulp lint --fix`.

Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).

* [api-minor] Fix the return value of `PDFDocumentProxy.getViewerPreferences` when no viewer preferences are present (PR 10738 follow-up)

This patch fixes yet another instalment in the never-ending series of "what the *bleep* was I thinking", by changing the `PDFDocumentProxy.getViewerPreferences` method to return `null` by default.
Not only is this method now consistent with many other API methods, for the data not present case, but it also avoids having to e.g. loop through an object to check if it's actually empty (note the old unit-test).

* [src/core/jpg.js] Remove some redundant marker validation during the MCU parsing in the `decodeScan` function

Some of the code in `src/core/jpg.js` is fairly old, and has with time become unnecessary when the surrounding code has been updated to handle various types of JPEG corruption.
In particular the `if (!marker || marker <= 0xff00) { ... }` branch is now dead code, since:

 - The `!marker` case can no longer happen, since we would already have broken out of the loop thanks to the `!fileMarker` branch a handful of lines above.

 - The `marker <= 0xff00` case can also no longer happen, since the `findNextFileMarker` function validate markers much more thoroughly (by checking `marker >= 0xffc0 && marker <= 0xfffe`). Hence we'd again have broken out of the loop via the `!fileMarker` branch above when no valid marker was found.

* Always skip over any additional, unexpected, RSTx (restart) markers in corrupt JPEG images (issue 11794)

* A couple of small `String.fromCodePoint` improvements (PR 11698 and 11769 follow-up)

 - Add a reduced test-case for issue 11768, to prevent future regressions.
   (Given that PR 11769 is only a work-around, rather than a proper solution, it may not be entirely accurate for the issue to be closed as fixed.)

 - Add more validation of the charCode, as found by the heuristics, in `PartialEvaluator._buildSimpleFontToUnicode` to prevent future issues.

* Suppress browser autofill on page number

What the user did:
    Open the PDF Viewer in Chrome;
    Mouse click into the “Page number” input field;

What they saw:
    A pop-up list with seemingly random numbers

What you were expecting to see:
    Nothing

What they see is the Chrome “Autofill” feature at work – that is suggesting values that you have previously entered into number fields in forms, as possible values you may want to enter into this field.  The list has nothing to do with the PDF currently open but the user does not know this.

* Support the vertical writing mode with SVG backend.

* Enable the `dot-notation` ESLint rule

*Please note:* These changes were done automatically, using the `gulp lint --fix` command.

This rule is already enabled in mozilla-central, see https://searchfox.org/mozilla-central/rev/567b68b8ff4b6d607ba34a6f1926873d21a7b4d7/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#103-104

The main advantage, besides improved consistency, of this rule is that it reduces the size of the code (by 3 bytes for each case). In the PDF.js code-base there's close to 8000 instances being fixed by the `dot-notation` ESLint rule, which end up reducing the size of even the *built* files significantly; the total size of the `gulp mozcentral` build target changes from `3 247 456` to `3 224 278` bytes, which is a *reduction* of `23 178` bytes (or ~0.7%) for a completely mechanical change.

A large number of these changes affect the (large) lookup tables used on the worker-thread, but given that they are still initialized lazily I don't *think* that the new formatting this patch introduces should undo any of the improvements from PR 6915.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/dot-notation

* [src/core/jpg.js] Remove redundant marker validation at the end of the `decodeScan` function (PR 11805 follow-up)

With the MCU parsing changes made in PR 11805, the final marker validation is no longer necessary before the `decodeScan` function returns.

* docs: Fix simple typo, occurences -> occurrences

There is a small typo in gulpfile.js, systemjs.config.js.

Should read `occurrences` rather than `occurences`.

* Update `npm` packages

* Update l10n files

* [src/core/fonts.js] Replace some unnecessary `Stream.getBytes(...)` calls with `Stream.skip(...)` instead

There's a handful of cases in the code where the intention is simply to advance the `Stream` position, but rather than only doing that the code instead fetches the bytes in question (and without using the result for anything).

* [src/core/fonts.js] Replace some unnecessary `Stream.getUint16()` calls with `Stream.skip(2)` instead

There's a handful of cases in the code where the intention is simply to advance the `Stream` position, but rather than only doing that the code instead fetches/computes a Uint16 value (and without using the result for anything).

* [src/core/fonts.js] Improve the `validateOS2Table` function

Rather than creating a new `Stream` just to validate the OS/2 TrueType table, it's simpler/better to just pass in a reference to the font data and use that instead (similar to other TrueType helper functions).

* Avoid reading the "disablePreferences"/"locale" options, when initializing the viewer, in extension builds

These two `AppOptions` are only defined in GENERIC builds, hence it's completely unnecessary to check them in the extension builds (e.g. MOZCENTRAL and CHROME).

Also, simply let the "printResolution" option be defined in all builds since it's being accessed in `web/firefox_print_service.js` as well.

* Change the "download" keyboard shortcut (Ctrl+S) handling, in GENERIC/CHROME builds, to utilize the `EventBus` (issue 11657)

This improves the consistency of the "download" handling, in the default viewer, such that the `Toolbar`/`SecondaryToolbar` buttons *and* the keyboard shortcut are now handled in the same way (using the `EventBus`).

Given that the "download" keyboard shortcut handling is limited to GENERIC/CHROME builds and that the issue does raise a valid point about only being able to observe *some* downloads, these changes seem acceptable in this particular case.

Finally the pre-processor condition is adjusted to *explicitly*, rather than implicitly, list the affected build targets.

*Please note:* This patch should NOT be construed as carte blanche to simply convert all of the code in `webViewerKeyDown`, or elsewhere, to make use of the `EventBus` instead of direct function calls.
Any further changes, along the lines in this patch, would need to be evaluated on a case-by-case basis to determine if they are actually wanted, given that many/most existing cases in `webViewerKeyDown` should already be *indirectly* observable through the `EventBus` instance.

* Add a new "openfile" keyboard shortcut (Ctrl+O), in GENERIC builds

Somewhat surprisingly, despite the GENERIC viewer implementing "openfile" support, there's never been a keyboard shortcut available. Similar to the previous patch, this utilizes the `EventBus` for consistency with the `Toolbar`/`SecondaryToolbar` buttons.

*Please note:* This patch should NOT be construed as carte blanche to simply convert all of the code in `webViewerKeyDown`, or elsewhere, to make use of the `EventBus` instead of direct function calls.
Any further changes, along the lines in this patch, would need to be evaluated on a case-by-case basis to determine if they are actually wanted, given that many/most existing cases in `webViewerKeyDown` should already be *indirectly* observable through the `EventBus` instance.

* Use Node LTS releases to fix Travis CI builds (issue 10790)

Hopefully we don't need the *latest* Node.js releases for the unit-tests to work on Travis, and by using "long-term support" releases instead we should be able to avoid these types of sudden failures in the future as well.

* [api-minor] Change `PageViewport` to throw when the `rotation` is not a multiple of 90 degrees

As evident from the code, `PageViewport` only supports[1] `rotation` values which are a multiple of 90 degrees. Besides it being somewhat difficult to imagine meaningful use-cases for a non-multiple of 90 degrees `rotation`, the code also becomes both simpler and more efficient by not having to consider arbitrary `rotation` values.

However, any invalid rotation will *silently* fallback to assume zero `rotation` which probably isn't great for e.g. `PDFPageProxy.getViewport` in the API. Hence this patch, which will now enforce that only valid `rotation` values are accepted.

---
[1] As far as I can tell, from looking through the history, nothing else has ever been supported either.

* Always attempt to dispatch the "webviewerloaded" event at the embedding `document` (PR 10318 follow-up, issue 11829)

This is necessary in order to support cases where the default viewer is embedded in a *dynamically* created <iframe> element.

In order to also support a use-case where there's *multiple* <iframe> elements (containing default viewers) on the same page, the "webviewerloaded" event now includes a `source` detail parameter such that it's possible to associate an event with the relevant DOM element.

* [api-minor] Immediately release the `font.data` property once the font been attached to the DOM (PR 11777 follow-up)

*This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348*

This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll thus reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts).

The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example).

Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs *or* when scrolling happens in the viewer. In practice this means that unless the user doesn't interact with the viewer in *any* way during an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned-up.

* Use the native `URL.createObjectURL` method more in `web/firefoxcom.js`

Given that `URL.createObjectURL` is assumed to always be available in MOZCENTRAL builds, note the existing usage in the file, there's no reason to depend on the PDF.js helper function `createObjectURL` at all here.

Furthermore this patch also changes `DownloadManager.downloadData` to actually revoke the `blobUrl` after downloading has completed, which is similar to the existing code in `DownloadManager.download`.

* Use the native `URL.createObjectURL` method in `web/pdf_attachment_viewer.js`

There's no particular reason for using the PDF.js helper function `createObjectURL` here, given that the relevant code-path is already guarded by multiple "disableCreateObjectURL" option checks.

* Let `PDFAttachmentViewer._bindPdfLink` fallback to downloading the PDF file, when opening the `blobUrl` fails

This is a simple work-around for https://bugzilla.mozilla.org/show_bug.cgi?id=1632644 which was caused by platform changes in Firefox. Ideally the Firefox bug should still be fixed, but these PDF.js changes seem generally useful to prevent both current and future issues here.

* Move the `maybeValidDimensions` check, used with JPEG images, to occur earlier (PR 11523 follow-up)

Given that the `NativeImageDecoder.{isSupported, isDecodable}` methods require both dictionary lookups *and* ColorSpace parsing, in hindsight it actually seems more reasonable to the `JpegStream.maybeValidDimensions` checks *first*.

* Identify browsers using the name instead of the path

The other testing code already uses the name of the browser as the
unique identifier, so I don't see a good reason to not use that for
identifying browsers to quit as well. Doing so simplifies the (already
somewhat complex) testing logic and ensures that we can use existing
functionality (such as the `getSession` function) to retrieve sessions.

* Introduce Puppeteer for handling browsers during tests

This commit replaces our own infrastructure for handling browsers during
tests with Puppeteer. Using our own infrastructure for this had a few
downsides:

- It has proven to not always be reliable, especially when closing the
  browser, causing failures on the bots because browsers were still
  running even though they should have been stopped. Puppeteer should do
  a better job with this because it uses the browser's test built-in
  instrumentation tools for this (the devtools protocol) which our code
  didn't. This also means that we don't have to pass
  parameters/preferences to tweak browser behavior anymore.
- It requires the browsers under test to be installed on the system,
  whereas Puppeteer downloads the browsers before the test. This means
  that setup is much easier (no more manual installations and browser
  manifest files) as well as testing with different browser versions
  (since they can be provisioned on demand). Moreover, this ensures that
  contributors always run the tests in both Firefox and Chrome,
  regardless of which browsers they have installed locally.
- It's all code we have to maintain, so Puppeteer abstracts away how the
  browsers start/stop for us so we don't have to keep that code.

By default, Puppeteer only installs one browser during installation,
hence the need for a post-install script to install the second browser.
This requires `cross-env` to make passing the environment variable work
on both Linux and Windows.

* Implement a command line flag to skip Chrome when running tests

To save time or resources during development it can be useful to run
tests only in Firefox. Previously this could be done by editing the
browser manifest file, but since that file is no longer used for
Puppeteer, this command line flag replaces it. For example, executing
`gulp unittest --noChrome` will only run the unit tests in Firefox.

* Include the name for interactive form elements

We already rendered the name for radio buttons, but it was missing for
all other interactive form elements. This commit adds that so that
values entered in form elements can be read based on the element name.

* Preserve error types during translation

By preserving the exception type, more fine-grained error handling can be performed via client-side logic (e.g. redirect to a search page if a PDF is not found, or to a ticket system in case of invalid PDF files).

The original exception is now re-thrown.

Fixes #11658

* Remove the `create-react-app` example (issue 11729)

Given that none of the PDF.js contributors know React, maintaining and/or providing supporting for the example isn't really feasible unfortunately.
Even something as simple as running/testing the example becomes difficult for anyone completely unfamiliar with React, and furthermore:

 - It's very difficult to tell if the example demonstrates React best-practices, since the PDF.js contributors don't know React.

 - We also have no reasonable way of keeping the example up-to-date with changes in React.

 - The React example, in its current form, is even *hard-coding* the PDF.js version to a now unsupported version.

 - The example is currently triggering "fake worker" usage, see issue 11729, which is really *really* bad. Note that the "fake worker" functionality is *only* intended as a fallback, and it should absolutely *not* under any circumstances be advertised and certainly shouldn't be triggered in official PDF.js examples.

* Update `npm` packages

* Update l10n files

* Remove the `gulp bundle` task since it's unused and doesn't really make sense

Not only is there no code depending on it now, the actual task itself doesn't even make sense as-is. Note that it uses the default `DEFINES` configuration *unaltered*, which is neither useful nor correct since the resulting build thus won't make sense without an actual built target set.

* Split the `createBundle` helper function, in gulpfile.js, into separate ones for the main/worker-thread files

All of the other *similar* helper functions only target one file per function, and there's no particular reason for this one to be different.
This patch will simplify future changes, e.g. experimenting with using `gulp watch` instead of SystemJS for the development viewer.

* Gracefully handle annotation parsing errors in `Page.getOperatorList` (issue 11871)

This should ensure that a page will always render successfully, even if there's errors during the Annotation fetching/parsing.
Additionally the `OperatorList.addOpList` method is also adjusted to ignore invalid data, to make it slightly more robust.

* Add more categories of unsupported features.

Fixes #11815

* Use the ESLint `no-restricted-syntax` rule to ensure that `assert` is always called with two arguments

Having `assert` calls without a message string isn't very helpful when debugging, and it turns out that it's easy enough to make use of ESLint to enforce better `assert` call-sites.
In a couple of cases the `assert` calls were changed to "regular" throwing of errors instead, since that seemed more appropriate.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-restricted-syntax

* Enable the ESLint `grouped-accessor-pairs` rule

This rule complements the existing `accessor-pairs` nicely, and ensures that a getter/setter pair is always consistently ordered.

Please find additional details about this rule at https://eslint.org/docs/rules/grouped-accessor-pairs

* Attempt to respect the "zoom" hash parameter, even when the "nameddest" parameter is present (issue 11875)

Given that the `PDFLinkService.setHash` method itself if completely synchronous, moving the handling of "nameddest" to occur last *shouldn't* cause any problems (famous last words).
This way the destination will still override any previous parameter, such as e.g. the "page", as expected. Furthermore, given that the `PDFLinkService.navigateTo` method is asynchronous that should provide additional guarantees that the "nameddest" parameter is always respected.

As sort-of expected, this fairly innocent looking change also required some tweaks in the `PDFHistory` to prevent dummy history entires upon document load (only an issue when both "page" *and* "nameddest" parameters are provided in the hash).

* Handle errors individually for each annotation in the `_parsedAnnotations` getter

While working on PR 11872, it occurred to me that it probably wouldn't be a bad idea to change the `_parsedAnnotations` getter to handle errors individually for each annotation. This way, one broken/corrupt annotation won't prevent the rest of them from being e.g. fetched through the API.

* Update ESLint to version 7

Please see https://eslint.org/blog/2020/05/eslint-v7.0.0-released for a list of notable changes.

* Enable the ESLint `default-case-last` rule, and tweak the existing `use-isnan` rule

These changes were made possible by ESLint version 7, and neither of these rules required any code changes.

Please find additional details about the ESLint rules at https://eslint.org/docs/rules/default-case-last and https://eslint.org/docs/rules/use-isnan

* Ensure that the `DEFINES` build target constants, in `gulpfile.js`, cannot be changed

* Add a `minified-es5` gulp task (issue 11858)

By re-factoring the existing gulp tasks, most of the code can be re-used for both the existing `gulp minified` as well as the new `gulp minified-es5` task.

* Reduce the usage of `require` statements in code-paths not protected by pre-processor and/or run-time checks

This replaces some additional `require`/`exports` usage with standard `import`/`export` statements instead.
Hence another, small, part in the effort to reduce the reliance on SystemJS-specific functionality in the development viewer.

* Update the `eslint-plugin-unicorn` package

* Update `npm` packages

* Update l10n files

* Remove the SystemJS dependency from the `web/preferences.js` file

Originally the `default_preferences.json` file was checked into the repository, and we thus needed to load it in non-PRODUCTION mode (which was originally done asynchronously using `XMLHttpRequest`). Over the years a lot has changed and the `default_preferences.json` file is now built, by the `gulp default_preferences` task, from the `web/app_options.js` file. Hence it's no longer necessary, in non-PRODUCTION mode, to use SystemJS here since we can simply use a standard `import` statement instead.

Note how e.g. `web/app.js` already imports from `web/app_options.js` in the same exact way that `web/preferences.js` now does, hence this patch will *not* result in any significant changes in the built/bundled viewer file.

This is another (small) part in trying to reduce usage of SystemJS, with the goal of hopefully getting rid of it completely. (I've started working on this, and doing so has identified a number of problem areas; this patch addresses one of them.)

* [Firefox] Allow PDF attachments to, once again, be opened directly in the browser (bug 1632644)

Apparently the old link format used in MOZCENTRAL-builds, with the blob URL separated from the filename with a `?` character violates the specification; see https://bugzilla.mozilla.org/show_bug.cgi?id=1632644#c5

Obviously just removing the `?`-part of the URL would have worked, but that would also have meant that we'd no longer be able to provide the correct filename when the user attempts to download the opened PDF attachment.
To fix this we'll instead append the filename in the hash-part of the URL, which however required using a *custom* hash-parameter to avoid triggering the fallback "named destination" code-paths in the viewer.

Note that only changing the `web/pdf_attachment_viewer.js` file wasn't sufficient to fix the bug, and we also need to tweak the `webViewerInitialized` function in `web/app.js` since MOZCENTRAL-builds used to ignore *everything* in the URL hash.
This particular code is very old, but changing it *should* be completely safe given that the `PDFViewerApplication.setTitleUsingUrl` method since some time now stores both the original URL (in `this.url`) as well as one without the hash (in `this.baseUrl`). The latter one is already used everywhere where it matters, so this change seem fine to me.

This patch thus restores the original behaviour for PDF attachments in the MOZCENTRAL-build, by once again allowing them to be opened *directly* in the browser without downloading. (The fallback added in PR 11845 is obviously kept, since it seems generally useful to have.)

* Remove unnecessary empty string fallback from the `getPDFFileNameFromURL` call in `web/pdf_document_properties.js` (PR 10114 follow-up)

Given that the `getPDFFileNameFromURL` helper function has a specific code-path for handling non-string inputs, this empty string fallback really isn't necessary at the call-site in `web/pdf_document_properties.js`.

* Re-factor `setPDFNetworkStreamFactory`, in src/display/api.js, to also accept an asynchronous function

As part of trying to reduce the usage of SystemJS in the development viewer, this patch is a necessary step that will allow removal of some `require` statements.

Currently this uses `SystemJS.import` in non-PRODUCTION mode, but it should be possible to replace those with standard *dynamic* `import` calls in the future.

* Convert the `src/pdf.js` and `src/pdf.worker.js` files to use standard `import`/`export` statements

As part of reducing our reliance on SystemJS in the development viewer, this patch replaces usage of `require` statements with modern standards `import`/`export` statements instead.

If we want to try and move forward with reducing usage of SystemJS, we don't have much choice but to make these kind changes (despite what prior test-results showed, however I'm no longer able to reproduce the issues locally).

* Reduce usage of SystemJS, in the development viewer, even further

With these changes SystemJS is now only used, during development, on the worker-thread and in the unit/font-tests, since Firefox is currently missing support for worker modules; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1247687

Hence all the JavaScript files in the `web/` and `src/display/` folders are now loaded *natively* by the browser (during development) using standard `import` statements/calls, thanks to a nice `import-maps` polyfill.

*Please note:* As soon as https://bugzilla.mozilla.org/show_bug.cgi?id=1247687 is fixed in Firefox, we should be able to remove all traces of SystemJS and thus finally be able to use every possible modern JavaScript feature.

* Attempt to cache repeated images at the document, rather than the page, level (issue 11878)

Currently image r…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants