[api-minor] Include the document /Lang attribute in the textContent-data #17941

Merged
Snuffleupagus merged 1 commit into mozilla:master from getTextContent-lang
May 14, 2024

Collaborator

@Snuffleupagus Snuffleupagus commented Apr 15, 2024

  • These changes will allow a simpler way of implementing PR #17770 ("Add language attribute to canvas").

  • The /Lang attribute is fetched lazily, on the first getTextContent invocation. Given the existing worker-thread caching, it only needs to be fetched once per PDF document (and most PDFs don't include this data).

  • This makes the /Lang attribute directly available in the textLayer, which has the following advantages:

    • We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer).

    • Third-party users of the textLayer will automatically benefit from this, once we start actually using the /Lang attribute in PR #17770 ("Add language attribute to canvas").
      Please note: This also, importantly, means that the text reference-tests will then cover this code (which wouldn't otherwise have been the case).
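
To illustrate the "fetched lazily, once per document" point above, here is a minimal sketch. It is not the actual patch: `FakeDocument`, `getLang`, and `langLookups` are hypothetical names made up for this example, and the only thing assumed about the real API is that the object returned by getTextContent carries the document language next to its items.

```javascript
// Hypothetical stand-in for the worker-side document; NOT pdf.js's real classes.
class FakeDocument {
  constructor(lang) {
    this._lang = lang; // value of the catalog's /Lang entry, or null if absent
    this._langPromise = null; // cache slot, filled on first use
    this.langLookups = 0; // counts how often /Lang is actually read
  }

  // Lazily read /Lang: the promise is created on the first call and reused afterwards.
  getLang() {
    if (!this._langPromise) {
      this._langPromise = Promise.resolve().then(() => {
        this.langLookups++;
        return this._lang;
      });
    }
    return this._langPromise;
  }

  // Every page's text content includes the document-level language.
  async getTextContent(pageIndex) {
    const lang = await this.getLang();
    return { items: [{ str: `text of page ${pageIndex}` }], lang };
  }
}

// Request text content for two pages concurrently and report how often
// the (simulated) /Lang lookup ran.
async function demo() {
  const doc = new FakeDocument("en-US");
  const [first, second] = await Promise.all([
    doc.getTextContent(0),
    doc.getTextContent(1),
  ]);
  return { langs: [first.lang, second.lang], lookups: doc.langLookups };
}
```

Because the promise itself is cached rather than its resolved value, concurrent getTextContent calls share a single lookup. A consumer could then read `lang` from the returned object without the viewer having to fetch it up front or pass it around.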

@Snuffleupagus Snuffleupagus force-pushed the getTextContent-lang branch 2 times, most recently from de07ea9 to 6dd2dd8 Compare April 22, 2024 10:05
@Snuffleupagus Snuffleupagus force-pushed the getTextContent-lang branch 3 times, most recently from 4d811e1 to 0ac822e Compare May 3, 2024 11:01
@mozilla mozilla deleted 4 comments from moz-tools-bot May 3, 2024
@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/eabbc1b8aad8045/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/e959d8976999b5e/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/e959d8976999b5e/output.txt

Total script time: 27.60 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 18
  different first/second rendering: 2

Image differences available at: http://54.241.84.105:8877/e959d8976999b5e/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/eabbc1b8aad8045/output.txt

Total script time: 40.41 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 7

Image differences available at: http://54.193.163.58:8877/eabbc1b8aad8045/reftest-analyzer.html#web=eq.log

@Snuffleupagus Snuffleupagus marked this pull request as ready for review May 3, 2024 11:55
Contributor

@timvandermeij timvandermeij left a comment

Looks good to me, with the two questions below answered/addressed. Thanks!

Review comment on src/display/text_layer.js (outdated, resolved)
@Snuffleupagus Snuffleupagus force-pushed the getTextContent-lang branch from 0ac822e to 6588c30 Compare May 7, 2024 14:14
@timvandermeij
Contributor

/botio-linux unittest

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @timvandermeij received. Current queue size: 1

Live output at: http://54.241.84.105:8877/88e047ea12f3d16/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/88e047ea12f3d16/output.txt

Total script time: 2.62 mins

  • Unit Tests: Passed

Contributor

@timvandermeij timvandermeij left a comment

r=me, with passing Windows tests once the bot is working again. Thank you for the patch!

@Snuffleupagus Snuffleupagus force-pushed the getTextContent-lang branch from 6588c30 to 6d523c3 Compare May 14, 2024 10:45
@Snuffleupagus
Collaborator Author

/botio test

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/af6b465e78d0e4d/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/cebfb7f73a0d0af/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/cebfb7f73a0d0af/output.txt

Total script time: 27.43 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 19
  different first/second rendering: 1

Image differences available at: http://54.241.84.105:8877/cebfb7f73a0d0af/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/af6b465e78d0e4d/output.txt

Total script time: 40.22 mins

  • Unit tests: FAILED
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 6

Image differences available at: http://54.193.163.58:8877/af6b465e78d0e4d/reftest-analyzer.html#web=eq.log

@Snuffleupagus Snuffleupagus merged commit bb9bb34 into mozilla:master May 14, 2024
9 checks passed
@Snuffleupagus Snuffleupagus deleted the getTextContent-lang branch May 14, 2024 11:57