Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: move validations for text decoder to c++ #45388

Closed
wants to merge 1 commit into from

Conversation

anonrig
Copy link
Member

@anonrig anonrig commented Nov 9, 2022

This is the 3rd pull request for my text-decoder performance patches. It removes any unnecessary check and moves them to the C++ side. I'll update the description when benchmark CI is complete.

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1214/

                                                                                                 confidence improvement accuracy (*)    (**)   (***)
util/text-decoder.js type='ArrayBuffer' n=100 len=16384 ignoreBOM=0 encoding='iso-8859-3'                       -7.02 %       ±9.20% ±12.25% ±15.98%
util/text-decoder.js type='ArrayBuffer' n=100 len=16384 ignoreBOM=0 encoding='latin1'                           -8.56 %      ±10.35% ±13.81% ±18.03%
util/text-decoder.js type='ArrayBuffer' n=100 len=16384 ignoreBOM=0 encoding='utf-8'                            -3.94 %       ±7.71% ±10.27% ±13.38%
util/text-decoder.js type='ArrayBuffer' n=100 len=16384 ignoreBOM=1 encoding='iso-8859-3'                        2.52 %       ±9.46% ±12.59% ±16.40%
util/text-decoder.js type='ArrayBuffer' n=100 len=16384 ignoreBOM=1 encoding='latin1'                           -1.23 %       ±9.25% ±12.33% ±16.09%
util/text-decoder.js type='ArrayBuffer' n=100 len=16384 ignoreBOM=1 encoding='utf-8'                            -1.40 %       ±7.54% ±10.04% ±13.07%
util/text-decoder.js type='ArrayBuffer' n=100 len=256 ignoreBOM=0 encoding='iso-8859-3'                 ***     22.19 %      ±11.86% ±15.78% ±20.55%
util/text-decoder.js type='ArrayBuffer' n=100 len=256 ignoreBOM=0 encoding='latin1'                     ***     32.85 %      ±12.41% ±16.53% ±21.54%
util/text-decoder.js type='ArrayBuffer' n=100 len=256 ignoreBOM=0 encoding='utf-8'                       **     14.43 %       ±9.69% ±12.89% ±16.78%
util/text-decoder.js type='ArrayBuffer' n=100 len=256 ignoreBOM=1 encoding='iso-8859-3'                 ***     28.77 %      ±12.55% ±16.70% ±21.73%
util/text-decoder.js type='ArrayBuffer' n=100 len=256 ignoreBOM=1 encoding='latin1'                     ***     28.74 %      ±13.39% ±17.83% ±23.22%
util/text-decoder.js type='ArrayBuffer' n=100 len=256 ignoreBOM=1 encoding='utf-8'                      ***     16.03 %       ±8.30% ±11.05% ±14.42%
util/text-decoder.js type='ArrayBuffer' n=100 len=524288 ignoreBOM=0 encoding='iso-8859-3'                       7.19 %       ±7.88% ±10.48% ±13.64%
util/text-decoder.js type='ArrayBuffer' n=100 len=524288 ignoreBOM=0 encoding='latin1'                          -2.51 %       ±6.77%  ±9.01% ±11.72%
util/text-decoder.js type='ArrayBuffer' n=100 len=524288 ignoreBOM=0 encoding='utf-8'                            0.14 %       ±6.88%  ±9.16% ±11.92%
util/text-decoder.js type='ArrayBuffer' n=100 len=524288 ignoreBOM=1 encoding='iso-8859-3'                      -0.41 %       ±7.27%  ±9.67% ±12.59%
util/text-decoder.js type='ArrayBuffer' n=100 len=524288 ignoreBOM=1 encoding='latin1'                           1.48 %       ±7.13%  ±9.49% ±12.36%
util/text-decoder.js type='ArrayBuffer' n=100 len=524288 ignoreBOM=1 encoding='utf-8'                            0.62 %       ±6.23%  ±8.29% ±10.79%
util/text-decoder.js type='Buffer' n=100 len=16384 ignoreBOM=0 encoding='iso-8859-3'                            -2.13 %      ±11.79% ±15.68% ±20.41%
util/text-decoder.js type='Buffer' n=100 len=16384 ignoreBOM=0 encoding='latin1'                                -3.48 %      ±11.47% ±15.28% ±19.92%
util/text-decoder.js type='Buffer' n=100 len=16384 ignoreBOM=0 encoding='utf-8'                                 -7.71 %      ±13.21% ±17.59% ±22.93%
util/text-decoder.js type='Buffer' n=100 len=16384 ignoreBOM=1 encoding='iso-8859-3'                            -2.91 %      ±11.03% ±14.68% ±19.11%
util/text-decoder.js type='Buffer' n=100 len=16384 ignoreBOM=1 encoding='latin1'                                 1.95 %       ±9.36% ±12.46% ±16.24%
util/text-decoder.js type='Buffer' n=100 len=16384 ignoreBOM=1 encoding='utf-8'                                 -6.77 %      ±12.67% ±16.86% ±21.95%
util/text-decoder.js type='Buffer' n=100 len=256 ignoreBOM=0 encoding='iso-8859-3'                      ***     37.90 %      ±12.51% ±16.66% ±21.72%
util/text-decoder.js type='Buffer' n=100 len=256 ignoreBOM=0 encoding='latin1'                          ***     28.82 %      ±11.43% ±15.23% ±19.87%
util/text-decoder.js type='Buffer' n=100 len=256 ignoreBOM=0 encoding='utf-8'                           ***     24.74 %      ±13.10% ±17.54% ±23.06%
util/text-decoder.js type='Buffer' n=100 len=256 ignoreBOM=1 encoding='iso-8859-3'                      ***     31.08 %      ±11.10% ±14.81% ±19.34%
util/text-decoder.js type='Buffer' n=100 len=256 ignoreBOM=1 encoding='latin1'                          ***     18.27 %       ±9.55% ±12.71% ±16.54%
util/text-decoder.js type='Buffer' n=100 len=256 ignoreBOM=1 encoding='utf-8'                           ***     22.99 %      ±10.36% ±13.79% ±17.97%
util/text-decoder.js type='Buffer' n=100 len=524288 ignoreBOM=0 encoding='iso-8859-3'                           -0.68 %       ±5.90%  ±7.86% ±10.27%
util/text-decoder.js type='Buffer' n=100 len=524288 ignoreBOM=0 encoding='latin1'                                0.15 %       ±6.61%  ±8.79% ±11.44%
util/text-decoder.js type='Buffer' n=100 len=524288 ignoreBOM=0 encoding='utf-8'                                 1.44 %       ±5.09%  ±6.77%  ±8.82%
util/text-decoder.js type='Buffer' n=100 len=524288 ignoreBOM=1 encoding='iso-8859-3'                           -2.64 %       ±6.48%  ±8.62% ±11.23%
util/text-decoder.js type='Buffer' n=100 len=524288 ignoreBOM=1 encoding='latin1'                                3.62 %       ±6.72%  ±8.94% ±11.64%
util/text-decoder.js type='Buffer' n=100 len=524288 ignoreBOM=1 encoding='utf-8'                                 3.34 %       ±5.21%  ±6.93%  ±9.02%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=16384 ignoreBOM=0 encoding='iso-8859-3'                  3.80 %       ±9.18% ±12.21% ±15.89%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=16384 ignoreBOM=0 encoding='latin1'                     -1.58 %      ±10.41% ±13.87% ±18.07%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=16384 ignoreBOM=0 encoding='utf-8'                      -4.04 %       ±9.79% ±13.06% ±17.06%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=16384 ignoreBOM=1 encoding='iso-8859-3'                  5.56 %      ±10.50% ±14.01% ±18.31%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=16384 ignoreBOM=1 encoding='latin1'                      1.38 %       ±9.58% ±12.77% ±16.65%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=16384 ignoreBOM=1 encoding='utf-8'                       0.72 %       ±6.72%  ±8.95% ±11.68%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=256 ignoreBOM=0 encoding='iso-8859-3'           ***     16.70 %       ±9.00% ±11.97% ±15.59%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=256 ignoreBOM=0 encoding='latin1'               ***     22.84 %      ±10.16% ±13.53% ±17.65%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=256 ignoreBOM=0 encoding='utf-8'                 **     17.96 %      ±11.81% ±15.73% ±20.51%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=256 ignoreBOM=1 encoding='iso-8859-3'           ***     33.84 %      ±10.52% ±14.05% ±18.39%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=256 ignoreBOM=1 encoding='latin1'               ***     28.41 %      ±10.75% ±14.34% ±18.73%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=256 ignoreBOM=1 encoding='utf-8'                ***     24.81 %       ±6.80%  ±9.05% ±11.78%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=524288 ignoreBOM=0 encoding='iso-8859-3'                 1.99 %       ±6.46%  ±8.59% ±11.19%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=524288 ignoreBOM=0 encoding='latin1'                     0.89 %       ±6.93%  ±9.23% ±12.02%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=524288 ignoreBOM=0 encoding='utf-8'                     -0.38 %       ±7.79% ±10.36% ±13.49%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=524288 ignoreBOM=1 encoding='iso-8859-3'                 2.89 %       ±6.31%  ±8.40% ±10.94%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=524288 ignoreBOM=1 encoding='latin1'                     4.58 %       ±6.05%  ±8.05% ±10.47%
util/text-decoder.js type='SharedArrayBuffer' n=100 len=524288 ignoreBOM=1 encoding='utf-8'                     -2.54 %       ±7.23%  ±9.63% ±12.55%

Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case, there are 54 comparisons, you can thus
expect the following amount of false-positive results:
  2.70 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.54 false positives, when considering a   1% risk acceptance (**, ***),
  0.05 false positives, when considering a 0.1% risk acceptance (***)

CC @nodejs/performance @nodejs/util

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. encoding Issues and PRs related to the TextEncoder and TextDecoder APIs. i18n-api Issues and PRs related to the i18n implementation. needs-ci PRs that need a full CI run. labels Nov 9, 2022
@anonrig anonrig added the performance Issues and PRs related to the performance of Node.js. label Nov 9, 2022
@anonrig anonrig force-pushed the perf/text-decoder-3 branch 2 times, most recently from 0437868 to 1bf201e Compare November 9, 2022 14:41
Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a heads up, while I personally really like this change and think it’s a good idea, this isn’t really in line with what the project has been doing in the past years in terms of input validation (which is to validate in JS and add CHECKs in C++).

src/node_i18n.cc Outdated Show resolved Hide resolved
@anonrig anonrig force-pushed the perf/text-decoder-3 branch from 1bf201e to c74850f Compare November 9, 2022 14:48
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Nov 9, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Nov 9, 2022
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@anonrig
Copy link
Member Author

anonrig commented Nov 11, 2022

Landed in d6c10e1

@anonrig anonrig closed this Nov 11, 2022
anonrig added a commit that referenced this pull request Nov 11, 2022
PR-URL: #45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Trott pushed a commit to Trott/io.js that referenced this pull request Nov 12, 2022
PR-URL: nodejs#45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@Trott
Copy link
Member

Trott commented Nov 12, 2022

Re-landed in ca3ed36

ruyadorno pushed a commit that referenced this pull request Nov 21, 2022
PR-URL: #45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@ruyadorno ruyadorno mentioned this pull request Nov 24, 2022
anonrig added a commit to anonrig/node that referenced this pull request Nov 27, 2022
PR-URL: nodejs#45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
anonrig added a commit to anonrig/node that referenced this pull request Nov 27, 2022
PR-URL: nodejs#45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
anonrig added a commit to anonrig/node that referenced this pull request Nov 28, 2022
PR-URL: nodejs#45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@anonrig anonrig added lts-watch-v14.x lts-watch-v18.x PRs that may need to be released in v18.x. labels Nov 28, 2022
danielleadams pushed a commit that referenced this pull request Dec 30, 2022
PR-URL: #45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
danielleadams pushed a commit that referenced this pull request Dec 30, 2022
PR-URL: #45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
danielleadams pushed a commit that referenced this pull request Jan 3, 2023
PR-URL: #45388
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@targos targos added backported-to-v18.x PRs backported to the v18.x-staging branch. and removed lts-watch-v18.x PRs that may need to be released in v18.x. labels Oct 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backported-to-v18.x PRs backported to the v18.x-staging branch. c++ Issues and PRs that require attention from people who are familiar with C++. encoding Issues and PRs related to the TextEncoder and TextDecoder APIs. i18n-api Issues and PRs related to the i18n implementation. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants