Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util.inspect is slow for large sparse arrays #14487

Closed
not-an-aardvark opened this issue Jul 25, 2017 · 9 comments
Closed

util.inspect is slow for large sparse arrays #14487

not-an-aardvark opened this issue Jul 25, 2017 · 9 comments
Labels
performance Issues and PRs related to the performance of Node.js. util Issues and PRs related to the built-in util module.

Comments

@not-an-aardvark
Copy link
Contributor

  • Version: 8.2.1
  • Platform: macOS
  • Subsystem: util
  1. Open the REPL
  2. Enter Array(100000000)

After about 15 seconds, [ <100000000 empty items> ] is printed.

In contrast, entering the same thing into the Chrome devtools console results in an output almost immediately.

It looks like this happens because util.inspect iterates from 0 to array.length. It might be better to iterate over Object.keys(array) instead.

@not-an-aardvark not-an-aardvark added the util Issues and PRs related to the built-in util module. label Jul 25, 2017
@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jul 25, 2017

BTW, v4.8.4 just crashes after some seconds:

Output:
> node.4.8.4.v8-4.5.exe
> Array(100000000)

<--- Last few GCs --->

   48905 ms: Scavenge 1165.3 (1212.7) -> 1165.3 (1212.7) MB, 0.4 / 0 ms (+ 40.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delayingmark-sweep].
   49298 ms: Mark-sweep 1165.3 (1212.7) -> 577.9 (616.7) MB, 393.3 / 0 ms (+ 79.0 ms in 10 steps since start of marking, biggest step 40.0 ms) [last resort gc].

   49638 ms: Mark-sweep 577.9 (616.7) -> 577.9 (615.7) MB, 340.0 / 0 ms [last resort gc].


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 000002CAB57373A9 <JS Object>
    2: formatArray(aka formatArray) [util.js:~510] [pc=000002C888C383BE] (this=000002CAB5704131 <undefined>,ctx=000002CAB57F93E1 <an Object with map 0000005B2F645AD1>,value=000002CAB57F93C1 <JS Array[100000000]>,recurseTimes=2,visibleKeys=000002CAB57F9389 <an Object with map 0000005B2F6065C9>,keys=000002CAB57FE441 <JSArray[0]>)
    3: formatValue(aka formatValue) [util.js:447] [pc=000002C888...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Jul 25, 2017

v6.11.1 returns immediately:

Output:
> node.6.11.1.v8-5.1.exe
> Array(100000000)
[ ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ,
  ... 99999900 more items ]

@vsemozhetbyt
Copy link
Contributor

Possible cause: #11576

@vsemozhetbyt vsemozhetbyt added the performance Issues and PRs related to the performance of Node.js. label Jul 25, 2017
@andrasq
Copy link

andrasq commented Jul 25, 2017

indeed. Versions of node before #11576 examined only the first 100 array elements; with the change they do 100 million String() casts. (iterating over 100e6 elements is very fast, and the hasOwnProperty test on each index is fast enough.)

The String() cast is a holdover from older versions of the code but is not necessary, letting hasOwnProperty(1) coerce the numeric 1 is 17x faster: for the 100e6 length array, 0.79 sec vs 12.85 sec.

% node -p 'x = [0,0,0]; x.hasOwnProperty(1)'
true

@not-an-aardvark
Copy link
Contributor Author

(iterating over 100e6 elements is very fast, and the hasOwnProperty test on each index is fast enough.)

Would it be better to iterate over keys instead, avoiding linear-time complexity at all? 1e8 was just an example, but it practice the length could easily be larger (up to 232-1).

@BridgeAR
Copy link
Member

@not-an-aardvark using Object.keys is bad for the average case where arrays do not contain huge holes.

@not-an-aardvark
Copy link
Contributor Author

That makes sense (it would result in a new large array being generated in the common case, even when only the first 100 elements will be displayed).

For what it's worth, iterating over the array with a for-in loop would solve both issues.

I don't think removing the String() cast is a good solution on its own. It would improve performance in general, but it still wouldn't solve the issue that sparse arrays take O(length) time to inspect.

@andrasq
Copy link

andrasq commented Jul 26, 2017

correct, String() is orthogonal to the O(n) scan, but it accounts for 94% of the runtime, making the scan a minor second-order effect.

The code already fetches the Object.keys of the array, and the keys are passed as a hash to formatArray. If the array itself were also available (instead of just the hash), the first 100 entries in the keys array contain all the information needed to compose the formatted output (because Object.keys omits unset indexes)

% node -p 'Object.keys([1,,3,4])'
[ '0', '2', '3' ]

Edit: actually, the keys are also available in formatArray, so no need to test properties

@BridgeAR
Copy link
Member

@andrasq that is true and I realized that as well when looking at the code but it is actually bad that Object.keys is used as it is not necessary for arrays when using for in. I am pretty sure inspect has quite a few more optimization possibilities overall and this would be a good follow up for this. But using for in in general for arrays seems like the best solution on the long run as you only have to inspect up to the number of visible entries.

@refack refack closed this as completed in 95bbb68 Aug 13, 2017
addaleax pushed a commit that referenced this issue Aug 13, 2017
* improve util.inspect performance
  This is a huge performance improvement in case of sparse arrays
  when using util.inspect as the hole will simple be skipped.

* use faster visibleKeys property lookup

* add inspect-array benchmark

PR-URL: #14492
Fixes: #14487
Reviewed-By: Alexey Orlenko <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js. util Issues and PRs related to the built-in util module.
Projects
None yet
Development

No branches or pull requests

4 participants