
Performance studies #8

Open · staltz opened this issue Nov 26, 2021 · 11 comments

@staltz (Owner) commented Nov 26, 2021

As of commit bb5b98f built with ReleaseSmall:

| operation | ops/ms |
| --- | ---: |
| bipf.encode | 1.851 |
| bipf-native.encode | 0.133 |
| bipf.decode | 7.326 |
| bipf-native.decode | 2.098 |
| JSON.stringify | 3.672 |
| JSON.parse | 5.189 |
| JSON.parse(buffer) | 5.574 |
| JSON.stringify(JSON.parse()) | 3.135 |
| bipf.seek(string) | 1250 |
| bipf-native.seek(string) | 98.04 |

cc @jerive
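The ops/ms figures above presumably come from a timing loop like the one in perf.js. As a minimal, hypothetical sketch (the `opsPerMs` helper below is my own, not from the repo), such a number can be computed like this:

```javascript
// Hypothetical micro-benchmark helper: run `fn` `iterations` times
// and report throughput in operations per millisecond.
function opsPerMs(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return iterations / elapsedMs;
}

// Example: measure JSON.stringify on a small object.
const obj = { name: 'test', value: 42, nested: { list: [1, 2, 3] } };
console.log('JSON.stringify:', opsPerMs(() => JSON.stringify(obj), 10000).toFixed(2), 'ops/ms');
```

Note that single-shot loops like this are noisy; the absolute numbers matter less than the relative ordering between implementations.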

@staltz (Owner, Author) commented Nov 26, 2021

I ran Valgrind on bipf-native running perf.js (the encode part only), and then I ran Valgrind on bipf running the same perf.js. Here's what I found (screenshots from KCachegrind, sorted by Called):

bipf-native

Screenshot from 2021-11-26 18-14-57@2x

bipf

Screenshot from 2021-11-26 18-16-05@2x

Notes

What I think is going on is that N-API probably wraps each call in a V8 TryCatch, and try/catch is known to be bad for performance if you write it by hand in JS. The other alternative is implementing bipf-native against the V8 APIs directly; maybe that way we can take shortcuts.
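As a rough illustration of the try/catch concern in plain JS, here is a hypothetical microbenchmark (not from the repo; note that modern V8 optimizes try/catch fairly well, so the gap it shows may be small):

```javascript
// Compare a hot function with and without a try/catch wrapper,
// analogous to N-API adding a TryCatch around every native call.
function plain(x) { return x * 2; }
function wrapped(x) {
  try { return x * 2; } catch (e) { return 0; }
}

// Time one million calls and also accumulate results so the JIT
// cannot dead-code-eliminate the loop body.
function time(fn) {
  const start = process.hrtime.bigint();
  let acc = 0;
  for (let i = 0; i < 1e6; i++) acc += fn(i);
  return { ms: Number(process.hrtime.bigint() - start) / 1e6, acc };
}

console.log('plain:  ', time(plain).ms.toFixed(2), 'ms');
console.log('wrapped:', time(wrapped).ms.toFixed(2), 'ms');
```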

@staltz (Owner, Author) commented Nov 26, 2021

@jerive I think we should try implementing encodingLength and encode using the V8 APIs directly and see what the performance looks like. We have no use for bipf-native if it's slow.

@jerive (Contributor) commented Nov 26, 2021

Sure, it's what I was afraid of.
I'm currently trying to include the V8 headers...

@jerive (Contributor) commented Nov 26, 2021

I will certainly need support here: Zig does not find some headers that are part of the C++ stdlib.

```cpp
// v8.h
#include <atomic>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>
```

It should be supported (ziglang/zig#4786), but I have no clue how.
I will leave it here for tonight.

@staltz (Owner, Author) commented Nov 26, 2021

I will give it a try too, tomorrow

@jerive (Contributor) commented Nov 27, 2021

On Ubuntu, the header files are in the libc++-11-dev package. Now, how to link them?

@staltz (Owner, Author) commented Nov 27, 2021

Can you try `-lc++`?

@staltz (Owner, Author) commented Nov 27, 2021

See this issue in Node.js nodejs/node#14379 (comment)

It seems the performance drawbacks of N-API were acknowledged, but "compared the results to the original module that used V8 APIs ... overhead of N-API is fairly minimal already". I'm concerned about what that "minimal overhead" means. If converting bipf-native to the V8 APIs only sped it up 2x compared to the N-API bipf-native, that wouldn't be good enough. We would need roughly 20x to make a V8-based bipf-native at least 2x faster than bipf.
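For concreteness, here is the arithmetic behind that estimate, using the encode numbers from the table in the first comment (by these numbers the required speedup works out closer to ~28x, in the same ballpark as the 20x mentioned):

```javascript
// Encode throughput from the benchmark table (ops/ms):
const bipfEncode = 1.851;     // pure JS bipf
const nativeEncode = 0.133;   // N-API bipf-native

// How much slower the native encode currently is than pure JS:
const slowdown = bipfEncode / nativeEncode;            // ≈ 13.9x

// Speedup over the current native version needed for a V8-based
// bipf-native to be at least 2x faster than pure-JS bipf:
const neededSpeedup = (2 * bipfEncode) / nativeEncode; // ≈ 27.8x

console.log(slowdown.toFixed(1) + 'x slower,', 'need ' + neededSpeedup.toFixed(1) + 'x');
```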

@dominictarr commented
I'll bet you it's the overhead of the JS-native API, especially since you are doing many tiny calls.
The thing that originally led me to creating bipf was realizing that grep could process the SSB log in less than a second, while JSON parsing took several minutes.

I would be interested to see how fast it went if you ran the benchmark inside Zig; you wouldn't be able to do encode, but it would do seek, etc. I'd also be curious how it went with wasm. However, you'd have to copy each record into wasm memory to be able to run bipf on it. What would be better is to copy a whole block into wasm memory and then scan many bipf records without leaving wasm, but in practical terms that means reimplementing flume/jitdb in wasm.

@staltz (Owner, Author) commented Dec 3, 2021

Yes, @dominictarr, it smells like a bridging overhead indeed. What I'm wondering is why there is a copy in the first place. There's no FFI. How can JSON.stringify/parse operate without copying, and, if we have access to the V8 APIs, couldn't we get as close as JSON does?

@staltz (Owner, Author) commented Dec 3, 2021

Yeah, so we tried to implement BIPF in C++ first using N-API and then V8 APIs, and it's a bit hopeless: jerive/bipf-napi#1

TL;DR: the problem isn't that the bridge between C++ and JS is slow; the problem is that pure JS code is usually JIT-compiled to machine code, and that's highly efficient. In the benchmarks in the linked issue, the bottleneck was literally just v8::Number::New(). Bottom line: JS is fast, and the way to be competitive is to replace large parts of the program, or all of it, with C++ or another low-level language. Replacing small parts doesn't seem to work.
