perf(sourcemap): use simd to escape JSON string #4487
CodSpeed Performance Report: merging #4487 will improve performance by 3.56%.
conflict
SIMD! Now we're talking! I've made a couple of quick comments, but not reviewed in full. Please don't merge before I've had a chance to do that. I'm afraid I'm tied up this weekend, so that won't be until Monday.
10% speed up!

Unfortunately, I think the way we're using If you build on same machine as you run the code on, no problem. But if you build a binary on a machine which has AVX2 support, that binary will crash if run on another x86 machine which only has SSE2. This seems like a bad idea - people usually expect they're building portable binaries.

```rust
fn main() {
    #[cfg(any(target_arch = "x86_64", target_arch = "x86"))]
    {
        if is_x86_feature_detected!("sse2") {
            println!("cargo:rustc-cfg=v_escape_sse");
        }
        if is_x86_feature_detected!("avx2") {
            println!("cargo:rustc-cfg=v_escape_avx");
        }
    }
}
```

```rust
pub fn b_escape<B: buf_min::Buffer>(s: &[u8], buf: &mut B) {
    #[allow(unused_unsafe)]
    unsafe {
        _b_escape(s, buf)
    }
}
```

```rust
pub unsafe fn _b_escape<B: buf_min::Buffer>(bytes: &[u8], buf: &mut B) {
    #[cfg(not(v_escape_avx))]
    {
        #[cfg(not(v_escape_sse))]
        {
            scalar::b_escape(bytes, buf)
        }
        #[cfg(v_escape_sse)]
        {
            ranges::sse::b_escape(bytes, buf)
        }
    }
    #[cfg(v_escape_avx)]
    {
        ranges::avx::b_escape(bytes, buf)
    }
}
```

This problem is mentioned in zzau13/v_escape#54. It looks like we'd be better off using
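
For contrast with the build-time detection quoted above, here is a minimal sketch of checking CPU features at runtime instead, so that one binary can run on both AVX2 and SSE2-only machines. It assumes nothing about `v_escape`'s API: `needs_escape`, `count_escapes_scalar`, `count_escapes_avx2` and `count_escapes` are hypothetical names, and the "AVX2" path only enables the feature for the compiler rather than using hand-written intrinsics. It is not the change that landed in this PR.

```rust
/// Hypothetical helper: true if the byte must be escaped inside a JSON string.
#[inline]
fn needs_escape(b: u8) -> bool {
    b == b'"' || b == b'\\' || b < 0x20
}

/// Portable fallback that works on every target.
fn count_escapes_scalar(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| needs_escape(b)).count()
}

/// Same logic, compiled with AVX2 enabled for this function only.
/// Calling it on a CPU without AVX2 is undefined behaviour, hence `unsafe`.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn count_escapes_avx2(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| needs_escape(b)).count()
}

/// Picks an implementation when the program runs, not when it is built.
pub fn count_escapes(bytes: &[u8]) -> usize {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the running CPU just above.
            return unsafe { count_escapes_avx2(bytes) };
        }
    }
    count_escapes_scalar(bytes)
}
```

The difference from the quoted `build.rs` is where `is_x86_feature_detected!` executes: in a build script it describes the build machine, while in library code it describes the machine the binary is actually running on.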
Also, perhaps we should merge #4528 first (if it's correct) so we can see the effect of this PR on benchmarks once the benchmark is measuring what it's meant to.
I agree; I can't even run So let's just pre-allocate in this PR and implement JSON escaping in separate follow-up PRs.
I think we can go forwards with this PR if we switch to using

I agree that longer term, we'd be better off maintaining our own crate for SIMD ops (lexer and various other parts of Oxc could also benefit a lot from SIMD) and support aarch64 + wasm32 (and also AVX512, which would likely be almost x2 speed boost for some workloads on CPUs that support it).

But for now, this PR is producing a significant speed boost for common case of x86_64, so I think it's worthwhile getting these changes in to Oxc (assuming we can remove the dodgy

Personally, I think implementing a SIMD crate will not be a straightforward task. Not so much the code itself, but building the testing infra around it to make sure it's completely solid on all targets (some discussion of this in #2285). But maybe you're familiar @Brooooooklyn with the cross-platform testing side of things from napi-rs?
Yes, I agree it would be a good idea to split the pre-allocation optimization and the switch to
The problem Dunqing raised is now solved.
Also optimize the memory allocation in string escaping. The default size in `serde_json` is 1024 for the String type; we pre-allocate `string.len() * 2 + 2` for every string to reduce re-allocation while escaping. I tried to hand-write a SIMD implementation, but it was too complex, so I use `v_jsonescape` here. It doesn't have SIMD implementations for `aarch64` and `wasm32`, though; we need to contribute them!
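
The pre-allocation described above can be illustrated with a small scalar sketch. This is not the code from this PR (which delegates escaping to `v_jsonescape`); `escape_json_string` is a hypothetical name and the escaping loop is a plain per-character implementation. Only the `len * 2 + 2` sizing is taken from the description, and since a control character expands to six bytes, that capacity is a heuristic that may still grow for unusual input.

```rust
use std::fmt::Write;

/// Hypothetical helper: JSON-escapes `s` into a quoted string,
/// reserving `s.len() * 2 + 2` bytes up front (sizing from the PR description).
pub fn escape_json_string(s: &str) -> String {
    // Two bytes for the surrounding quotes, plus slack for escape sequences.
    let mut out = String::with_capacity(s.len() * 2 + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            '\u{08}' => out.push_str("\\b"),
            '\u{0C}' => out.push_str("\\f"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => {
                // Writing to a String cannot fail, so the result is ignored.
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

For typical source text, where few characters need escaping, the single up-front reservation means the output buffer is allocated once instead of growing repeatedly as bytes are appended.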
Merging! Hooray. @Brooooooklyn I hope you haven't found my approach to review too annoyingly nit-picky. If it is, please say so. I'm new to being maintainer on a popular project. I know Boshen takes a much more light-touch "if the tests pass, merge it" approach, so I'm not sure if I'm out of line, and would appreciate any feedback.
@overlookmotel don't worry about that, you are the best
## [0.23.0] - 2024-08-01

- 27fd062 sourcemap: [**BREAKING**] Avoid passing `Result`s (#4541) (overlookmotel)

### Features

- a558492 codegen: Implement `BinaryExpressionVisitor` (#4548) (Boshen)
- 7446e98 codegen: Align more esbuild implementations (#4510) (Boshen)
- 35654e6 codegen: Align operator precedence with esbuild (#4509) (Boshen)
- b952942 linter: Add eslint/no-unused-vars (⭐ attempt 3.2) (#4445) (DonIsaac)
- 85e8418 linter: Add react/jsx-curly-brace-presence (#3949) (Don Isaac)
- cf1854b semantic: Remove `ReferenceFlags::Value` from non-type-only exports that referenced type binding (#4511) (Dunqing)

### Bug Fixes

- b58ed80 codegen: Enable more test cases (#4585) (Boshen)
- 6a94e3f codegen: Fixes for esbuild test cases (#4503) (Boshen)
- d5c4b19 parser: Fix enum member parsing (#4543) (DonIsaac)

### Performance

- 4c6d19d allocator: Use capacity hint (#4584) (Luca Bruno)
- 7585e16 linter: Remove allocations for string comparisons (#4570) (DonIsaac)
- 55a8763 parser: Faster decoding unicode escapes in identifiers (#4579) (overlookmotel)
- ae1d38f parser: Fast path for ASCII when checking char after numeric literal (#4577) (overlookmotel)
- 56ae615 parser: Make not at EOF the hot path in `Source` methods (#4576) (overlookmotel)
- 25679e6 parser: Optimize `Lexer::hex_digit` (#4572) (overlookmotel)
- bb33bcc parser: Speed up lexing non-decimal numbers (#4571) (overlookmotel)
- ab8509e parser: Use `-` not `saturating_sub` (#4561) (overlookmotel)
- c9c38a1 parser: Support peeking over bytes (#4304) (lucab)
- 0870ee1 parser: Get and check lookahead token (#4534) (lucab)
- d00014e sourcemap: Elide bounds checks in VLQ encoding (#4583) (overlookmotel)
- 1fd9dd0 sourcemap: Use simd to escape JSON string (#4487) (Brooooooklyn)

### Documentation

- 0914e47 ast: Add doc comments to literal nodes (#4551) (DonIsaac)
- c6a11be ast: Auto-generate doc comments for AstBuilder methods (#4471) (DonIsaac)

### Refactor

- e68ed62 parser: Convert lexer byte handler for `|` to a single match (#4575) (overlookmotel)
- bba824b parser: Convert `Lexer::read_minus` to a single match (#4574) (overlookmotel)
- ef5418a parser: Convert `Lexer::read_left_angle` to a single match (#4573) (overlookmotel)
- 9e5be78 parser: Add `Lexer::consume_2_chars` (#4569) (overlookmotel)
- 649913e parser: Extract `u8` not `&u8` when iterating over bytes (#4568) (overlookmotel)
- 59f00c0 parser: Rename function (#4566) (overlookmotel)
- 8e3e910 parser: Rename vars (#4565) (overlookmotel)
- 0c0601f parser: Rename function (#4564) (overlookmotel)
- 0acc4a7 parser: Fetch 2 bytes in `?` byte handler (#4563) (overlookmotel)
- 565eccf parser: Shorten lexer code (#4562) (overlookmotel)
- 148bdb5 parser: Adjust function inlining (#4530) (overlookmotel)
- 16c7b98 semantic: Move CatchClause scope binding logic to visit_block_statement (#4505) (Dunqing)
- d6974d4 semantic: `AstNodeParentIter` fetch nodes lazily (#4533) (overlookmotel)
- d914b14 semantic: Reusing the same reference (#4529) (Dunqing)
- 7b5e1f5 semantic: Use `is_empty()` instead of `len() == 0` (#4532) (overlookmotel)
- 9db4259 semantic: Inline trivial methods (#4531) (overlookmotel)
- 7c42ffc sourcemap: Align Base64 chars lookup table to cache line (#4535) (overlookmotel)
- 96602bf transformer/typescript: Determine whether to remove `ExportSpeicifer` by `ReferenceFlags` (#4513) (Dunqing)
- e6a8af6 traverse: Speed up tests (#4538) (overlookmotel)

Co-authored-by: Boshen <[email protected]>