-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vectorized decoding in parquet reader #21465
Conversation
67574f7
to
3304b44
Compare
core/docker/bin/run-trino
Outdated
@@ -7,4 +7,6 @@ if ! grep -s -q 'node.id' /etc/trino/node.properties; then | |||
launcher_opts+=("-Dnode.id=${HOSTNAME}") | |||
fi | |||
|
|||
launcher_opts+=("-J--add-modules=jdk.incubator.vector") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add this to jvm.config instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but it's easy to miss and it'll fail queries at runtime only when parquet is used while the server will start-up without complaining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with the other recommended/required JVM settings. We can detect whether when enabled, the module is linked. Please see verifyPackageAccessAllowed
in BigQueryConnectorModule
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That approach would also work, but it feels a bit roundabout to throw error about a required JVM property and make the user fill it in when we can easily just include it ourselves. But I don't mind going that way depending on what the consensus is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that modifying a launcher is a proper way of adding required JVM flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of automatically adding required JVM flags, but this should be done in the Airlift launcher, so that they're present no matter how Trino is run. The change here only affects Docker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created airlift/airlift#1144
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored this PR so that the Vector module is no longer required. We will automatically fall back to existing code paths if Vector module isn't found. So now it's just a matter of finding a convenient way to add --add-modules=jdk.incubator.vector
. We could just add it to all the JVM configs in the codebase and add it to our JVM configs documentation and leave it at that. The other option is to add it to core/docker/bin/run-trino
which will cover our main mode of deployment.
...ng/trino-product-tests-launcher/src/main/resources/docker/presto-product-tests/run-presto.sh
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/VectorIntBitUnpackers.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of polluting every decoder with a "vectorizedEnabled" flag, create a separate VectorizedXXXDecoder and instantiate them as necessary if the flag is enabled at the top level.
lib/trino-parquet/src/test/java/io/trino/parquet/reader/flat/TestNullsDecoder.java
Outdated
Show resolved
Hide resolved
f7b1ab8
to
e38f2d6
Compare
1206912
to
73baf14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The high-level structure looks good. I'm reviewing the implementation methods now.
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/PlainValueDecoders.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/flat/NullsDecoder.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/test/java/io/trino/parquet/reader/flat/TestBitPackingUtils.java
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/VectorIntBitUnpackers.java
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/VectorIntBitUnpackers.java
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/VectorIntBitUnpackers.java
Outdated
Show resolved
Hide resolved
d8fb208
to
aeb2038
Compare
...java/io/trino/parquet/reader/decoders/VectorShortDecimalFixedWidthByteArrayBatchDecoder.java
Outdated
Show resolved
Hide resolved
...java/io/trino/parquet/reader/decoders/VectorShortDecimalFixedWidthByteArrayBatchDecoder.java
Outdated
Show resolved
Hide resolved
...java/io/trino/parquet/reader/decoders/VectorShortDecimalFixedWidthByteArrayBatchDecoder.java
Outdated
Show resolved
Hide resolved
BenchmarkShortDecimalColumnReader.read (byteArrayLength) (encoding) Mode Cnt Before Score After Score Units 2 PLAIN thrpt 20 2890.335 ? 27.542 3185.799 ? 47.105 ops/s 3 PLAIN thrpt 20 2484.251 ? 54.313 2794.937 ? 39.779 ops/s 4 PLAIN thrpt 20 897.936 ? 7.806 2269.610 ? 24.969 ops/s 5 PLAIN thrpt 20 1940.120 ? 19.592 1917.614 ? 25.896 ops/s 6 PLAIN thrpt 20 2135.827 ? 32.586 2132.197 ? 20.372 ops/s 7 PLAIN thrpt 20 1510.185 ? 28.064 1509.423 ? 14.175 ops/s 8 PLAIN thrpt 20 668.761 ? 7.721 2716.746 ? 36.779 ops/s
BenchmarkBooleanColumnReader (encoding) Mode Cnt Before score After score Units PLAIN thrpt 10 4043.490 ? 92.281 5309.229 ? 15.559 ops/s RLE thrpt 10 1678.787 ? 14.702 2041.165 ? 12.379 ops/s
ca033aa
to
9b27c10
Compare
m7g.2xlarge EC2 machine with Temurin JDK 22 BenchmarkRleBitPackingDecoder (bitWidth) (dataSet) Mode Cnt Scalar score Vector score Units 1 RANDOM thrpt 10 650.368 ± 0.148 898.939 ± 0.345 ops/s 2 RANDOM thrpt 10 665.379 ± 0.162 1017.777 ± 0.130 ops/s 3 RANDOM thrpt 10 570.223 ± 0.275 765.640 ± 0.256 ops/s 4 RANDOM thrpt 10 620.299 ± 0.361 1073.897 ± 0.584 ops/s 5 RANDOM thrpt 10 560.624 ± 0.148 761.582 ± 0.085 ops/s 6 RANDOM thrpt 10 449.945 ± 0.201 741.803 ± 0.321 ops/s 7 RANDOM thrpt 10 516.331 ± 0.106 736.790 ± 0.535 ops/s 8 RANDOM thrpt 10 620.409 ± 0.907 1087.050 ± 0.340 ops/s 9 RANDOM thrpt 10 466.077 ± 0.085 824.186 ± 1.486 ops/s 10 RANDOM thrpt 10 462.300 ± 0.104 817.194 ± 1.144 ops/s 11 RANDOM thrpt 10 359.207 ± 0.046 810.836 ± 0.714 ops/s 12 RANDOM thrpt 10 468.968 ± 0.092 813.104 ± 0.238 ops/s 13 RANDOM thrpt 10 347.213 ± 0.089 818.207 ± 1.691 ops/s 14 RANDOM thrpt 10 346.392 ± 0.144 807.127 ± 1.213 ops/s 15 RANDOM thrpt 10 240.605 ± 0.217 801.025 ± 2.004 ops/s 16 RANDOM thrpt 10 506.141 ± 0.192 1149.157 ± 0.561 ops/s 17 RANDOM thrpt 10 369.903 ± 0.121 480.297 ± 0.588 ops/s 18 RANDOM thrpt 10 367.348 ± 0.085 461.633 ± 0.939 ops/s 19 RANDOM thrpt 10 250.100 ± 0.043 468.750 ± 1.367 ops/s 20 RANDOM thrpt 10 364.844 ± 0.146 461.019 ± 1.270 ops/s BenchmarkFlatDefinitionLevelDecoder (dataGenerator) (size) (vectorizedDecodingEnabled) Mode Cnt Score Error Units RANDOM 100000 false thrpt 10 21.903 ± 0.014 ops/ms RANDOM 100000 true thrpt 10 32.216 ± 0.025 ops/ms m5.2xlarge EC2 machine with Temurin JDK 22 BenchmarkBooleanColumnReader (encoding) Mode Cnt Before Score After Score Units PLAIN thrpt 20 1420.173 ± 4.133 2710.464 ± 8.410 ops/s RLE thrpt 20 689.606 ± 2.140 738.302 ± 2.089 ops/s
Description
Uses the new incubating Java Vector API to improve decoding performance
for some encodings in parquet reader.
Vectorized decoding is used only when the preferred vector bit size for the
current platform is at least 256 bits (enabled on x86 and Graviton 3 machines but not on Graviton 2 machines).
This can be disabled using
parquet.experimental.vectorized-decoding.enabled
configuration flag.Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: