-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move postings back to int[] to take advantage of having more lanes per vector. #13968
Conversation
In Lucene 8.4, we updated postings to work on long[] arrays internally. This allowed us to workaround the lack of explicit vectorization (auto-vectorization doesn't detect all the scenarios that we would like to handle) support in the JVM by summing up two integers in one operation for instance. With explicit vectorization now available, it looks like we can get more benefits from the ability to compare multiple intetgers in one operations than from summing up two integers in one operation. Moving back to ints helps compare 2x more integers at once vs. longs.
Here is a
The The The simplicity and memory efficiency of working with int[] instead of long[], coupled with the speedup to advancing make this change a good trade-off in my opinion. |
…ctorizationProvider.
I plan on merging tomorrow, so that we have two data points with longs on nightly benchmarks before seeing how it performs with ints. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good. Tried to look through the actual changes (and not all the codec wiring stuff).
* cIndex}. | ||
* </ul> | ||
*/ | ||
public void splitInts( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we drop #splitLongs
? (Also, should we add @lucene.internal
to this class so we're free to drop public methods?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching, I had meant to do it but missed some bits obviously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW this class may only be used from a very small set of explicitly named classes, see org.apache.lucene.internal.vectorization.VectorizationProvider#VALID_CALLERS
, so there is no risk that users use this API.
Nightly benchmarks confirmed a speedup from the combination of #13958 (SIMD for advancing within a block) and this PR. (CountAndHighHigh, CountAndHighMed) Exhaustive evaluation of disjunctive queries looks stable or barely slower (CountOrHighHigh, CountOrHighMed) while we could have expected a slowdown from the loss of some optimizations in block decoding. |
…r vector. (#13968) In Lucene 8.4, we updated postings to work on long[] arrays internally. This allowed us to workaround the lack of explicit vectorization (auto-vectorization doesn't detect all the scenarios that we would like to handle) support in the JVM by summing up two integers in one operation for instance. With explicit vectorization now available, it looks like we can get more benefits from the ability to compare multiple intetgers in one operations than from summing up two integers in one operation. Moving back to ints helps compare 2x more integers at once vs. longs.
…r vector. (apache#13968) In Lucene 8.4, we updated postings to work on long[] arrays internally. This allowed us to workaround the lack of explicit vectorization (auto-vectorization doesn't detect all the scenarios that we would like to handle) support in the JVM by summing up two integers in one operation for instance. With explicit vectorization now available, it looks like we can get more benefits from the ability to compare multiple intetgers in one operations than from summing up two integers in one operation. Moving back to ints helps compare 2x more integers at once vs. longs.
In Lucene 8.4, we updated postings to work on long[] arrays internally. This allowed us to workaround the lack of explicit vectorization (auto-vectorization doesn't detect all the scenarios that we would like to handle) support in the JVM by summing up two integers in one operation for instance.
With explicit vectorization now available, it looks like we can get more benefits from the ability to compare multiple intetgers in one operations than from summing up two integers in one operation. Moving back to ints helps compare 2x more integers at once vs. longs.
The diff is large because of the codec dance:
Lucene912PostingsFormat
andLucene100Codec
moved tolucene/backward-codecs
and a newLucene101PostingsFormat
is a copy of the previousLucene912PostingsFormat
with a move from long[] arrays to int[] arrays, and changes to the on-disk format for blocks of packed integers.Note that
DataInput#readGroupVInt
andVectorUtilSupport#findNextGEQ
have been cleaned up to only supportint[]
and no longerlong[]
.