-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bump Elasticsearch codec to track Lucene101Codec #116318
Bump Elasticsearch codec to track Lucene101Codec #116318
Conversation
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
fa068a6
to
5ca891d
Compare
import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat; | ||
import org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat; | ||
import org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat; | ||
import org.apache.lucene.codecs.perfield.PerFieldPostingsFormat; | ||
import org.elasticsearch.index.codec.zstd.Zstd814StoredFieldsFormat; | ||
|
||
/** | ||
* Elasticsearch codec as of 9.0. This extends the Lucene 10.0 codec to compressed stored fields with ZSTD instead of LZ4/DEFLATE. See | ||
* {@link Zstd814StoredFieldsFormat}. | ||
* Elasticsearch codec as of 9.0-snapshot. This extends the Lucene 10.0 codec to compressed stored fields with ZSTD instead of LZ4/DEFLATE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"-snapshot" - I initially stumbled over this comment. I kinda don't know how better to say it, but it's basically a pre-GA release of ES 9.0. There could be more than one, but as of now there is just one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great catch Chris, this was more of a placeholder to remember to bring this up because javadocs are no longer entirely accurate. I will try to clarify it further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
public Elasticsearch900Lucene101Codec(Zstd814StoredFieldsFormat.Mode mode) { | ||
super("Elasticsearch900Lucene101", new Lucene101Codec()); | ||
this.storedFieldsFormat = mode.getFormat(); | ||
this.defaultPostingsFormat = new Lucene101PostingsFormat(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would make maintetance easier if we retrieved it from the wrapper Lucene101Codec
instead of instantiating it manually here. Then we'd only need to update the delegate codec when Lucene bumps the codec, and not the postings/doc-values/knn-vectors formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I wonder why we have not done that before. That can be applied to doc values format and knn vectors format too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think it's just a bit annoying because you don't want to use getPostingsFormat()
but getPostingsFormatForField("some_ignored_field_name")
.
We moved to Completion101 with #116318, but that has been overwritten by
See apache/lucene#13968