-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-10008: Respect ignoreCase flag in CommonGramsFilterFactory #2573
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @vigyasharma -- sorry I took so long to review -- this nearly fell past the event-horizon of my TODO list!
I left one small comment about moving the CHANGES entry to the right place, otherwise I think this is good to go for an eventual 8.11 release.
lucene/CHANGES.txt
Outdated
@@ -648,6 +648,8 @@ Optimizations | |||
|
|||
Bug Fixes | |||
--------------------- | |||
* LUCENE-10008: Respect ignoreCase in CommonGramsFilterFactory (Vigya Sharma) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa, this is in the wrong (Lucene 8.6.0) section of CHANGES!
Since the 8.10 bits have already been set free, can you move this to the new 8.11 section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, my bad. Thanks for checking this Mike. Moved this line to 8.11 section.
3726571
to
fa43ff5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @vigyasharma -- I'll push soon!
Backporting a bug fix pushed into
apache/lucene:main
as part of this PR - apache/lucene#188Description
CommonGramsFilterFactory
should respect the ignoreCase flag passed in args even when the default stop word set is used. It currently ignores the flag ifcommonWordFiles
are not specified.Solution
Ensure the flag is respected in even when default stop word set is used.
Tests
Added test to ensure that bigrams get constructed with common words that are not in lower case, when
ignoreCase
is passed astrue
to theCommonGramsFilterFactory
.