-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support match_phrase filter function in SQL and PPL #604
Support match_phrase filter function in SQL and PPL #604
Conversation
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
…ents. Signed-off-by: Yury Fridlyand <[email protected]>
Added a couple test for match_test in PPL. The tests are currently @ignore'd until more of the issue is complete. Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Create a parametrized test for PPLSyntax parser. Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
…for `MATCH` are also updated. Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
1. Use PPL instead of SQL as samples. 2. Use data that doctest runs with. Signed-off-by: MaxKsyunz <[email protected]>
… samples. This fixes doctest. Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #604 +/- ##
=============================================
- Coverage 94.59% 62.76% -31.83%
=============================================
Files 276 10 -266
Lines 7457 658 -6799
Branches 552 119 -433
=============================================
- Hits 7054 413 -6641
+ Misses 349 192 -157
+ Partials 54 53 -1
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
Add support for matchphrase as an alternative spelling of match_phrase. Mention in both SQL and PPL documentation that matchphrase is accepted as well as match_phrase. Signed-off-by: MaxKsyunz <[email protected]>
…update-v2 Support legacy syntax for match_phrase in the new SQL engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, minor comments
@@ -2195,3 +2195,43 @@ Another example to show how to set custom values for the optional parameters:: | |||
| Bond | | |||
+------------+ | |||
|
|||
MATCH_PHRASE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what should be done but should we distinguish between this one and https://github.com/opensearch-project/sql/blob/main/docs/user/beyond/fulltext.rst#match-phrase-query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that match_phrase
from the legacy SQL engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe so, i'm thinking maybe add a note to the old one, or link it after "For backward compatibility" in the new one, so users know what the difference is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshuali925 that's a good point.
Distinction between old and new SQL engines was explicitly removed as part of #58
I'd like to add a note to fulltext.rst
that the document describes the legacy SQL engine and to refer to functions.rst
for the new engine.
Once the new engine is on par with the old one, we can update fulltext.rst
to describe the default engine.
What do you think?
@@ -0,0 +1,40 @@ | |||
package org.opensearch.sql.ppl.antlr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing license header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 1fae665
@@ -0,0 +1,112 @@ | |||
package org.opensearch.sql.opensearch.storage.script.filter.lucene; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 1fae665
public QueryBuilder build(FunctionExpression func) { | ||
List<Expression> arguments = func.getArguments(); | ||
if (arguments.size() < 2) { | ||
throw new SemanticCheckException("match_phrase requires at least two parameters"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ANTLR will check the syntax, right? If you want to assert that, it should be syntax exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see 7101d07
/** | ||
* Lucene query that builds a match_phrase query. | ||
*/ | ||
public class MatchPhraseQuery extends LuceneQuery { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like MatchPhraseQuery and MatchQuery have duplicate code, could it been simplified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see RelevanceQuery
added in d3667ce
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Also, use getWriteableName() to get the name of the query in the error message. Signed-off-by: MaxKsyunz <[email protected]>
This dummy instance of MatchPhraseQueryBuilder is used in RelevanceQuery when creating an exception to get the name of the query. Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
This branch changed RelevanceQuery.build to throw SyntaxCheckException if it is called with less than two arguments. Previously, build would through SemanticCheckException. Signed-off-by: MaxKsyunz <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
List<Expression> arguments = func.getArguments(); | ||
if (arguments.size() < 2) { | ||
String queryName = createQueryBuilder("dummy_field", "").getWriteableName(); | ||
throw new SyntaxCheckException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a UT to assert the exception message is expected?
/** | ||
* Base class for query abstraction that builds a relevance query from function expression. | ||
*/ | ||
public abstract class RelevanceQuery<T extends QueryBuilder> extends LuceneQuery { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it!
Signed-off-by: MaxKsyunz <[email protected]>
assertThrows(SemanticCheckException.class, () -> query.build(expr)); | ||
SemanticCheckException exception = | ||
assertThrows(SemanticCheckException.class, () -> query.build(expr)); | ||
assertEquals("Parameter wrongArg is invalid for mock_query function.", exception.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you want to tie the test that closely to the code (so that if someone changes a single character in the string, the test fails).
Maybe you can check that the messages contains "wrongArg" and "mock_query" and leave it at that?
assertThat(exception.getMessages(), containsString("wrongArg"));
assertThat(exception.getMessages(), containsString("mock_query"));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@acarbonetto that's in line with pre-existing convention in this codebase.
assertThrows(SyntaxCheckException.class, () -> query.build(createCall(arguments))); | ||
SyntaxCheckException exception = assertThrows(SyntaxCheckException.class, | ||
() -> query.build(createCall(arguments))); | ||
assertEquals("mock_query requires at least two parameters", exception.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, merging in
See opensearch-project#604 (comment) for reference Signed-off-by: MaxKsyunz <[email protected]>
See opensearch-project#604 (comment) for reference Signed-off-by: MaxKsyunz <[email protected]>
See opensearch-project#604 (comment) for reference Signed-off-by: MaxKsyunz <[email protected]>
- Update QueryStringTest to check for SyntaxCheckException. SyntaxCheckException is correct when incorrect # of parameters See opensearch-project#604 (comment) for reference. - Introduce MultiFieldQuery and SingleFieldQuery base classes. - Extract FunctionResolver interface. FunctionResolver is now DefaultFunctionResolver. RelevanceFunctionResolver is a simplified function resolver for relevance search functions. - Removed tests from FilterQueryBuilderTest that verified exceptions thrown for invalid function calls. These scenarios are now handled by RelevanceQuery::build. Signed-off-by: MaxKsyunz <[email protected]>
- Update QueryStringTest to check for SyntaxCheckException. SyntaxCheckException is correct when incorrect # of parameters See #604 (comment) for reference. - Introduce MultiFieldQuery and SingleFieldQuery base classes. - Extract FunctionResolver interface. FunctionResolver is now DefaultFunctionResolver. RelevanceFunctionResolver is a simplified function resolver for relevance search functions. - Removed tests from FilterQueryBuilderTest that verified exceptions thrown for invalid function calls. These scenarios are now handled by RelevanceQuery::build. Signed-off-by: MaxKsyunz <[email protected]> Signed-off-by: MaxKsyunz <[email protected]>
Description
match_phrase
filter function in SQL.match_phrase
filter function in PPL.matchphrase
filter function in SQL to maintain backwards compatibility with the legacy engine.Issues Resolved
Resolves #185.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.