Extend comparison methods to accept different datetime types. #129

Merged
merged 19 commits into from
Dec 19, 2022

@Yury-Fridlyand Yury-Fridlyand commented Oct 6, 2022

Signed-off-by: Yury-Fridlyand [email protected]

Description

Functions affected: =, !=, >, <, <=, >=.
Signatures added:

TIME, DATE               -> BOOLEAN
TIME, DATETIME           -> BOOLEAN
TIME, TIMESTAMP          -> BOOLEAN
DATE, TIME               -> BOOLEAN
DATE, DATETIME           -> BOOLEAN
DATE, TIMESTAMP          -> BOOLEAN
DATETIME, TIME           -> BOOLEAN
DATETIME, DATE           -> BOOLEAN
DATETIME, TIMESTAMP      -> BOOLEAN
TIMESTAMP, TIME          -> BOOLEAN
TIMESTAMP, DATE          -> BOOLEAN
TIMESTAMP, DATETIME      -> BOOLEAN

Test data

mysql> show fields from calcs where field IN ('date0', 'time0', 'time1', 'datetime0');
+-----------+-----------+------+-----+---------+-------+
| Field     | Type      | Null | Key | Default | Extra |
+-----------+-----------+------+-----+---------+-------+
| date0     | date      | YES  |     | NULL    |       |
| time0     | datetime  | YES  |     | NULL    |       |
| time1     | time      | YES  |     | NULL    |       |
| datetime0 | timestamp | YES  |     | NULL    |       |
+-----------+-----------+------+-----+---------+-------+
mysql> select `key`, date0, time0, time1, datetime0 from calcs limit 7;
+-------+------------+---------------------+----------+---------------------+
| key   | date0      | time0               | time1    | datetime0           |
+-------+------------+---------------------+----------+---------------------+
| key00 | 2004-04-15 | 1899-12-30 21:07:32 | 19:36:22 | 2004-07-09 10:17:35 |
| key01 | 1972-07-04 | 1900-01-01 13:48:48 | 02:05:25 | 2004-07-26 12:30:34 |
| key02 | 1975-11-12 | 1900-01-01 18:21:08 | 09:33:31 | 2004-08-02 07:59:23 |
| key03 | 2004-06-04 | 1900-01-01 18:51:48 | 22:50:16 | 2004-07-05 13:14:20 |
| key04 | 2004-06-19 | 1900-01-01 15:01:19 | NULL     | 2004-07-28 23:30:22 |
| key05 | NULL       | 1900-01-01 08:59:39 | 19:57:33 | 2004-07-22 00:30:23 |
| key06 | NULL       | 1900-01-01 07:37:48 | NULL     | 2004-07-28 06:54:50 |
+-------+------------+---------------------+----------+---------------------+

Test queries

MySQL
select `key`, date0, time0, time1, datetime0, date0 < now(), time0 < now(), time1 < now(), datetime0 < now() from calcs limit 6;
select `key`, date0, time0, time1, datetime0, date0 <= now(), time0 <= now(), time1 <= now(), datetime0 <= now() from calcs limit 6;
OpenSearch
SELECT CAST(date0 AS date)                  AS `date`,       CAST(date0 AS date)             < now() AS `date < datetime`,
       datetime(CAST(time0 AS STRING))      AS `datetime`,   datetime(CAST(time0 AS STRING)) < now() AS `datetime < datetime`,
       CAST(time1 AS time)                  AS `time`,       CAST(time1 AS time)             < now() AS `time < datetime`,
       CAST(datetime0 AS timestamp)         AS `timestamp`,  CAST(datetime0 AS timestamp)    < now() AS `timestamp < datetime`
FROM calcs LIMIT 6;
   
SELECT CAST(date0 AS date)                  AS `date`,       CAST(date0 AS date)             <= now() AS `date <= datetime`,
       datetime(CAST(time0 AS STRING))      AS `datetime`,   datetime(CAST(time0 AS STRING)) <= now() AS `datetime <= datetime`,
       CAST(time1 AS time)                  AS `time`,       CAST(time1 AS time)             <= now() AS `time <= datetime`,
       CAST(datetime0 AS timestamp)         AS `timestamp`,  CAST(datetime0 AS timestamp)    <= now() AS `timestamp <= datetime`
FROM calcs LIMIT 6;

Issues Resolved

Fix comparison between different datetime data types. #294

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

codecov bot commented Oct 6, 2022

Codecov Report

Merging #129 (d92fb42) into integ-datetime-comparison (64a3794) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@                       Coverage Diff                       @@
##             integ-datetime-comparison     #129      +/-   ##
===============================================================
+ Coverage                        95.78%   95.80%   +0.01%     
- Complexity                        3503     3538      +35     
===============================================================
  Files                              350      350              
  Lines                             9310     9345      +35     
  Branches                           669      676       +7     
===============================================================
+ Hits                              8918     8953      +35     
  Misses                             334      334              
  Partials                            58       58              
Flag Coverage Δ
query-workbench 62.76% <ø> (ø)
sql-engine 98.30% <100.00%> (+<0.01%) ⬆️

Impacted Files Coverage Δ
...g/opensearch/sql/data/model/AbstractExprValue.java 100.00% <100.00%> (ø)
...a/org/opensearch/sql/data/model/ExprDateValue.java 100.00% <100.00%> (ø)
...g/opensearch/sql/data/model/ExprDatetimeValue.java 100.00% <100.00%> (ø)
...a/org/opensearch/sql/data/model/ExprTimeValue.java 100.00% <100.00%> (ø)
.../opensearch/sql/data/model/ExprTimestampValue.java 100.00% <100.00%> (ø)
.../java/org/opensearch/sql/data/model/ExprValue.java 100.00% <100.00%> (ø)
.../org/opensearch/sql/data/model/ExprValueUtils.java 100.00% <100.00%> (ø)
...c/main/java/org/opensearch/sql/expression/DSL.java 100.00% <100.00%> (ø)
...pensearch/sql/expression/function/FunctionDSL.java 100.00% <100.00%> (ø)
...on/operator/predicate/BinaryPredicateOperator.java 100.00% <100.00%> (ø)
... and 1 more

@MaxKsyunz MaxKsyunz left a comment

BinaryPredicateOperator.less, .greater, .equals, etc differ only in a few places.
You could write one operatorImpl method parametrized by the differences and use it.
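To illustrate the suggestion, here is a minimal sketch of a single parametrized factory. It is not the project's BinaryPredicateOperator: the class name ComparisonOperatorSketch and the helper names are hypothetical, and plain Comparable values stand in for ExprValue. The only thing that varies between operators is how the sign of compareTo() is interpreted.

```java
import java.time.LocalDate;
import java.util.function.BiPredicate;
import java.util.function.IntPredicate;

/** Hypothetical sketch; Comparable values stand in for ExprValue. */
final class ComparisonOperatorSketch {

  /** Builds an operator from the one part that varies: how the compareTo() result is read. */
  static <T extends Comparable<? super T>> BiPredicate<T, T> operatorImpl(IntPredicate accept) {
    return (left, right) -> accept.test(left.compareTo(right));
  }

  static <T extends Comparable<? super T>> BiPredicate<T, T> less() {
    return operatorImpl(c -> c < 0);
  }

  static <T extends Comparable<? super T>> BiPredicate<T, T> lte() {
    return operatorImpl(c -> c <= 0);
  }

  static <T extends Comparable<? super T>> BiPredicate<T, T> greater() {
    return operatorImpl(c -> c > 0);
  }

  static <T extends Comparable<? super T>> BiPredicate<T, T> gte() {
    return operatorImpl(c -> c >= 0);
  }

  static <T extends Comparable<? super T>> BiPredicate<T, T> equal() {
    return operatorImpl(c -> c == 0);
  }

  static <T extends Comparable<? super T>> BiPredicate<T, T> notEqual() {
    return operatorImpl(c -> c != 0);
  }

  public static void main(String[] args) {
    // Dates borrowed from the test queries in this PR.
    LocalDate earlier = LocalDate.parse("1961-04-12");
    LocalDate later = LocalDate.parse("1984-12-15");
    BiPredicate<LocalDate, LocalDate> lessThan = less();
    BiPredicate<LocalDate, LocalDate> notEqualTo = notEqual();
    System.out.println("lt: " + lessThan.test(earlier, later));
    System.out.println("neq: " + notEqualTo.test(earlier, earlier));
  }
}
```

With this shape, adding an operator is one line, and null/missing handling or type promotion could be applied once inside operatorImpl rather than in each operator.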

@Yury-Fridlyand
Author

BinaryPredicateOperator.less, .greater, .equals, etc differ only in a few places. You could write one operatorImpl method parametrized by the differences and use it.

Fixed in d4425e6.

@@ -73,7 +73,7 @@ public boolean equal(ExprValue o) {
   */
  @Override
  public int compare(ExprValue other) {
-    return Integer.compare(valueList.size(), other.collectionValue().size());
+    return equal(other) ? 0 : 1;

@forestmvey forestmvey Oct 13, 2022

Is this identical implementation in ExprTupleValue needed twice? Why not pull the implementation up to AbstractExprValue?

Author

I reverted this in d6f68b9.

@@ -51,6 +56,21 @@ public LocalTime timeValue() {
return time;
}

@Override
public LocalDate dateValue() {
return LocalDate.now();

This should use the curdate date value. If dateValue() is called at the wrong time, it can lead to unexpected behaviour.

For example, consider SELECT CAST(TIME('22:00:00') AS DATETIME) > CAST(TIME('21:00:00') AS DATETIME).
Let's say today's date is December 1, 2022. Then the query will be the same as SELECT DATETIME('2022-12-01 22:00:00') > DATETIME('2022-12-01 21:00:00') and would evaluate to true.

However, if it's executed close enough to midnight, it's possible to get December 1st as the current date for the first cast, and December 2nd for the second. Then the query will be the same as SELECT DATETIME('2022-12-01 22:00:00') > DATETIME('2022-12-02 21:00:00') and suddenly evaluate to false.
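The race described above can be reproduced deterministically with java.time.Clock. The following is a minimal sketch under stated assumptions: the class MidnightRaceSketch and its methods are hypothetical, and two fixed clocks simulate two LocalDate.now() calls that straddle midnight. It only demonstrates why both operands should read the date from the same source.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

/** Hypothetical sketch of the midnight race; not project code. */
final class MidnightRaceSketch {

  /** Promotes a TIME to a DATETIME using the date read from the given clock. */
  static LocalDateTime toDatetime(LocalTime time, Clock clock) {
    return time.atDate(LocalDate.now(clock));
  }

  /** Evaluates left > right, letting each side read "today" from its own clock. */
  static boolean greater(LocalTime left, Clock leftClock, LocalTime right, Clock rightClock) {
    return toDatetime(left, leftClock).isAfter(toDatetime(right, rightClock));
  }

  public static void main(String[] args) {
    LocalTime left = LocalTime.of(22, 0);
    LocalTime right = LocalTime.of(21, 0);

    // Two reads of the current date, a few milliseconds apart, on either side of midnight.
    Clock beforeMidnight =
        Clock.fixed(Instant.parse("2022-12-01T23:59:59.999Z"), ZoneOffset.UTC);
    Clock afterMidnight =
        Clock.fixed(Instant.parse("2022-12-02T00:00:00.001Z"), ZoneOffset.UTC);

    // Each operand reads the date separately: 2022-12-01 22:00 vs 2022-12-02 21:00.
    System.out.println("separate reads: " + greater(left, beforeMidnight, right, afterMidnight));
    // Both operands share one reading of the date: 22:00 vs 21:00 on the same day.
    System.out.println("shared read:    " + greater(left, beforeMidnight, right, beforeMidnight));
  }
}
```

Passing one clock (or one precomputed date) for the whole query evaluation makes the result independent of when each operand happens to be evaluated.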

Author

As we discussed, that will be part of another feature.

+ "`neq3` = DATE('1961-04-12') != DATE('1961-04-12'), "
+ "`gt1` = DATE('1984-12-15') > DATE('1961-04-12'), "
+ "`gt2` = DATE('1984-12-15') > DATE('2020-09-16'), "
+ "`lt1` = DATE('1961-04-12') < DATE('1984-12-15'), "

It would help reviews a lot if each test case was a separate query.

As it is written, to check the expected results for, say, lt1, I have to count which field it is in the response (the 8th), and then count to the 8th parameter of rows on line 72. If it were one query -- `source =%s | eval eq1 = DATE('1961-04-12') < DATE('1984-12-15') | fields eq1` -- it would be obvious.

Tests like these are prime candidates for parameterization.
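As a rough illustration of that idea, the sketch below keeps each case as one named row (inputs, operator, expected result) so a failure points at a single case. It uses only the JDK; the class DateComparisonCases is hypothetical, and the expected values are derived from ordinary date ordering rather than taken from the PR's test file. In the actual test suite, a parameterized-test facility such as JUnit 5's @ParameterizedTest would be the more natural vehicle.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical table-driven sketch; one row per comparison case. */
final class DateComparisonCases {

  /** A single named case: operator, operands, and the expected outcome. */
  static final class Case {
    final String name;
    final LocalDate left;
    final LocalDate right;
    final BiPredicate<LocalDate, LocalDate> op;
    final boolean expected;

    Case(String name, LocalDate left, LocalDate right,
         BiPredicate<LocalDate, LocalDate> op, boolean expected) {
      this.name = name;
      this.left = left;
      this.right = right;
      this.op = op;
      this.expected = expected;
    }

    boolean passes() {
      return op.test(left, right) == expected;
    }
  }

  /** Case names and dates mirror the eval expressions quoted in the diff. */
  static List<Case> cases() {
    LocalDate d1961 = LocalDate.parse("1961-04-12");
    LocalDate d1984 = LocalDate.parse("1984-12-15");
    LocalDate d2020 = LocalDate.parse("2020-09-16");
    return List.of(
        new Case("neq3", d1961, d1961, (a, b) -> !a.equals(b), false),
        new Case("gt1", d1984, d1961, LocalDate::isAfter, true),
        new Case("gt2", d1984, d2020, LocalDate::isAfter, false),
        new Case("lt1", d1961, d1984, LocalDate::isBefore, true));
  }

  public static void main(String[] args) {
    for (Case c : cases()) {
      System.out.println(c.name + ": " + (c.passes() ? "ok" : "FAILED"));
    }
  }
}
```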

Author

Fixed in b6371b6.

Comment on lines 42 to 47
v1.type() == TIME ? v1.timeValue().atDate(LocalDate.now()) : v1.datetimeValue(),
v2.type() == TIME ? v2.timeValue().atDate(LocalDate.now()) : v2.datetimeValue());
} else if (v1.type() != v2.type()) {

Using LocalDate.now() can lead to unexpected results when executed close to midnight. It should use same value as curdate.

Author

As we discussed, that will be part of another feature.

@@ -31,6 +32,9 @@
},
"nested_value": {
"type": "nested"
},
"geo_point_value": {

What does geo_point have to do with type comparison?

Author

I added geo_point to have all supported types listed. No tests for this type yet.

@Yury-Fridlyand
Author

To fix:

  1. Update AbstractExprValue::compareTo to support comparison of different datetime types.
  2. Move comparison of STRUCTs and ARRAYs (ExprTupleValues and ExprCollectionValues) to another PR.

@Yury-Fridlyand
Author

To fix:

  1. Update AbstractExprValue::compareTo to support comparison of different datetime types.
  2. Move comparison of STRUCTs and ARRAYs (ExprTupleValues and ExprCollectionValues) to another PR.
  1. d6f68b9
  2. ab60013

@acarbonetto acarbonetto changed the title Extend comparison methods to accept different datetime types. [BLOCKED] Extend comparison methods to accept different datetime types. Oct 25, 2022
Signed-off-by: Yury-Fridlyand <[email protected]>
@Yury-Fridlyand Yury-Fridlyand force-pushed the dev-datetime-comparison branch from e25b93f to d92fb42 Compare December 10, 2022 01:08
@Yury-Fridlyand Yury-Fridlyand changed the title [BLOCKED] Extend comparison methods to accept different datetime types. Extend comparison methods to accept different datetime types. Dec 10, 2022
@Yury-Fridlyand
Author

Rebased and updated. The most recent changes include proper use of FunctionProperties.
Auxiliary changes:

  • Added implWithProperties and nullMissingHandlingWithProperties for functions with 2 args;
  • Added UTs for them;
  • Added compare(fp, v1, v2) function to ComparisonUtil to compare different datetime types. compare(v1, v2) is reverted;
  • Added extractDateTime to DateTimeUtils to extract datetime from all datetime types using FunctionProperties;
  • Updated DSL signatures;
  • Updated comparison implementation and tests.
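To make the idea behind extractDateTime and compare(fp, v1, v2) concrete, here is a minimal sketch using plain java.time types. Everything in it is an assumption for illustration: the class ExtractDateTimeSketch is hypothetical, a java.time.Clock stands in for FunctionProperties, a DATE is assumed to map to the start of that day, and a TIMESTAMP is assumed to be read in UTC. The project's actual conversion rules may differ.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

/** Hypothetical sketch; a Clock stands in for FunctionProperties. */
final class ExtractDateTimeSketch {

  /** Normalizes a TIME, DATE, DATETIME, or TIMESTAMP stand-in to a LocalDateTime. */
  static LocalDateTime extractDateTime(Object value, Clock queryClock) {
    if (value instanceof LocalDateTime) {
      return (LocalDateTime) value; // DATETIME: already in the target form
    }
    if (value instanceof LocalDate) {
      return ((LocalDate) value).atStartOfDay(); // DATE: assumed to mean 00:00:00 that day
    }
    if (value instanceof LocalTime) {
      // TIME: needs a date, taken from the shared query clock rather than the wall clock
      return ((LocalTime) value).atDate(LocalDate.now(queryClock));
    }
    if (value instanceof Instant) {
      return LocalDateTime.ofInstant((Instant) value, ZoneOffset.UTC); // TIMESTAMP: assumed UTC
    }
    throw new IllegalArgumentException("Not a datetime-like value: " + value);
  }

  /** Compares two values of possibly different datetime types after normalization. */
  static int compare(Object left, Object right, Clock queryClock) {
    return extractDateTime(left, queryClock).compareTo(extractDateTime(right, queryClock));
  }

  public static void main(String[] args) {
    Clock queryClock = Clock.fixed(Instant.parse("2022-12-01T10:00:00Z"), ZoneOffset.UTC);
    // TIME vs DATE: 22:00 on the clock's date against midnight of the same date.
    System.out.println(compare(LocalTime.of(22, 0), LocalDate.of(2022, 12, 1), queryClock));
    // TIMESTAMP vs TIME, using values shaped like the calcs test data.
    System.out.println(
        compare(Instant.parse("2004-07-09T10:17:35Z"), LocalTime.of(19, 36, 22), queryClock));
  }
}
```

Funnelling every operand through one normalization step is what lets the twelve mixed-type signatures share a single comparison path.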

  }
- if ((this.isNumber() && other.isNumber()) || this.type() == other.type()) {
+ if ((this.isNumber() && other.isNumber())

Would it be better to include an isComparable(AbstractExprValue) function here, overridden by child classes to return true for valid cases or when the types are the same?

Author

If we add a new type, we have to update that function override in all or some of the other types to reflect the change. It is not a scalable solution.

implWithProperties(
SerializableTriFunction<FunctionProperties, ExprValue, ExprValue, ExprValue> function,
ExprType returnType,
ExprType args1Type,

It would be better to use a List<ExprType> argTypes instead of having to create a separate signature for each permutation.

Author

That could be a separate feature. In that case we would have to change all impl/implWithProperties and hundreds of their usages.
You are welcome to open a feature request!

return (functionProperties, v1, v2) -> {
if (v1.isMissing() || v2.isMissing()) {
return ExprValueUtils.missingValue();
} else if (v1.isNull() || v2.isNull()) {

The else is unnecessary here.

Author

Just copy-pasted from

public static SerializableBiFunction<ExprValue, ExprValue, ExprValue> nullMissingHandling(
    SerializableBiFunction<ExprValue, ExprValue, ExprValue> function) {
  return (v1, v2) -> {
    if (v1.isMissing() || v2.isMissing()) {
      return ExprValueUtils.missingValue();
    } else if (v1.isNull() || v2.isNull()) {
      return ExprValueUtils.nullValue();
    } else {
      return function.apply(v1, v2);
    }
  };
}

Author

I think we can groom all of them in the scope of another task.

case DOUBLE: return getDoubleValue(v1).compareTo(getDoubleValue(v2));
case STRING: return getStringValue(v1).compareTo(getStringValue(v2));
case BOOLEAN: return v1.booleanValue().compareTo(v2.booleanValue());
case TIME: return v1.timeValue().compareTo(v2.timeValue());

For time, shouldn't we potentially use datetimeValue() if the other type is DATETIME, or timestampValue() if the other type is TIMESTAMP?
The same applies to date.

Author

  1. This function receives v1 and v2 values of the same type, so we can compare time to time without the overhead of producing and comparing timestamps.
  2. We can't get a timestamp/datetime from a time without the dark magic of FunctionProperties.

@Yury-Fridlyand Yury-Fridlyand merged commit e3b4e51 into integ-datetime-comparison Dec 19, 2022
@Yury-Fridlyand Yury-Fridlyand deleted the dev-datetime-comparison branch December 19, 2022 21:06
Yury-Fridlyand added a commit that referenced this pull request Dec 19, 2022
* Extend comparison methods to accept different datetime types.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Yury-Fridlyand added a commit that referenced this pull request Jan 13, 2023
* Extend comparison methods to accept different datetime types.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Yury-Fridlyand added a commit that referenced this pull request Jan 26, 2023
…pensearch-project#1196)

* Extend comparison methods to accept different datetime types. (#129)

* Extend comparison methods to accept different datetime types.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>

* Typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Rework fix according to @dai-chen's comment.
opensearch-project#1196 (review)

* Revert `BinaryPredicateOperator`.
* Add automatic cast rules `DATE`/`TIME`/`DATETIME` -> `DATETIME`/`TIMESTAMP`.
* Update unit tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add doctest sample.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Docs typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor comments update.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Doctest typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Doctest typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Doctest typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Modify `ExprCoreType` dependencies.

Signed-off-by: Yury-Fridlyand <[email protected]>

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
matthewryanwells pushed a commit that referenced this pull request Feb 1, 2023
…pensearch-project#1196) (opensearch-project#1294)

* Extend comparison methods to accept different datetime types. (#129)

* Extend comparison methods to accept different datetime types.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>

* Typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Rework fix according to @dai-chen's comment.
opensearch-project#1196 (review)

* Revert `BinaryPredicateOperator`.
* Add automatic cast rules `DATE`/`TIME`/`DATETIME` -> `DATETIME`/`TIMESTAMP`.
* Update unit tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add doctest sample.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Docs typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor comments update.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Doctest typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Doctest typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Doctest typo fix.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Modify `ExprCoreType` dependencies.

Signed-off-by: Yury-Fridlyand <[email protected]>

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
(cherry picked from commit a4f8066)

Co-authored-by: Yury-Fridlyand <[email protected]>