Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing math power function for SQL #2324

Merged
merged 19 commits into from
Apr 28, 2022
Merged

Conversation

comphead
Copy link
Contributor

Which issue does this PR close?

Closes #1493

Rationale for this change

Implementing power function for SQL

What changes are included in this PR?

Implementing power function for SQL

Are there any user-facing changes?

User can use math POWER function in SQL statements

@github-actions github-actions bot added datafusion Changes in the datafusion crate development-process Related to development process of DataFusion labels Apr 23, 2022
@comphead
Copy link
Contributor Author

Implement math POWER function with signatures like https://www.postgresql.org/docs/14/functions-math.html

Example usage

    "SELECT power(i32, 3) as power_i32,
             power(i64, 3) as power_i64,
             power(f32, 3) as power_f32,
             power(f64, 3) as power_f64,
             power(2, 3) as power_int_scalar,
             power(2.5, 3) as power_float_scalar
      FROM test";

Output

    "+-----------+-----------+-----------+-----------+------------------+--------------------+",
    "| power_i32 | power_i64 | power_f32 | power_f64 | power_int_scalar | power_float_scalar |",
    "+-----------+-----------+-----------+-----------+------------------+--------------------+",
    "| 8         | 8         | 1         | 1         | 8                | 15.625             |",
    "| 125       | 125       | 15.625    | 15.625    | 8                | 15.625             |",
    "| 0         | 0         | 0         | 0         | 8                | 15.625             |",
    "| -2744     | -2744     | -3048.625 | -3048.625 | 8                | 15.625             |",
    "|           |           |           |           | 8                | 15.625             |",
    "+-----------+-----------+-----------+-----------+------------------+--------------------+",

@comphead comphead marked this pull request as ready for review April 23, 2022 14:44
@xudong963 xudong963 added the enhancement New feature or request label Apr 23, 2022
@comphead comphead requested a review from xudong963 April 23, 2022 19:39
@@ -217,6 +217,11 @@ pub fn return_type(
}
}),

BuiltinScalarFunction::Power => match &input_expr_types[0] {
DataType::Int32 | DataType::Int64 => Ok(DataType::Int64),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about Int8,Int16, UInt....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now its changed to be like in https://www.postgresql.org/docs/14/functions-math.html

So the signature is either i64 or f64.
If user column is another col type with less size like i32, f32, i16, etc it will work, however the return type is i64 or f64.
Let me know if this is ok

@comphead
Copy link
Contributor Author

Hi @Ted-Jiang @xudong963, can you please review again, I'm looking forward to commence to other datafusion tasks

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @comphead -- looking good 👍

vec![
TypeSignature::Exact(vec![DataType::Int64, DataType::Int64]),
TypeSignature::Exact(vec![DataType::Float64, DataType::Float64]),
],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that postgres supports dp (aka double precision, aka Float64) and Numeric

https://www.postgresql.org/docs/12/functions-math.html

I think for the initial implementation it would be fine to support Int64 as well, but longer term we should add coercion rules to convert arguments to Float64 before they are actually passed to power

arg1.iter()
.zip(arg2.iter())
.map(|(a1, a2)| match (a1, a2) {
(Some(a1), Some(a2)) => Some($FUNC(a1, a2.try_into().unwrap())),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if .unwrap() is needed here -- it will cause a panic if the argument can't be converted to whatever $FUNC wants

Does it work if you do

Suggested change
(Some(a1), Some(a2)) => Some($FUNC(a1, a2.try_into().unwrap())),
(Some(a1), Some(a2)) => Some($FUNC(a1, a2.try_into()?)),

Instead?

Copy link
Contributor Author

@comphead comphead Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came up with (Some(a1), Some(a2)) => Some($FUNC(a1, a2.try_into().ok()?)),

let ctx = SessionContext::new();
ctx.register_table("test", Arc::new(table))?;
let sql = r"SELECT power(i32, 3) as power_i32,
power(i64, 3) as power_i64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test that takes two columns in addition to a column and a scalar would also be good -- like power(i64, i64) perhaps

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@comphead comphead requested a review from alamb April 27, 2022 09:00
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the erroneously included test, I think this is looking good. Thank you @comphead . Nice first contribution

@@ -445,3 +445,169 @@ async fn case_builtin_math_expression() {
assert_batches_sorted_eq!(expected, &results);
}
}

#[tokio::test]
async fn case_sensitive_identifiers_aggregates() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this test is accidentally included in this PR. It currently exists in aggregates.rs:

https://github.com/apache/arrow-datafusion/blob/cc496f87caf8d13e4c17025e0d3b86643a21a243/datafusion/core/tests/sql/aggregates.rs#L1112

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, probably a merge issue. Tried to be in sync with fast changing files

@alamb alamb merged commit c3c02cf into apache:master Apr 28, 2022
@alamb
Copy link
Contributor

alamb commented Apr 28, 2022

Thanks again all!

@comphead comphead deleted the update_upstream branch April 28, 2022 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datafusion Changes in the datafusion crate development-process Related to development process of DataFusion enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement power function
4 participants