Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse map(k1,v1,k2,v2) from hive to trino #99

Merged
merged 1 commit into from
Aug 20, 2021

Conversation

JiajunBernoulli
Copy link
Contributor

@JiajunBernoulli JiajunBernoulli commented Jun 13, 2021

Hi, everybody!
I'm interested in converting SQL from hive to trino. This is my first PR. Here's a little bit of my work:

For hive, SELECT MAP(k1, v1, k2, v2) is

{k1=v1, k2=v2};

but for trino, sql should be SELECT MAP(ARRAY[k1,k2], ARRAY[v1,v2])

Welcome your suggestions.

Comment on lines 78 to 86
private void unparseMapOperator(SqlWriter writer, SqlCall call, int leftPrec, int rightPrec) {
writer.keyword(call.getOperator().getName()); // "MAP"
final SqlWriter.Frame frame = writer.startList("(", ")");
writer.keyword("ARRAY[");
// Even numbers are keys
for (int i = 0; i < call.operandCount(); i += 2) {
writer.sep(",");
call.operand(i).unparse(writer, leftPrec, rightPrec);
}
writer.keyword("], ARRAY[");
// Odd numbers are values
for (int i = 1; i < call.operandCount(); i += 2) {
call.operand(i).unparse(writer, leftPrec, rightPrec);
// Last comma is redundant
if (i < call.operandCount() - 1) {
writer.sep(",");
}
}
writer.keyword("]");
writer.endList(frame);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not we handle the transformation in Calcite2TrinoUDFConverter? I think if the operands need transformation, this will not take care of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand.
So far, it seems operands number must be determined because CalciteTrinoUDFMap.getUDFTransformer(String calciteOpName, int numOperands). However map is variable length parameter, I can't put it in UDF_MAP.
Do we plan to support functions with variable length parameters? I thinks this is meaningful topic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Maybe we can introduce getUDFTransformer(String calciteOpName) that means the Op takes variable length parameters?

@JiajunBernoulli JiajunBernoulli force-pushed the hive2trino branch 3 times, most recently from bdae026 to 7612eec Compare June 19, 2021 10:40
@JiajunBernoulli
Copy link
Contributor Author

Because I can't put map in UDF_MAP, I can only handle the transformation in Calcite2TrinoUDFConverter.convertMapValueConstructor and replace "[]" with "()" in TrinoSqlDialect.unparseMapValueConstructor.

Copy link
Collaborator

@ljfgem ljfgem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your PR! I think we need to handle the case map(key, map(...))

// Odd numbers are values
List<RexNode> values = new ArrayList<>();
for (int i = 1; i < sourceOperands.size(); i += 2) {
values.add(sourceOperands.get(i));
Copy link
Collaborator

@ljfgem ljfgem Jul 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If value is also a map, we failed to convert it.
For example, if the view definition is:

SELECT MAP('key1', MAP('a', 'b', 'c', 'd'), 'key2', MAP('a', 'b', 'c', 'd')) AS col_map

it would only be converted to

SELECT MAP (ARRAY['key1', 'key2'], ARRAY[MAP ('a', 'b', 'c', 'd'), MAP ('a', 'b', 'c', 'd')]) AS "col_map"

However, the result should be

SELECT MAP (ARRAY['key1', 'key2'], ARRAY[MAP (ARRAY['a', 'c'], ARRAY['b', 'd']), MAP (ARRAY['a', 'c'], ARRAY['b', 'd'])]) AS "col_map"

I think we can change line 208 to be

List<RexNode> convertedOperands = visitList(call.getOperands(), (boolean[]) null);

and add corresponding unit tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you and will try to fix it.

@JiajunBernoulli JiajunBernoulli force-pushed the hive2trino branch 3 times, most recently from 73474a0 to 128fc87 Compare August 14, 2021 06:34
@JiajunBernoulli
Copy link
Contributor Author

I updated this PR, Do you have any suggestions?

@wmoustafa
Copy link
Contributor

@ljfgem PTAL.

@ljfgem
Copy link
Collaborator

ljfgem commented Aug 20, 2021

@ljfgem PTAL.

Thank you for reminding, integration test is running, will merge if there is no regression.

@ljfgem ljfgem merged commit 570add2 into linkedin:master Aug 20, 2021
KevinGe00 added a commit to KevinGe00/coral that referenced this pull request Feb 16, 2024
KevinGe00 added a commit that referenced this pull request Apr 11, 2024
…de layer (#491)

* Initial commit for migrating Cast

* add type derivation changes

* type derivation enhance and test debug

* remove testing files

* spotlessapply + comment out failing unit test

* update unit tests

* fix regression by TypeDerivationUtil on casts with join on simple aliases

* generalize case of casting with join on simple aliases

* remove extraneous input to dummy select

* temporary removal of type util postprocessor for i-test

* revert

* spotless checks

* update calcite version for linkedin-calcite #98/#99

* remove Calcite2TrinoUDFConverter

* clean up tests

* unused imports

* spotless

* improve javadoc/documentation

* fix javadoc

* improve javadoc

* grammar

* refactor + clean up

---------

Co-authored-by: aastha25 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants