Support columns having the same alias #6543
Comments
Thanks @berkaysynnada for raising this. This is quite an old problem: DF has a unique column name check in the planner. We planned to move this check one level up, so that the query fails only if an outer query references an inner query containing duplicated aliases. Is your scenario a real-world one? |
Not actually, it was a hypothetical trial. Your plan makes sense and thanks for letting me know about it. |
I also filed #6758 to think about the problem with large column names. |
I wonder if some potential solution for this issue would be to automatically add a string to make the columns unique in the arrow schema? |
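A minimal sketch of that idea, written in Python rather than DataFusion's Rust: repeated names get a numeric suffix so the resulting schema has no duplicates. The function name `make_unique` and the `:N` suffix format are assumptions made for illustration, not anything DataFusion is known to do.

```python
def make_unique(names):
    """Return a copy of `names` where repeated entries get a ':N' suffix.

    Hypothetical helper for illustration only; the suffix format is an assumption.
    """
    taken = set(names)  # reserve every original name so a suffix never collides with one
    seen = set()        # names already emitted
    out = []
    for name in names:
        if name not in seen:
            seen.add(name)
            out.append(name)
            continue
        # Repeated name: find the smallest suffix that is still free.
        n = 1
        candidate = f"{name}:{n}"
        while candidate in taken or candidate in seen:
            n += 1
            candidate = f"{name}:{n}"
        seen.add(candidate)
        out.append(candidate)
    return out


print(make_unique(["c1", "c1"]))              # ['c1', 'c1:1']
print(make_unique(["Int64(1)", "Int64(1)"]))  # ['Int64(1)', 'Int64(1):1']
```

One trade-off of this approach is that the user-visible column names no longer match what was written in the query, which may or may not be acceptable.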
That is a good idea, btw. Currently we have a bunch of issues
|
This is a legacy issue. Generally, we won't raise an error for having columns with the same name unless an outer subquery references that column name. In terms of this issue itself, we should fix it in the planner. |
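The "only fail when the name is referenced" behaviour can be sketched roughly as follows, in Python and independent of DataFusion's actual planner code. The names `resolve` and `AmbiguousColumn` are hypothetical; the point is only that duplicates are tolerated in the schema and the error is deferred to lookup time.

```python
class AmbiguousColumn(Exception):
    """Raised when a referenced name matches more than one column (hypothetical)."""


def resolve(schema, name):
    """Return the index of `name` in `schema`, a list of output column names.

    Duplicate names in `schema` are allowed; an error is raised only when the
    referenced name itself is duplicated.
    """
    matches = [i for i, col in enumerate(schema) if col == name]
    if not matches:
        raise KeyError(name)
    if len(matches) > 1:
        raise AmbiguousColumn(f"column reference {name!r} is ambiguous")
    return matches[0]


# Inner projection with a duplicated alias, as in `SELECT ts AS c1, inc_col AS c1, ... AS c2`.
inner = ["c1", "c1", "c2"]

print(resolve(inner, "c2"))  # 2: the unique name resolves fine
try:
    resolve(inner, "c1")     # only this reference is rejected
except AmbiguousColumn as err:
    print(err)
```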
🤯 I didn't realize this worked
(arrow_dev) alamb@MacBook-Pro-8:~/Software/influxdb_iox2$ datafusion-cli
DataFusion CLI v26.0.0
❯ create table foo (x int) as values (1), (2), (3);
0 rows in set. Query took 0.003 seconds.
❯ select x as "my_col", x as "my col" from foo;
+--------+--------+
| my_col | my col |
+--------+--------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+--------+--------+
3 rows in set. Query took 0.005 seconds.
❯ select x as "my_col", x+1 as "my col" from foo;
+--------+--------+
| my_col | my col |
+--------+--------+
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
+--------+--------+
However, using
❯ select x as c1, x as c1 from foo;
Error during planning: Projections require unique expression names but the expression "foo.x AS c1" at position 0 and "foo.x AS c1" at position 1 have the same name. Consider aliasing ("AS") one of them.
it fails, because DF has a uniqueness check on projection column names. |
Oh man, 🤦 -- I missed the |
An extra motivation to get this right is that sqlite's sqllogictest suite has a lot of these. You'll get better test coverage if you support this. |
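For reference, SQLite itself accepts such queries. A small check using Python's standard `sqlite3` module (the table and values mirror the `foo` example earlier in the thread; this says nothing about how sqllogictest drives the queries):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table foo (x int)")
conn.executemany("insert into foo values (?)", [(1,), (2,), (3,)])

# Two output columns share the alias c1; SQLite plans and runs this without error.
cur = conn.execute("select x as c1, x as c1 from foo order by x")
names = [d[0] for d in cur.description]
rows = cur.fetchall()
print(names)  # ['c1', 'c1']
print(rows)   # [(1, 1), (2, 2), (3, 3)]

# Repeated literals are accepted as well.
literal_rows = conn.execute("select 1, 1").fetchall()
print(literal_rows)  # [(1, 1)]
```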
I wonder if someone has time to file a ticket with the idea to re(use) sqlite's sqllogictest suite? |
Another case is when selecting literals with the same value:
DataFusion CLI v37.0.0
❯ select 1, 1;
Error during planning: Projections require unique expression names but the expression "Int64(1)" at position 0 and "Int64(1)" at position 1 have the same name. Consider aliasing ("AS") one of them.
❯ |
Yes, I was thinking the other day about allowing queries like that and checking uniqueness only from outer queries. |
@alamb I don't think you can directly run SQLite's test suite against just datafusion, there's a lot of I have an early-stage OLTP database project in the works using datafusion and that survives a decent fraction of sqllogictest. This issue is one of the remaining big limitations, along with |
This would be a great item in the broader scope of #12723, which intends to make DataFusion logical plans state of the art. |
Describe the bug
When we give the same alias to multiple columns (
SELECT ts as c1, inc_col as c1 FROM annotated_data_infinite
), the builder returns this error:
Plan("Projections require unique expression names but the expression \"annotated_data_infinite.ts AS c1\" at position 0 and \"annotated_data_infinite.inc_col AS c1\" at position 1 have the same name. Consider aliasing (\"AS\") one of them.")
To Reproduce
Expected behavior
PostgreSQL can handle it and returns a result with two columns having the same name. I don't know whether this is intentional behaviour in DataFusion or a bug, but I would like to open an issue.
Additional context
No response