Fix(optimizer): don't merge CTEs with EXPLODE projections into outer scopes #2366

Merged: 1 commit, Oct 3, 2023

georgesittas
Collaborator

Before:

>>> query = """
... SELECT Name,
...        FruitStruct.`$id`,
...        FruitStruct.value
...   FROM
...        (SELECT Name,
...                explode(Fruits) as FruitStruct
...           FROM fruits_table)
... """
>>> from sqlglot.optimizer import optimize
>>> print(optimize(query, dialect="spark").sql(dialect="spark", pretty=True))
SELECT
  `fruits_table`.`name` AS `name`,
  EXPLODE(`fruits_table`.`fruits`).`$id` AS `$id`,
  EXPLODE(`fruits_table`.`fruits`).`value` AS `value`
FROM `fruits_table` AS `fruits_table`

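Spark rejects struct field access applied directly to an EXPLODE call, such as EXPLODE(...).`$id`, so this PR keeps subqueries and CTEs with EXPLODE projections from being merged into the outer scope. The failing and working forms in Spark: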

spark-sql (default)> SELECT EXPLODE(ARRAY(CAST(STRUCT('fooo') AS STRUCT<a: STRING>))).a;
[FIELD_NOT_FOUND] No such struct field `a` in `col`.
spark-sql (default)> WITH cte(col) AS (SELECT EXPLODE(ARRAY(CAST(STRUCT('fooo') AS STRUCT<a: STRING>)))) SELECT col.a FROM cte;
a
fooo
Time taken: 3.967 seconds, Fetched 1 row(s)

@georgesittas georgesittas requested a review from tobymao October 3, 2023 14:24
@tobymao tobymao merged commit 0e93890 into main Oct 3, 2023
@tobymao tobymao deleted the jo/explode_spark_fix branch October 3, 2023 15:36