Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix(parser, duckdb): decode/encode in duckdb don't take charset #1993

Merged
merged 3 commits into from
Aug 4, 2023

Conversation

charsmith
Copy link
Contributor

Duckdb doesn't take a param for charset on decode/encode.

Copy link
Collaborator

@georgesittas georgesittas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, left a few of comments.

I'm not sure if we should alter the base parser for this. I would personally just add a couple of entries in DuckDB's Parser.FUNCTIONS (see how it's done in presto.py), or update the presto dialect so that it routes to _parse_decode/encode in the base parser and reuse that in duckdb.

cc: @tobymao

sqlglot/dialects/duckdb.py Outdated Show resolved Hide resolved
sqlglot/dialects/duckdb.py Outdated Show resolved Hide resolved
tests/dialects/test_duckdb.py Show resolved Hide resolved
sqlglot/parser.py Outdated Show resolved Hide resolved
sqlglot/dialects/duckdb.py Outdated Show resolved Hide resolved
sqlglot/dialects/duckdb.py Outdated Show resolved Hide resolved
self.unsupported(f"Unsupported charset {charset}")

expression = expression.copy()
del expression.args["charset"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this, happy to change this if there is a better way.

Copy link
Collaborator

@georgesittas georgesittas Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just do return self.func(name, expression.this, expression.args.get("replace") if replace else None) (see other comment about replace flag).

Comment on lines +576 to +577
if "charset" in expression.args:
charset = expression.args["charset"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if "charset" in expression.args:
charset = expression.args["charset"]
charset = expression.args.get("charset")
if charset:

Comment on lines +174 to +178
FUNCTION_PARSERS = {
**parser.Parser.FUNCTION_PARSERS,
"ENCODE": lambda self: self._parse_encode_decode(exp.Encode),
"DECODE": lambda self: self._parse_encode_decode(exp.Decode),
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just do these inline I think, like:

            "DECODE": lambda self: self.expression(
                exp.Decode, this=self._parse_conjunction(), charset=exp.Literal.string("utf-8")
            ),
            "ENCODE": lambda self: self.expression(
                exp.Encode, this=self._parse_conjunction(), charset=exp.Literal.string("utf-8")
            ),

@@ -571,6 +571,20 @@ def datestrtodate_sql(self: Generator, expression: exp.DateStrToDate) -> str:
return self.sql(exp.cast(expression.this, "date"))


# Used for Presto and Duckdb which use functions that don't support charset, and assume utf-8
def encode_decode_sql(self: Generator, expression: exp.Expression, name: str) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you also need a boolean flag here to control whether or not replace will be generated.

@georgesittas
Copy link
Collaborator

Looks good, just a few final comments.

@georgesittas georgesittas merged commit c9dd971 into tobymao:main Aug 4, 2023
@georgesittas
Copy link
Collaborator

@charsmith I'll wrap this up, should be just a couple of changes. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants