-
Notifications
You must be signed in to change notification settings - Fork 773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: add apache doris dialect #2006
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for refactoring the previous PR.
I think we should revert all changes in the HTML files, because:
- They're not necessary since they're auto-generated
- There might be conflicts when the github action tries to generate them later
DATEINT_FORMAT = "'yyyyMMdd'" | ||
TIME_FORMAT = "'yyyy-MM-dd HH:mm:ss'" | ||
|
||
TIME_MAPPING = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several entries here are also in MySQL's TIME_MAPPING
dict. Do Doris and MySQL have the same time mapping? Should we enhance MySQL's / reuse some existing ones here by expanding the superclass' dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, mysql and doris have some similar mappings, but we found that some mappings are not supported when going from hive to doris, so we added some on the previous basis to adapt to the syntax conversion of hive to doris
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how similar is it to starrocks? we support that as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The degree of compatibility with starrocks is similar to that of mysql. starrcoks came from fork doris and went out alone, but with the development and improvement of the community later, it will become more and more different from starrocks, and I plan to spend more time to perfect it Convert different data sources into doris.thanks
"DATE_TRUNC": lambda args: exp.TimestampTrunc( | ||
this=seq_get(args, 1), unit=seq_get(args, 0) | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this exact thing in both Postgres and Starrocks already. Let's dry it out into a helper in dialect.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's dry it out into a helper in dialect.py.
Sorry, I don't quite understand the meaning of this sentence, do you mean to implement this method in dialect.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes dry means don’t repeat yourself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please clean up any instance of copy and paste
sqlglot/dialects/doris.py
Outdated
exp.DateDiff: rename_func("DATEDIFF"), | ||
exp.RegexpLike: rename_func("REGEXP"), | ||
exp.Coalesce: rename_func("NVL"), | ||
exp.CurrentTimestamp: lambda self, e: "NOW()", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exp.CurrentTimestamp: lambda self, e: "NOW()", | |
exp.CurrentTimestamp: lambda *_: "NOW()", |
exp.Split: rename_func("SPLIT_BY_STRING"), | ||
exp.Quantile: rename_func("PERCENTILE"), | ||
exp.ApproxQuantile: rename_func("PERCENTILE_APPROX"), | ||
exp.TimeStrToUnix: rename_func("UNIX_TIMESTAMP"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed, already exists in MySQL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok,thanks
Edited based on all comments, hopefully incorporated, thanks |
Thanks @liujiwen-up, will review again soon |
def _to_date_sql(self: MySQL.Generator, expression: exp.TsOrDsToDate) -> str: | ||
this = self.sql(expression, "this") | ||
self.format_time(expression) | ||
return f"TO_DATE({this})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unnecessary, let's get rid of it.
def _time_format( | ||
self: generator.Generator, expression: exp.UnixToStr | exp.StrToUnix | ||
) -> t.Optional[str]: | ||
time_format = self.format_time(expression) | ||
if time_format == Doris.TIME_FORMAT: | ||
return None | ||
return time_format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should factor this out into a helper in dialect.py
, we do the exact same thing in Hive.
Implemented dialect support for Apache Doris