Feat: add apache doris dialect #2006

Merged 14 commits on Aug 9, 2023

Conversation

liujiwen-up
Contributor

Implemented dialect support for Apache Doris

Collaborator

@georgesittas left a comment

Thanks for refactoring the previous PR.

I think we should revert all changes in the HTML files, because:

  • They're not necessary since they're auto-generated
  • There might be conflicts when the GitHub Action tries to generate them later

sqlglot/dialects/doris.py
DATEINT_FORMAT = "'yyyyMMdd'"
TIME_FORMAT = "'yyyy-MM-dd HH:mm:ss'"

TIME_MAPPING = {

Collaborator

Several entries here are also in MySQL's TIME_MAPPING dict. Do Doris and MySQL have the same time mapping? Should we enhance MySQL's mapping, or reuse the existing entries here by extending the superclass's dict?

Contributor Author

Yes, MySQL and Doris share some mappings, but we found that some mappings are not supported when converting from Hive to Doris, so we added entries on top of the existing ones to support Hive-to-Doris syntax conversion.

Owner

How similar is it to StarRocks? We support that as well.

Contributor Author

Doris is about as compatible with StarRocks as it is with MySQL. StarRocks started as a fork of Doris and then went its own way, and as the community keeps developing and improving Doris, it will diverge more and more from StarRocks. I plan to spend more time improving the conversion of different data sources into Doris. Thanks.
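
The "expand the superclass' dict" idea from this thread could be sketched as follows. This is a minimal illustration, not the PR's code: `MySQL` and `Doris` are stand-in classes rather than the real sqlglot dialects, and the mapping entries are placeholders chosen only to show the pattern.

```python
# Stand-in classes; the real dialects live in sqlglot.dialects and are much larger.
class MySQL:
    # Placeholder format-token entries, for illustration only.
    TIME_MAPPING = {
        "%M": "%B",
        "%c": "%-m",
    }


class Doris(MySQL):
    # Reuse every parent entry and add only what is specific to the subclass.
    TIME_MAPPING = {
        **MySQL.TIME_MAPPING,
        "%p": "%p",  # hypothetical extra entry
    }


# Unpacking builds a new dict, so the parent's mapping is left untouched.
assert "%M" in Doris.TIME_MAPPING
assert "%p" not in MySQL.TIME_MAPPING
```

With this shape, any entry later added to the parent's dict is picked up by the subclass without being copied by hand.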

Comment on lines +59 to +61
"DATE_TRUNC": lambda args: exp.TimestampTrunc(
    this=seq_get(args, 1), unit=seq_get(args, 0)
),

Collaborator

We do this exact thing in both Postgres and Starrocks already. Let's dry it out into a helper in dialect.py.

Contributor Author

> Let's dry it out into a helper in dialect.py.

Sorry, I don't quite understand this sentence. Do you mean I should implement this method in dialect.py?

Owner

Yes, DRY means "don't repeat yourself".

Owner

Please clean up any instances of copy and paste.
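
A rough sketch of what the requested refactor might look like: the repeated lambda becomes one named function that every dialect's function table points to. The helper name `parse_timestamp_trunc`, the `*_FUNCTIONS` dicts, and the `TimestampTrunc` dataclass are assumptions made for this example, and `seq_get` is a local stand-in so the snippet runs without sqlglot installed.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


def seq_get(seq: List[Any], index: int) -> Optional[Any]:
    """Stand-in for a safe indexer: return seq[index], or None if out of range."""
    try:
        return seq[index]
    except IndexError:
        return None


@dataclass
class TimestampTrunc:
    """Stand-in for the expression node built by the original lambda."""
    this: Any
    unit: Any


def parse_timestamp_trunc(args: List[Any]) -> TimestampTrunc:
    # DATE_TRUNC(unit, value): the unit is argument 0, the value is argument 1.
    return TimestampTrunc(this=seq_get(args, 1), unit=seq_get(args, 0))


# Each dialect references the shared helper instead of repeating the lambda.
POSTGRES_FUNCTIONS = {"DATE_TRUNC": parse_timestamp_trunc}
STARROCKS_FUNCTIONS = {"DATE_TRUNC": parse_timestamp_trunc}
DORIS_FUNCTIONS = {"DATE_TRUNC": parse_timestamp_trunc}
```

Keeping the argument-order logic in one place means a future fix to it applies to all three dialects at once.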

sqlglot/dialects/doris.py
exp.DateDiff: rename_func("DATEDIFF"),
exp.RegexpLike: rename_func("REGEXP"),
exp.Coalesce: rename_func("NVL"),
exp.CurrentTimestamp: lambda self, e: "NOW()",

Collaborator

Suggested change
- exp.CurrentTimestamp: lambda self, e: "NOW()",
+ exp.CurrentTimestamp: lambda *_: "NOW()",
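
A small sketch of why this suggestion preserves behavior, assuming the transform is called with two positional arguments (a generator and an expression), as the original lambda's signature implies. The `object()` values are placeholders for those two arguments.

```python
# Original form: names both parameters but never uses them.
named = lambda self, e: "NOW()"

# Suggested form: accepts any positional arguments and discards them.
ignored = lambda *_: "NOW()"

generator, expression = object(), object()  # placeholders for the real arguments
assert named(generator, expression) == ignored(generator, expression) == "NOW()"
```

The `*_` spelling just signals to readers that the arguments are intentionally unused.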

sqlglot/dialects/doris.py
exp.Split: rename_func("SPLIT_BY_STRING"),
exp.Quantile: rename_func("PERCENTILE"),
exp.ApproxQuantile: rename_func("PERCENTILE_APPROX"),
exp.TimeStrToUnix: rename_func("UNIX_TIMESTAMP"),

Collaborator

This should be removed; it already exists in MySQL.

Contributor Author

OK, thanks.

@georgesittas changed the title from "add apache doris dialect" to "Feat: add apache doris dialect" on Aug 8, 2023
@liujiwen-up
Contributor Author

I've made edits based on all the comments; hopefully everything is incorporated. Thanks.

@georgesittas
Collaborator

Thanks @liujiwen-up, will review again soon

sqlglot/dialects/doris.py
Comment on lines +15 to +18
def _to_date_sql(self: MySQL.Generator, expression: exp.TsOrDsToDate) -> str:
    this = self.sql(expression, "this")
    self.format_time(expression)
    return f"TO_DATE({this})"

Collaborator

This is unnecessary, let's get rid of it.

Comment on lines +21 to +27
def _time_format(
    self: generator.Generator, expression: exp.UnixToStr | exp.StrToUnix
) -> t.Optional[str]:
    time_format = self.format_time(expression)
    if time_format == Doris.TIME_FORMAT:
        return None
    return time_format

Collaborator

We should factor this out into a helper in dialect.py; we do the exact same thing in Hive.
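
One way the shared helper could be shaped is a small factory that takes the dialect and returns the per-dialect function. This is a sketch under assumptions: the factory name `time_format`, the `FakeGenerator` class, and the stand-in `Doris` class are invented for the example, and only the `TIME_FORMAT` value is taken from the snippet reviewed earlier in this PR.

```python
from typing import Callable, Optional


class Doris:
    """Stand-in dialect carrying only the default time format."""
    TIME_FORMAT = "'yyyy-MM-dd HH:mm:ss'"


class FakeGenerator:
    """Stand-in generator whose format_time returns a preset string."""

    def __init__(self, fmt: str) -> None:
        self._fmt = fmt

    def format_time(self, expression: object) -> str:
        return self._fmt


def time_format(dialect: type) -> Callable[[FakeGenerator, object], Optional[str]]:
    """Build a function that drops the format when it equals the dialect default."""

    def _time_format(self: FakeGenerator, expression: object) -> Optional[str]:
        fmt = self.format_time(expression)
        # Returning None lets the caller omit a redundant format argument.
        return None if fmt == dialect.TIME_FORMAT else fmt

    return _time_format


# Each dialect gets its own function from the same factory.
doris_time_format = time_format(Doris)
```

A Hive dialect could then obtain its own version by passing its class to the same factory, so the comparison logic is written only once.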
