fix(ingest): Fix parse error when profiling BQ monthly partitioned tables #4591
""" | ||
partition_id_for_parse = partition.partition_id | ||
if len(partition_id_for_parse) == 6: | ||
partition_id_for_parse = ( |
Can we do something like this instead? I think it is much clearer:
partition_datetime = datetime.strptime(partition_id_for_parse, "%Y%m")
We need to use 'dateutil.parser' anyway, because we can have YEARLY and DAILY partitions as well. @treff7es
If we use 'datetime.strptime' only for the monthly case, we need two parsing functions ('datetime.strptime' and 'dateutil.parser.parse') to cover all cases. I think it is better to use a single function for consistency.
Also, 'datetime.strptime' would give a different result. For example, parsing '202204' with strptime yields '2022-04-01 00:00:00', whereas parsing '2022-04' with 'dateutil.parser.parse' on 2022-04-21 (today's date) yields '2022-04-21 00:00:00', because the missing day is filled in from the current date. @treff7es
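For comparison, the strptime-only alternative discussed in this thread could look roughly like the sketch below. This is not what the PR implements. The function name `parse_partition_id` and the length-to-format table are hypothetical, and the sketch assumes partition IDs are plain digit strings of the form {year}, {year}{month}, or {year}{month}{day}. It shows that strptime fills missing fields with fixed defaults (month 1, day 1) rather than with the current date.

```python
from datetime import datetime

# Hypothetical mapping from partition_id length to a strptime format.
# Assumes YEARLY, MONTHLY and DAILY IDs look like 2022, 202204, 20220421.
_FORMAT_BY_LENGTH = {
    4: "%Y",      # YEARLY
    6: "%Y%m",    # MONTHLY
    8: "%Y%m%d",  # DAILY
}


def parse_partition_id(partition_id: str) -> datetime:
    """Parse a partition ID using only datetime.strptime.

    Missing month/day fields default to 1, independent of today's date.
    """
    fmt = _FORMAT_BY_LENGTH.get(len(partition_id))
    if fmt is None:
        raise ValueError(f"Unsupported partition_id: {partition_id!r}")
    return datetime.strptime(partition_id, fmt)


if __name__ == "__main__":
    for pid in ("2022", "202204", "20220421"):
        print(pid, "->", parse_partition_id(pid))
```

The trade-off is the one raised above: this keeps the result deterministic, but it means maintaining a format table instead of relying on a single general-purpose parser.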
It seems these test failures are not caused by the code change in this PR.
Can you also check this PR? @anshbansal
@sb-sebkim Let me talk with @treff7es, who is assigned to this PR.
I encountered a ParserError while running BigQuery profiling during ingestion. The error occurs only when profiling a monthly partitioned table. The error message is as follows:
The BigQuery partition_id of a MONTHLY partition has the format {year}{month}, such as 202204.
dateutil.parser.parse interprets a six-digit string as {day}{month}{year}, so 202204 yields an invalid month of 22.
Thus we need to insert a separator such as '-' between 2022 and 04 to fix this error.
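The separator fix described above can be sketched as a small string transformation. This is an illustrative sketch, not the exact code in the diff: the helper name `normalize_partition_id` is hypothetical, and the extra `isdigit()` guard is an assumption added for safety.

```python
def normalize_partition_id(partition_id: str) -> str:
    """Insert '-' between year and month for six-digit (MONTHLY) IDs.

    Other IDs (e.g. YEARLY '2022' or DAILY '20220421') are returned
    unchanged. The isdigit() check is an added assumption, not part of
    the description above.
    """
    if len(partition_id) == 6 and partition_id.isdigit():
        return f"{partition_id[:4]}-{partition_id[4:]}"
    return partition_id


if __name__ == "__main__":
    print(normalize_partition_id("202204"))    # 2022-04
    print(normalize_partition_id("20220421"))  # 20220421
```

The normalized string (for example '2022-04') is then intended to be passed to 'dateutil.parser.parse', which is a third-party dependency and is therefore not imported in this sketch.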