feat(ingestion): schema, table filtering for redshift-usage #4396
Trying to implement this feature request:
@@ -223,7 +223,12 @@ def _get_redshift_history(
event_dict["endtime"] = event_dict.get("endtime").__str__()

logger.debug(f"event_dict: {event_dict}")
events.append(event_dict)
# filter based on schema and table pattern
if self.config.schema_pattern.allowed(event_dict['schema'])\
Any plans to include view_pattern as well?
@rslanka, it seems the script wasn't ingesting view-related usage stats in the first place. I looked through the code and couldn't find anything for it. I also did a small test run, and view usage didn't get ingested.
"email_domain": "acryl.io",
"include_views": True,
"include_tables": True,
"schema_pattern": {
Why not add a table_pattern to this test as well?
Good point, I will include this as well
Added this test case
@Abhiram98, looks great overall! It would be great if you could address the comments above.
):
events.append(event_dict)
else:
logger.debug(
Please add this to the source report as well. See [this example](https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/sql/sql_common.py#L187).
Hey @rslanka, I have added this to the reporting as well
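For readers following the thread, here is a minimal sketch of what "adding this to the report" could look like. It is not the code from this PR: `UsageReport`, `report_dropped`, and `filter_events` are hypothetical stand-ins, and the `table` key on the event dict is an assumption (only `schema` appears in the diff above).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class UsageReport:
    """Hypothetical stand-in for a source report; not DataHub's actual class."""

    filtered: List[str] = field(default_factory=list)

    def report_dropped(self, name: str) -> None:
        # Remember the name of every entity that was filtered out.
        self.filtered.append(name)


def filter_events(
    events: List[Dict[str, str]],
    is_allowed: Callable[[Dict[str, str]], bool],
    report: UsageReport,
) -> List[Dict[str, str]]:
    """Keep allowed events; record each rejected one in the report."""
    kept = []
    for event in events:
        if is_allowed(event):
            kept.append(event)
        else:
            # The "table" key is assumed for illustration.
            report.report_dropped(f"{event['schema']}.{event['table']}")
    return kept
```

Recording the dropped names in the report, in addition to the debug log, makes the filtering visible in the ingestion summary without raising the log level.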
Looks good. @Abhiram98, could you please address the change requested here (and we can merge)?
LGTM!
…project#4396) * Filter based on table/schema pattern + documentation Co-authored-by: Ravindra Lanka <[email protected]>
Filter redshift usage statistics based on schema and table patterns.
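As an illustration of the idea, the sketch below shows one way allow/deny regex patterns for schemas and tables can be combined to decide whether a usage event is kept. `AllowDeny` and `keep_event` are hypothetical names written for this example, not the classes used in the PR, and the `table` key on the event dict is assumed.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AllowDeny:
    """Hypothetical stand-in for an allow/deny regex filter."""

    allow: List[str] = field(default_factory=lambda: [".*"])
    deny: List[str] = field(default_factory=list)

    def allowed(self, value: str) -> bool:
        # A deny match always wins. re.match anchors at the start of the string only.
        if any(re.match(pattern, value) for pattern in self.deny):
            return False
        return any(re.match(pattern, value) for pattern in self.allow)


def keep_event(
    event: Dict[str, str], schema_pattern: AllowDeny, table_pattern: AllowDeny
) -> bool:
    """An event is kept only if both its schema and its table pass."""
    return schema_pattern.allowed(event["schema"]) and table_pattern.allowed(
        event["table"]  # assumed key, for illustration
    )


schema_pattern = AllowDeny(allow=["public"])
table_pattern = AllowDeny(deny=["tmp_.*"])
print(keep_event({"schema": "public", "table": "orders"}, schema_pattern, table_pattern))
```

Requiring both patterns to pass means a deny rule on either the schema or the table is enough to drop the event.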
Checklist