feat(ingest/unity): enable hive metastore ingestion #9416
for unity-catalog enabled databricks workspaces
bfd5a53 to 25bd4db
schema_fields.extend(
    get_schema_fields_for_hive_column(
        col.name, col.data_type.lower(), description=col.comment
with patch.object(HiveColumnToAvroConverter, "_STRUCT_TYPE_SEPARATOR", " "):
I'd much rather have `converter = HiveColumnToAvroConverter(struct_type_separator=" "); converter.some_method(...)`.
We really shouldn't need to monkeypatch our own code.
I agree. This change only replaces the existing logic, which interfered with the Unity Catalog logic when running unit tests. Let me add the impacted tests to another test batch and add a TODO here about the need for this refactor. I'd prefer to do it in a separate PR.
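For readers following this thread, here is a minimal sketch of the two approaches being contrasted. Both classes are hypothetical stand-ins, not the real `HiveColumnToAvroConverter`; the `split_field` method, the `":"` default separator, and the constructor signature are assumptions made only for illustration.

```python
from typing import Tuple
from unittest.mock import patch


class PatchedConverter:
    """Hypothetical stand-in: the separator is a class attribute."""

    _STRUCT_TYPE_SEPARATOR = ":"  # assumed default, for illustration only

    @classmethod
    def split_field(cls, field: str) -> Tuple[str, str]:
        # Split "name<sep>type" on the first occurrence of the separator.
        name, _, type_ = field.partition(cls._STRUCT_TYPE_SEPARATOR)
        return name.strip(), type_.strip()


class ConfigurableConverter:
    """Hypothetical stand-in: the separator is instance configuration."""

    def __init__(self, struct_type_separator: str = ":") -> None:
        self.struct_type_separator = struct_type_separator

    def split_field(self, field: str) -> Tuple[str, str]:
        name, _, type_ = field.partition(self.struct_type_separator)
        return name.strip(), type_.strip()


# Approach in the diff: temporarily monkeypatch the class attribute.
with patch.object(PatchedConverter, "_STRUCT_TYPE_SEPARATOR", " "):
    patched_result = PatchedConverter.split_field("id bigint")

# Approach suggested in review: pass the separator to an instance.
converter = ConfigurableConverter(struct_type_separator=" ")
instance_result = converter.split_field("id bigint")
```

Both calls yield the same pair here, but the instance-based version keeps the override local to one object instead of mutating shared class state for the duration of the `with` block, which is what made the patched version interact with other tests.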
metadata-ingestion/src/datahub/ingestion/source/unity/config.py
def schema_pattern_should__always_deny_information_schema(
    cls, v: AllowDenyPattern
) -> AllowDenyPattern:
    v.deny.append(".*\\.information_schema")
This pattern of "there's a few extra db/schemas that we always want to deny" happens in other sources too, and we've been abusing the user-facing allow/deny pattern to set "system" configs.
When we revisit the SQL common refactoring, I'd like to think about moving system-level deny patterns to a class variable instead of reusing the user-facing config.
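One way the class-variable idea above might look is sketched below. Everything here is hypothetical: `SimpleAllowDeny` is a simplified stand-in for the real `AllowDenyPattern`, and `HypotheticalSource`, `SYSTEM_DENY_PATTERNS`, and `is_schema_allowed` are names invented for this example, not part of the codebase.

```python
import re
from dataclasses import dataclass, field
from typing import ClassVar, List, Tuple


@dataclass
class SimpleAllowDeny:
    """Simplified stand-in for a user-facing allow/deny config."""

    allow: List[str] = field(default_factory=lambda: [".*"])
    deny: List[str] = field(default_factory=list)

    def allowed(self, name: str) -> bool:
        # Deny patterns take precedence over allow patterns.
        if any(re.match(p, name, re.IGNORECASE) for p in self.deny):
            return False
        return any(re.match(p, name, re.IGNORECASE) for p in self.allow)


class HypotheticalSource:
    # System-level denies live on the class, not in the user's config.
    SYSTEM_DENY_PATTERNS: ClassVar[Tuple[str, ...]] = (
        r".*\.information_schema",
    )

    def __init__(self, schema_pattern: SimpleAllowDeny) -> None:
        self.schema_pattern = schema_pattern

    def is_schema_allowed(self, qualified_name: str) -> bool:
        # Check system denies first, then defer to the user's patterns.
        for pattern in self.SYSTEM_DENY_PATTERNS:
            if re.match(pattern, qualified_name, re.IGNORECASE):
                return False
        return self.schema_pattern.allowed(qualified_name)


user_pattern = SimpleAllowDeny()
source = HypotheticalSource(user_pattern)
info_schema_allowed = source.is_schema_allowed("main.information_schema")
sales_allowed = source.is_schema_allowed("main.sales")
```

In this sketch the user's `deny` list is never mutated, so the config a user sees stays exactly what they wrote, while the system-level exclusions are still applied on every check.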
metadata-ingestion/src/datahub/ingestion/source/unity/report.py
9689cf7 to f4e012e