refactor(ingest): move Airflow into datahub_provider module #2521

Conversation
> How do you feel about renaming the `datahub_provider` package?

@sunkickr makes sense - just made the change
Just like my comment on tests, example Airflow DAGs can also be included in `datahub_provider`, like in the sample repo here: https://github.com/astronomer/airflow-provider-sample/tree/main/sample_provider/example_dags. This way all the Airflow pieces can live in the provider, and the registry will be able to build a page for each example DAG in the `example_dags` directory, like here: https://registry.astronomer-stage.io/providers/google/example-dags/example-automl-nl-text-classification
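For reference, a provider laid out along the lines of the sample repo might look roughly like this. The subpackage names come from this PR's description; the example DAG file name is hypothetical:

```
datahub_provider/
├── hooks/
├── operators/
├── lineage/
└── example_dags/
    └── example_datahub_dag.py   # hypothetical example DAG picked up by the registry
```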
You may want to store tests for the operators, hooks, and the lineage backend in `datahub_provider`, as done in the sample repo: https://github.com/astronomer/airflow-provider-sample/tree/main/tests. This way, modules from `datahub_provider` won't need to be imported into `datahub`.
Not sure I'm following the suggestion - the tests in both the sample repo and in this one are stored separately from the main source directories.

The `datahub` imports of `datahub_provider` are primarily for backwards compatibility, so I'm pretty ok with leaving those in the code.
That makes sense; I was just suggesting it because module tests are usually packaged with their modules.
@sunkickr - I'd like to keep the [...]. I have gone through and updated all the imports to use the `datahub_provider` module.
@hsheth2 - Maybe we could just duplicate the example dags repo into the provider package so they are easily discoverable in both the datahub repo and the registry. Example dags located in the `datahub_provider` module [...]
Everything looks good to me structurally.
LGTM!
- Airflow users expect stuff to be in `datahub_provider.{hooks,operators,lineage}`.
- Maintains backwards compatibility by re-exporting from the `datahub.integrations.airflow` files.

Checklist
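The re-export approach described above can be sketched in plain Python. This is a minimal stand-alone simulation, not the PR's actual code: the module and class names below (`fake_datahub_provider_hooks`, `DatahubRestHook` as a bare stand-in class) are built in-memory with stdlib machinery so the pattern can be shown without the real packages installed.

```python
import importlib
import sys
import types

# "New" home of the code, standing in for datahub_provider.hooks.datahub.
new_mod = types.ModuleType("fake_datahub_provider_hooks")

class DatahubRestHook:  # stand-in for the real hook class
    pass

new_mod.DatahubRestHook = DatahubRestHook
sys.modules[new_mod.__name__] = new_mod

# "Old" path, standing in for datahub.integrations.airflow.hooks, kept as a
# thin shim whose entire body is equivalent to:
#     from datahub_provider.hooks.datahub import DatahubRestHook
old_mod = types.ModuleType("fake_datahub_integrations_airflow_hooks")
old_mod.DatahubRestHook = new_mod.DatahubRestHook
sys.modules[old_mod.__name__] = old_mod

# Both the old and the new import paths resolve to the same class object,
# so existing user code keeps working after the move.
a = importlib.import_module("fake_datahub_provider_hooks").DatahubRestHook
b = importlib.import_module("fake_datahub_integrations_airflow_hooks").DatahubRestHook
print(a is b)  # True
```

In the real shim file, the body would just be a `from datahub_provider... import ...` re-export, which gives old import paths the identical objects rather than copies.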