
Save manifest JSON files to the configured analytical database #4208

Closed
1 task done
bashyroger opened this issue Nov 4, 2021 · 5 comments
Labels
enhancement (New feature or request), stale (Issues that have gone stale)

Comments

@bashyroger

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

While it is great that dbt generates JSON-structured artifact / logging metadata as described here: https://docs.getdbt.com/reference/artifacts/dbt-artifacts/#notes
...these files are of little use if they are not saved to the configured analytical database.

If this were done, then a package like the tails.com dbt_artifacts package would not need the cumbersome steps described at the end of https://github.com/tailsdotcom/dbt_artifacts, where you have to upload these files to your database yourself...

This is even more problematic when you are solely on DBT cloud, as we are.

So while I realize that dbt is the "T" in an EL-T workflow, I think an exception should be made for dbt-related logging metadata, which, being structured as JSON, can (for example) be loaded into a Snowflake variant data type with ease.
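For context on "with ease": today this amounts to the manual PUT workaround. A minimal sketch of the Snowflake statements involved (the table, stage, and column names here are hypothetical, and you would execute the statements via a Snowflake connection after `dbt run`):

```python
# Sketch: load a dbt artifact (e.g. target/manifest.json) into a Snowflake
# VARIANT column. Table/stage names are hypothetical placeholders.

def snowflake_load_statements(artifact_path: str, table: str, stage: str) -> list:
    """Build the PUT + COPY INTO statements for one JSON artifact file."""
    return [
        # Upload the local file to an internal stage (compressed by default).
        f"PUT file://{artifact_path} @{stage} AUTO_COMPRESS=TRUE",
        # Load the whole JSON document into a single VARIANT column.
        f"COPY INTO {table} (data) FROM @{stage} "
        f"FILE_FORMAT = (TYPE = 'JSON', STRIP_OUTER_ARRAY = FALSE)",
    ]

for stmt in snowflake_load_statements(
    "target/manifest.json", "dbt_artifacts.manifest", "dbt_artifact_stage"
):
    print(stmt)
```

The point of the request is that dbt itself could run the equivalent of these two statements at the end of a run, rather than leaving it to each user.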

Describe alternatives you've considered

Doing this manually; that is neither scalable nor maintainable.

Who will this benefit?

Everyone who wants to do analytics on dbt metadata, or who needs to store this logging metadata for compliance.

Are you interested in contributing this feature?

No response

Anything else?

No response

@bashyroger bashyroger added the enhancement (New feature or request) and triage labels on Nov 4, 2021
@ChenyuLInx
Contributor

Hey @bashyroger, thanks for making this request! It is a great question and we have been thinking about it.

We believe the best way to support answering those questions is via the dbt Cloud Metadata API. That's the most reliable, resilient, least crufty way to access this info. We are aware that you still have to use the Admin API to get job IDs to then pass to the Metadata API, and that this is a separate step outside the normally scheduled dbt run; we are actively looking for solutions to resolve that.
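For readers unfamiliar with it, the Metadata API is a GraphQL endpoint queried once per job. A minimal sketch of such a request (the endpoint URL, field names, and auth header are assumptions based on the public docs at the time; check the current schema before relying on them):

```python
# Sketch: query the dbt Cloud Metadata API (GraphQL) for per-model run info.
# Endpoint and schema fields are assumptions, not a guaranteed contract.
import json
import urllib.request

METADATA_URL = "https://metadata.cloud.getdbt.com/graphql"  # assumed endpoint

def build_models_query(job_id: int) -> dict:
    """GraphQL payload asking for per-model execution info for one job."""
    query = """
    query Models($jobId: Int!) {
      models(jobId: $jobId) { uniqueId status executionTime }
    }
    """
    return {"query": query, "variables": {"jobId": job_id}}

def fetch_models(job_id: int, token: str) -> dict:
    """POST the query with a dbt Cloud service token; returns parsed JSON."""
    req = urllib.request.Request(
        METADATA_URL,
        data=json.dumps(build_models_query(job_id)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This is exactly the kind of request an automated tap could run on a schedule to land the same metadata in a warehouse.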

At the same time:

  • We understand why dbt users want to access this information right within their warehouse (dbt models on dbt models!). It’s the tooling that feels comfortable to them!
  • It’s frustrating that dbt-core deployments via Airflow/Dagster/etc have this capability, whereas dbt Cloud users really don’t, outside of the PUT hack on Snowflake

So:

  • We should continue to drill into what use cases/questions are motivating this ask — and pass that information back to the Metadata team as valuable user feedback. One thing we are thinking of is an automated tap for loading from the Metadata API to warehouses.
  • We should look into the possibility of similar “for now” hacks on other databases, which are necessarily super specific to those databases. Niall (from Brooklyn Data Co, maintainer of the dbt_artifacts package) opened issues in dbt-redshift + dbt-bigquery. But this method has its own drawback: any failed step in a dbt Cloud job causes subsequent steps to be skipped, so the PUT hack on Snowflake (and comparable hacks on Redshift/BigQuery) would continue to have the limitation that if a job fails, its artifacts won’t be uploaded.

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Jul 21, 2022
@eozturkTF

Hello
Any news on this please?

Cheers

@github-actions github-actions bot removed the stale Issues that have gone stale label Jul 22, 2022
@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Jan 19, 2023
@github-actions
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned (stale) on Jan 27, 2023