dbt info command for listing information about a model #500

Closed
drewbanin opened this issue Aug 7, 2017 · 8 comments
Labels
enhancement New feature or request

Comments

@drewbanin
Contributor

  • where it lives on disk
  • configs
  • package info
  • ancestors/descendants
@drewbanin drewbanin added the enhancement New feature or request label Aug 7, 2017
@adamhaney

This might need to be an issue of its own, but does dbt currently have any way of knowing where a column came from? Would it make sense to (eventually) also include information about the ancestors/descendants of a given column on a model, or would that be overly complicated or outside the scope of dbt? From my understanding of the code, nodes don't currently appear to carry any information about the columns on models.

FWIW I'd be happy to help on this, since I feel like I'm always the one in Slack asking for stuff; it's about time I pitch in :) I'm still just trying to understand the internals/organization/capabilities of dbt.

@drewbanin
Contributor Author

@adamhaney I've been thinking about this a lot! I'd love it if dbt could provide some provenance information about where a given column comes from, how it's calculated, etc.

I think Postgres/Redshift stores column dependencies for views, so maybe that's something to think about here. In the more general case though, not sure how this could be implemented without doing some sql parsing. This is really intriguing, and I suspect we will get there eventually, but there's really nothing in place to support a feature like that in dbt at present.

As an aside: we need to do a better job of listing outstanding tasks that are well suited to first-time contributors. I'm thinking of something like a "first timers only" tag, or even just a "help us" tag, on issues.

@HarlanH

HarlanH commented Sep 7, 2017

See also #415 and #375...

And yeah, Drew, I've worked on projects with an "up for grabs" tag on issues...

@drewbanin drewbanin added this to the 0.9.1 milestone Nov 10, 2017
@drewbanin drewbanin removed this from the 0.9.1 milestone Dec 14, 2017
@drewbanin
Contributor Author

I think this use case is adequately covered by the documentation site or dbt list. I'm going to close it, but we can re-open it if anyone feels strongly that this subcommand should exist.

@konosp

konosp commented Feb 16, 2020

Hi, it would be great if we could somehow extract a dependency graph/tree programmatically. I am trying to schedule the execution of each individual model through Airflow, which would make it easier to monitor the daily execution of all models in detail. However, the only way at the moment to extract this chain of dependents is to run dbt compile and then parse the run_results.json file. The ancestors/descendants information in particular is very useful for programmatically executing the models one by one in the correct order.

@drewbanin
Contributor Author

Hi @konosp - you can use the following command to emit structured node data without completing a full compilation on your dbt project:

 dbt ls --output=json --resource-type=model

The resulting output will contain one model per line in JSON format. You can parse the depends_on value of each model to build up a DAG.
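
As a rough illustration of that parsing step, here is a minimal Python sketch. It assumes each JSON line carries a `unique_id` key and a `depends_on` object with a `nodes` list; those field names are an assumption about the output, so check them against what your dbt version actually emits. The `build_dag` helper and the sample lines are hypothetical.

```python
import json


def build_dag(lines):
    """Map each model's unique_id to the list of node ids it depends on.

    Assumes every JSON line has "unique_id" and "depends_on": {"nodes": [...]}.
    """
    dag = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON log output
        node = json.loads(line)
        dag[node["unique_id"]] = node.get("depends_on", {}).get("nodes", [])
    return dag


# Hypothetical sample standing in for the captured output of `dbt ls`.
sample = [
    '{"unique_id": "model.demo.stg_orders", "depends_on": {"nodes": []}}',
    '{"unique_id": "model.demo.orders", "depends_on": {"nodes": ["model.demo.stg_orders"]}}',
]
dag = build_dag(sample)
print(dag)
```

In practice the lines could come from the command's captured stdout, for example by redirecting it to a file and passing the open file object to `build_dag`.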

Maybe a better alternative is indeed to run the dbt compile command, but consult the target/manifest.json file instead of the target/run_results.json file. The manifest contains an adjacency list representation of the DAG - try looking for the parent_map entry in the manifest.
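
For the manifest route, a minimal sketch of turning `parent_map` into a run order might look like the following. It assumes `parent_map` is a JSON object mapping each node id to a list of its parent node ids, and that model ids start with `model.`; both are assumptions to verify against your own manifest. The helper names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def load_parent_map(path="target/manifest.json"):
    """Read the parent_map entry from a compiled manifest."""
    with open(path) as f:
        return json.load(f)["parent_map"]


def run_order(parent_map):
    """Return model ids ordered so that every parent precedes its children.

    Non-model nodes (sources, seeds, tests) are dropped; this is a
    simplification for illustration.
    """
    models = {n for n in parent_map if n.startswith("model.")}
    graph = {n: [p for p in parent_map[n] if p in models] for n in models}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical parent_map standing in for a real manifest.
sample = {
    "model.demo.orders": ["model.demo.stg_orders", "model.demo.stg_customers"],
    "model.demo.stg_orders": ["source.demo.raw.orders"],
    "model.demo.stg_customers": [],
}
order = run_order(sample)
print(order)
```

Each id in the resulting list could then be mapped to one scheduler task, with the same parent lists used to declare the task dependencies.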

@konosp

konosp commented Mar 8, 2020

Thanks @drewbanin, ended up using manifest.json for the purposes of my project. Cheers

@ulan-gencap

> Hi @konosp - you can use the following command to emit structured node data without completing a full compilation on your dbt project:
>
>  dbt ls --output=json --resource-type=model
>
> The resulting output will contain one model per line in a json format. You can parse the depends_on value to build up a DAG.
>
> Maybe a better alternative is indeed to run the dbt compile command, but consult the target/manifest.json file instead of the target/run_results.json file. The manifest contains an adjacency list representation of the DAG - try looking for the parent_map entry in the manifest.

As of Oct-2022, per dbt Labs Support, "**dbt list** is not a command available in Cloud that's just for the CLI."

5 participants