Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include external tables in model #130

Closed
brabster opened this issue Jan 31, 2022 · 2 comments
Closed

Include external tables in model #130

brabster opened this issue Jan 31, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@brabster
Copy link

brabster commented Jan 31, 2022

Describe the feature

I'd like to configure an external table as part of my model

Describe alternatives you've considered

I've configured it as a source, which works OK - but when I'm doing development in my own schema I can't have my own version of the external table as well.

Additional context

I'm working with BigQuery at the moment. Not sure if I'm missing something about the thought process here, it seems projecting an external table into my model makes more sense in the models section of my schema.yml so that it just runs when I dbt run and it picks up the schema/dataset I'm working in.

Who will this benefit?

My client and I - unless you can explain why it doesn't make sense as a model, as I say maybe I'm just misthinking something.

@brabster brabster added enhancement New feature or request triage labels Jan 31, 2022
@aPeterHeise
Copy link
Contributor

Hi Brabster,

can you give more background & expectations please? The workflow in DBT is typically that you use sources for data that is loaded by 3rd party systems into your database; and models to reference such ONLY (via refs).

DBT external-tables could be seen as such a 3rd party tool.

So if you want external table support in your model, this would break the above logic.

@jtcohen6 jtcohen6 removed the triage label Feb 28, 2022
@jtcohen6
Copy link
Collaborator

Sorry for the delayed response! You're both right — external tables are a funny in-between. dbt can run the create external table statement for you, into a schema of your choice, which feels a lot like a model. At the same time, the thing that's being created is really just a naive pointer to data loaded by a 3rd party system; it doesn't (shouldn't IMO) contain any opinionated transformation/"business" logic, as a model does.

There's a lengthy conversation about this in the very first issue in the repository (#1); it's lengthy, but it will provide all the context behind the initial decision to treat external tables as sources (metadata + operation macro) instead of models (custom materialization + dbt run).

To be clear, the approach taken in this package is an opinionated one, and it doesn't in any way preempt you from taking a different tact. Many folks have created a custom materialization for external tables so that they will execute during dbt run.

As far as isolating external tables during development, you can use conditional expressions in your table definition like:

sources:
  - name: my_external_src
    schema: "{{ target.schema if target.name == 'dev' else 'my_external_src' }}"

I'm going to close this issue, since it's well-trod ground for this package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants