-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace deprecated adapter methods with relations #133
Conversation
jtcohen6
commented
Apr 26, 2019
- Fixes 0.13.0 compatibility #129
- Closes replaced get_columns_in_table with get_columns_in_relation #132
- Some breaking changes to contracts used by helper macros
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really great! Some small comments here in the interest of
- minimizing end-user confusion as we make these breaking changes
- pulling dbt-utils in line with the database-aware functionality provided by dbt
Using relations
here is a really good decision - this will cause things like union_tables
to automatically become database-aware (with your change). This is especially good because the source()
function returns a Relation, and there are lots of really cool use cases you can imagine here!
So, great work and thanks for opening this PR! Happy to discuss any of my comments in here :)
{%- set include_cols = [] %} | ||
{%- set cols = adapter.get_columns_in_table(schema_name, table_name) -%} | ||
{%- set cols = adapter.get_columns_in_relation(from) -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be merit to adding a check here that returns an error if a non-relation is provided to this macro (and others). This would have the added benefit of raising an error if someone passes an ephemeral model into star
.
Maybe:
{% if from.get('name') is none %}
{% do exceptions.raise_compiler_error("Macro star() expected a Relation but received the value: " ~ from) %}
{% endif %}
or something. We could wrap this up in its own little macro as a helper :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! We might add this error to all of the macros that now expect relation arguments in a break-y sort of way. Two q's:
- Is there a better way to infer the argument type, or is
arg.get('name') != none
the best way to confirm thatarg
is a Relation? - Is there a Jinja method you know of that lets us pass the name of the current macro? Relevant if we were to make
error_if_not_relation(arg)
its own macro, and wanted to pretty print a dynamic compiler error"Macro " ~ my_macro ~ " expected a Relation for argument " ~ arg"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are really good questions!
I don't think we have a great mechanism in place to assert the types of objects from the jinja context. We could conceivably add a context function or a jinja test (like abc is Relation
) but that doesn't exist today :/. I think our best option is going to be to look for a name
attribute. Here's some example code I played around with:
{% set things = [
'abc.def',
api.Relation.create(schema='ghi', identifier='jkl'),
] %}
{% for thing in things %}
{{ log(thing ~ " is relation? " ~ (thing is mapping and thing.name is defined), info=true) }}
{% endfor %}
Let's be sure to test this with dbt-labs/dbt-core#1416
We also don't have a way to get the macro name from the macro context. I think we should just anticipate making a macro to handle raising the exception, then pass in a string to use as the macro name from the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right on. Is an is_relation()
macro something that I should add to utils now, or do you anticipate creating it within the context of dbt proper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - let's add it to dbt-utils! Maybe name it something like _is_relation()
since I wouldn't be comfortable with folks using this externally (yet). If we had some way of locally scoping the macro I'd suggest that, but we can just indicate its private-ness with a naming convention for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly related to the PR you posted: from.name
does return a value, but from.get('name')
does not. The macro fails the Relation test as written above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, we'll probably want to call from.name
then. @beckjake off the top of your head, can you think of a better way to tell if an object is a Relation in the jinja context than obj is mapping and obj.name is not none
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, .name
is a property that actually resolves to .path.get('identifier')
.
I think I'd do obj is mapping and 'type' in obj.get('metadata', {}))
. We should probably add a jinja test for this, or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ew, maybe you have to even check the type string:
obj is mapping and obj.get('metadata', {}).get('type', '').endswith('Relation')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented the most rigorous logic in a macro _is_incremental()
, which returns the following exception:
Compilation Error
Macro star expected a Relation but received the value: hello
> in macro _is_relation (macros/cross_db_utils/_is_relation.sql)
> called by macro star (macros/sql/star.sql)
> called by <Unknown>
@jtcohen6 is this ready for a final review? |
@drewbanin Yessir! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this LGTM! Very nice work here @jtcohen6 - thanks so much for digging way into this! Merge when ready :)
@@ -1,25 +1,25 @@ | |||
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='') %} | |||
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='', database=target.database) %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will work great, but there's an important thing to know about python (and therefore jinja, I assume?) argument defaults. Check out this article on mutable default arguments.
This code is A-OK because target.database
is a string, which is immutable. It would only be a problem if the code read:
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='', databases=[]) %}
or similar.
table_schema || '.' || table_name as ref | ||
from information_schema.tables | ||
table_schema as "table_schema", table_name as "table_name" | ||
from {{database}}.information_schema.tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We spent a lot of time getting this working right with quoting in dbt-core. I think this code is fine for now, but in general, we'll probably want to create some sort of object (analogous to Relations) which encapsulates a database + schema and any applicable quoting configs.
This code won't work if a database is named my-database
on eg. Snowflake. While a user could pass in their database string as "my-database"
, that would cause problems when we create the Relation object in get_tables_by_prefix
. I'm ok with shipping this as-is, but let's keep examples like this in mind as we improve dbt's understanding of databases and quoting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fully agreed. I had tried using the adapter.quote()
method (necessary in the BQ version below, given our BQ test project name), but it caused errors on Snowflake. Noted for our future selves.
@@ -0,0 +1,5 @@ | |||
{% macro _is_relation(obj, macro) %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the intended behavior, but it's important to note: This will raise if the ref
of an ephemeral model is passed in here. The ref()
function returns a string for ephemeral models, vs. a Relation for any non-ephemeral model (or source). I believe all of the macros here that use _is_relation
are expecting a table/view, so that is A-OK!