Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace deprecated adapter methods with relations #133

Merged
merged 4 commits into from
May 8, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,7 @@ from {{ref('my_model')}}
```

#### union_tables ([source](macros/sql/union.sql))
This macro implements an "outer union." The list of tables provided to this macro will be unioned together, and any columns exclusive to a subset of these tables will be filled with `null` where not present. The `column_override` argument is used to explicitly assign the column type for a set of columns. The `source_column_name` argument is used to change the name of the`_dbt_source_table` field.
This macro implements an "outer union." The list of relations provided to this macro will be unioned together, and any columns exclusive to a subset of these tables will be filled with `null` where not present. The `column_override` argument is used to explicitly assign the column type for a set of columns. The `source_column_name` argument is used to change the name of the`_dbt_source_table` field.

Usage:
```
Expand Down
2 changes: 1 addition & 1 deletion integration_tests/models/sql/test_nullcheck_table.sql
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@

with nulled as (

{{ dbt_utils.nullcheck_table(tbl.schema, tbl.name) }}
{{ dbt_utils.nullcheck_table(tbl) }}

)

Expand Down
5 changes: 5 additions & 0 deletions macros/cross_db_utils/_is_relation.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{% macro _is_relation(obj, macro) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the intended behavior, but it's important to note: This will raise if the ref of an ephemeral model is passed in here. The ref() function returns a string for ephemeral models, vs. a Relation for any non-ephemeral model (or source). I believe all of the macros here that use _is_relation are expecting a table/view, so that is A-OK!

{%- if not (obj is mapping and obj.get('metadata', {}).get('type', '').endswith('Relation')) -%}
{%- do exceptions.raise_compiler_error("Macro " ~ macro ~ " expected a Relation but received the value: " ~ obj) -%}
{%- endif -%}
{% endmacro %}
7 changes: 2 additions & 5 deletions macros/schema_tests/equality.sql
Original file line number Diff line number Diff line change
Expand Up @@ -9,11 +9,8 @@
{% endif %}

-- setup

{% set schema = model.schema %}
{% set model_a_name = model.name %}

{% set dest_columns = adapter.get_columns_in_table(schema, model_a_name) %}
{%- do dbt_utils._is_relation(model, 'test_equality') -%}
{% set dest_columns = adapter.get_columns_in_relation(model) %}
{% set dest_cols_csv = dest_columns | map(attribute='quoted') | join(', ') %}

with a as (
Expand Down
19 changes: 12 additions & 7 deletions macros/sql/get_tables_by_prefix.sql
Original file line number Diff line number Diff line change
@@ -1,16 +1,21 @@
{% macro get_tables_by_prefix(schema, prefix, exclude='') %}
{% macro get_tables_by_prefix(schema, prefix, exclude='', database=target.database) %}

{%- call statement('tables', fetch_result=True) %}
{%- call statement('get_tables', fetch_result=True) %}

{{ dbt_utils.get_tables_by_prefix_sql(schema, prefix, exclude) }}
{{ dbt_utils.get_tables_by_prefix_sql(schema, prefix, exclude, database) }}

{%- endcall -%}

{%- set table_list = load_result('tables') -%}
{%- set table_list = load_result('get_tables') -%}

{%- if table_list and table_list['data'] -%}
{%- set tables = table_list['data'] | map(attribute=0) | list %}
{{ return(tables) }}
{%- if table_list and table_list['table'] -%}
{%- set tbl_relations = [] -%}
{%- for row in table_list['table'] -%}
{%- set tbl_relation = api.Relation.create(database, row.table_schema, row.table_name) -%}
{%- do tbl_relations.append(tbl_relation) -%}
{%- endfor -%}

{{ return(tbl_relations) }}
{%- else -%}
{{ return([]) }}
{%- endif -%}
Expand Down
14 changes: 7 additions & 7 deletions macros/sql/get_tables_by_prefix_sql.sql
Original file line number Diff line number Diff line change
@@ -1,25 +1,25 @@
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='') %}
{% macro get_tables_by_prefix_sql(schema, prefix, exclude='', database=target.database) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will work great, but there's an important thing to know about python (and therefore jinja, I assume?) argument defaults. Check out this article on mutable default arguments.

This code is A-OK because target.database is a string, which is immutable. It would only be a problem if the code read:

{% macro get_tables_by_prefix_sql(schema, prefix, exclude='', databases=[]) %}

or similar.

{{ adapter_macro('dbt_utils.get_tables_by_prefix_sql', schema, prefix, exclude) }}
{% endmacro %}

{% macro default__get_tables_by_prefix_sql(schema, prefix, exclude='') %}
{% macro default__get_tables_by_prefix_sql(schema, prefix, exclude='', database=target.database) %}

select distinct
table_schema || '.' || table_name as ref
from information_schema.tables
table_schema as "table_schema", table_name as "table_name"
jtcohen6 marked this conversation as resolved.
Show resolved Hide resolved
from {{database}}.information_schema.tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We spent a lot of time getting this working right with quoting in dbt-core. I think this code is fine for now, but in general, we'll probably want to create some sort of object (analogous to Relations) which encapsulates a database + schema and any applicable quoting configs.

This code won't work if a database is named my-database on eg. Snowflake. While a user could pass in their database string as "my-database", that would cause problems when we create the Relation object in get_tables_by_prefix. I'm ok with shipping this as-is, but let's keep examples like this in mind as we improve dbt's understanding of databases and quoting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agreed. I had tried using the adapter.quote() method (necessary in the BQ version below, given our BQ test project name), but it caused errors on Snowflake. Noted for our future selves.

where table_schema = '{{ schema }}'
and table_name ilike '{{ prefix }}%'
and table_name not ilike '{{ exclude }}'

{% endmacro %}


{% macro bigquery__get_tables_by_prefix_sql(schema, prefix, exclude='') %}
{% macro bigquery__get_tables_by_prefix_sql(schema, prefix, exclude='', database=target.database) %}

select distinct
concat(dataset_id, '.', table_id) as ref
dataset_id as table_schema, table_id as table_name

from {{schema}}.__TABLES_SUMMARY__
from {{adapter.quote(database)}}.{{schema}}.__TABLES_SUMMARY__
where dataset_id = '{{schema}}'
and lower(table_id) like lower ('{{prefix}}%')
and lower(table_id) not like lower ('{{exclude}}')
Expand Down
7 changes: 4 additions & 3 deletions macros/sql/nullcheck_table.sql
Original file line number Diff line number Diff line change
@@ -1,8 +1,9 @@
{% macro nullcheck_table(schema, table) %}
{% macro nullcheck_table(relation) %}

{% set cols = adapter.get_columns_in_table(schema, table) %}
{%- do dbt_utils._is_relation(relation, 'nullcheck_table') -%}
{% set cols = adapter.get_columns_in_relation(relation) %}

select {{ dbt_utils.nullcheck(cols) }}
from {{schema}}.{{table}}
from {{relation}}

{% endmacro %}
10 changes: 3 additions & 7 deletions macros/sql/star.sql
Original file line number Diff line number Diff line change
@@ -1,18 +1,14 @@
{% macro star(from, relation_alias=False, except=[]) -%}

{%- do dbt_utils._is_relation(from, 'star') -%}

{#-- Prevent querying of db in parsing mode. This works because this macro does not create any new refs. #}
{%- if not execute -%}
{{ return('') }}
{% endif %}

{%- if from.name -%}
{%- set schema_name, table_name = from.schema, from.name -%}
{%- else -%}
{%- set schema_name, table_name = (from | string).split(".") -%}
{%- endif -%}

{%- set include_cols = [] %}
{%- set cols = adapter.get_columns_in_table(schema_name, table_name) -%}
{%- set cols = adapter.get_columns_in_relation(from) -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be merit to adding a check here that returns an error if a non-relation is provided to this macro (and others). This would have the added benefit of raising an error if someone passes an ephemeral model into star.

Maybe:

{% if from.get('name') is none %}
  {% do exceptions.raise_compiler_error("Macro star() expected a Relation but received the value: " ~ from) %}
{% endif %}

or something. We could wrap this up in its own little macro as a helper :)

Copy link
Contributor Author

@jtcohen6 jtcohen6 Apr 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! We might add this error to all of the macros that now expect relation arguments in a break-y sort of way. Two q's:

  • Is there a better way to infer the argument type, or is arg.get('name') != none the best way to confirm that arg is a Relation?
  • Is there a Jinja method you know of that lets us pass the name of the current macro? Relevant if we were to make error_if_not_relation(arg) its own macro, and wanted to pretty print a dynamic compiler error "Macro " ~ my_macro ~ " expected a Relation for argument " ~ arg"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are really good questions!

I don't think we have a great mechanism in place to assert the types of objects from the jinja context. We could conceivably add a context function or a jinja test (like abc is Relation) but that doesn't exist today :/. I think our best option is going to be to look for a name attribute. Here's some example code I played around with:

{% set things = [
    'abc.def',
    api.Relation.create(schema='ghi', identifier='jkl'),
] %}


{% for thing in things %}
    {{ log(thing ~ " is relation? " ~ (thing is mapping and thing.name is defined), info=true) }}
{% endfor %}

Let's be sure to test this with dbt-labs/dbt-core#1416

We also don't have a way to get the macro name from the macro context. I think we should just anticipate making a macro to handle raising the exception, then pass in a string to use as the macro name from the caller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right on. Is an is_relation() macro something that I should add to utils now, or do you anticipate creating it within the context of dbt proper?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - let's add it to dbt-utils! Maybe name it something like _is_relation() since I wouldn't be comfortable with folks using this externally (yet). If we had some way of locally scoping the macro I'd suggest that, but we can just indicate its private-ness with a naming convention for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly related to the PR you posted: from.name does return a value, but from.get('name') does not. The macro fails the Relation test as written above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, we'll probably want to call from.name then. @beckjake off the top of your head, can you think of a better way to tell if an object is a Relation in the jinja context than obj is mapping and obj.name is not none?

Copy link
Contributor

@beckjake beckjake Apr 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, .name is a property that actually resolves to .path.get('identifier').

I think I'd do obj is mapping and 'type' in obj.get('metadata', {})). We should probably add a jinja test for this, or something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ew, maybe you have to even check the type string:

obj is mapping and obj.get('metadata', {}).get('type', '').endswith('Relation')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented the most rigorous logic in a macro _is_incremental(), which returns the following exception:

Compilation Error
  Macro star expected a Relation but received the value: hello

  > in macro _is_relation (macros/cross_db_utils/_is_relation.sql)
  > called by macro star (macros/sql/star.sql)
  > called by <Unknown>

{%- for col in cols -%}

{%- if col.column not in except -%}
Expand Down
9 changes: 2 additions & 7 deletions macros/sql/union.sql
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,8 @@

{%- set _ = table_columns.update({table: []}) %}

{%- if table.name -%}
{%- set schema, table_name = table.schema, table.name -%}
{%- else -%}
{%- set schema, table_name = (table | string).split(".") -%}
{%- endif -%}

{%- set cols = adapter.get_columns_in_table(schema, table_name) %}
{%- do dbt_utils._is_relation(table, 'union_tables') -%}
{%- set cols = adapter.get_columns_in_relation(table) %}
{%- for col in cols -%}

{%- if col.column not in exclude %}
Expand Down
11 changes: 3 additions & 8 deletions macros/sql/unpivot.sql
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ Pivot values from columns to rows.
Example Usage: {{ dbt_utils.unpivot(table=ref('users'), cast_to='integer', exclude=['id','created_at']) }}

Arguments:
table: Table name, required.
table: Relation object, required.
cast_to: The datatype to cast all unpivoted columns to. Default is varchar.
exclude: A list of columns to exclude from the unpivot operation. Default is none.
#}
Expand All @@ -19,13 +19,8 @@ Arguments:

{%- set _ = table_columns.update({table: []}) %}

{%- if table.name -%}
{%- set schema, table_name = table.schema, table.name -%}
{%- else -%}
{%- set schema, table_name = (table | string).split(".") -%}
{%- endif -%}

{%- set cols = adapter.get_columns_in_table(schema, table_name) %}
{%- do dbt_utils._is_relation(table, 'unpivot') -%}
{%- set cols = adapter.get_columns_in_relation(table) %}

{%- for col in cols -%}
{%- if col.column.lower() not in exclude|map('lower') -%}
Expand Down