Releases: dbt-labs/dbt-utils

1.3.0

28 Aug 18:32
e164c64

What's Changed

  • Remove "I have added an entry to CHANGELOG.md" from the PR template by @dbeatty10 in #903
  • Add a PR checklist item for "I have read the contributing guide..." as a catch-all by @dbeatty10 in #905
  • Simplify the PR checklist in relation to testing by @dbeatty10 in #907
  • Remove the PR checklist items related to the type of change by @dbeatty10 in #909
  • Align the PR description with dbt-core, dbt-adapters, etc. by @dbeatty10 in #911
  • Slugify handle empty strings by @dbeatty10 in #912
  • Contributors shouldn't edit the CHANGELOG.md directly anymore by @dbeatty10 in #916
  • Contributing guide instructions for allowing commits from maintainers by @dbeatty10 in #917
  • Move profiles config to project flags by @dbeatty10 in #926
  • Link to tests that support group_by_columns by @dbeatty10 in #931
  • Expand description for the at_least_one data test by @dbeatty10 in #933
  • Update not_null_proportion data test to use cross-database type_numeric() macro by @henriblancke in #800
  • Fix at_least_one test when group_by_columns is configured by @katieclaiborne-duet in #922
  • Delete the old unused logo file by @anks2024 in #936
  • Add quote_identifiers parameter to unpivot to handle case-sensitive column names by @error418 in #792
  • Add tox by @emmyoop in #919
  • Update release instructions by @dbeatty10 in #942

Full Changelog: 1.2.0...1.3.0

1.2.0

04 Jun 15:45
85ade29

What's Changed

New features

  • Add option to ignore columns in equality test by @brunocostalopes in #765
  • The equality test now accepts an additional argument, precision, to aid in comparing floating-point numbers by @rlh1994 in #765
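
As a rough illustration of the two options above, an equality test might be configured along these lines. The file name, model names, and column name are hypothetical, and the precision value is arbitrary; check the package README for the exact semantics of each argument.

```yaml
# models/schema.yml (hypothetical file, model, and column names)
version: 2

models:
  - name: orders_v2
    tests:
      - dbt_utils.equality:
          compare_model: ref('orders_v1')
          # Columns to leave out of the comparison
          exclude_columns:
            - loaded_at
          # Helps when comparing floating-point columns; the value 4 is only an example
          precision: 4
```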

Fixes

  • The deduplicate macro for Databricks now uses the QUALIFY clause, which fixes issues with NULL columns caused by the default natural join logic by @graciegoheen in #786
  • Use QUALIFY clause in deduplicate macro for Redshift by @yauhen-sobaleu in #811
  • Get Redshift external tables by @brendan-cook-87 in #753
  • The equality test now raises an error when the second model has fewer columns than the first by @rlh1994 in #765

Documentation

  • Update documentation for get_column_values() to specify that the order_by argument must be expressed as an aggregate function by @bakerbryce in #872
  • Set the correct language identifier in code blocks within the documentation by @yamotech in #876
  • Fix typo of not_null_proportion in README.md by @PChambino in #853
  • Fix failing example for dbt_utils.deduplicate() in README.md by @pruoff in #856
  • Link to Haversine Distance article on Wikipedia by @dbeatty10 in #889

Full Changelog: 1.1.1...1.2.0

1.2.0-rc1

25 Apr 19:47
4469239
Pre-release

What's Changed

New features

  • add precision + exclude_columns option to equality test by @rlh1994 in #765

Fixes

  • Get Redshift external tables by @brendan-cook-87 in #753
  • The deduplicate macro for Databricks now uses the QUALIFY clause, which fixes issues with NULL columns caused by the default natural join logic by @graciegoheen in #786
  • Use QUALIFY clause in deduplicate macro for Redshift by @yauhen-sobaleu in #811

Documentation updates

  • Typo of not_null_proportion in README.md by @PChambino in #853
  • FIX: Failing example for dbt_utils.deduplicate() in README.md by @pruoff in #856
  • Update README.md by @bakerbryce in #872
  • Set the correct language identifier in the code block of the document by @yamotech in #876

Full Changelog: 1.1.1...1.2.0-rc1

1.1.1

06 Jun 18:36
74a661c

What's Changed

Full Changelog: 1.1.0...1.1.1

1.1.0

01 May 23:35
965726e

What's Changed

New functionality

Documentation

Behind the scenes

Full Changelog: 1.0.0...1.1.0

1.0.0

06 Dec 01:03

Migration guide: https://docs.getdbt.com/guides/migration/versions/upgrading-to-dbt-utils-v1.0

Breaking changes:

🚨 surrogate_key() has been replaced by generate_surrogate_key(). You have to make a decision🚨

The original surrogate_key() macro treated null values and blank strings the same, which could lead to duplicate keys being created. generate_surrogate_key() does not have this flaw. If needed, you can opt into the legacy behaviour by setting the following variable in your dbt project:

#dbt_project.yml
vars:
  surrogate_key_treat_nulls_as_empty_strings: true #turn on legacy behavior

We recommend that existing users opt into the legacy behaviour unless they are confident that either:

  • their surrogate keys never contained nulls, or
  • their surrogate keys are not used for incremental models, snapshots or other stateful artifacts, and so can be regenerated with new values without issue.

dbt_utils.current_timestamp() is replaced by dbt.current_timestamp()

The Postgres and Snowflake implementations of dbt.current_timestamp() differ from the old dbt_utils one (full details here). If you use Postgres or Snowflake and need identical, backwards-compatible behaviour, use dbt.current_timestamp_backcompat().

Cross-db macros

All other cross-db macros have moved to the dbt namespace. The only change needed is to replace the "dbt_utils." prefix with "dbt." in each call. Review the cross-database macros documentation for the full list, or the migration guide for a find-and-replace regex.

Other functionality now native to dbt Core:

  • The expression_is_true test no longer has a dedicated condition argument. Instead, use the where config, which is natively available to all tests
  • For the same reason, the deprecated unique_where and not_null_where tests have been removed
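
A minimal sketch of the replacement pattern, assuming a hypothetical orders model and hypothetical column names: the filter that used to go in the condition argument moves into the test's where config.

```yaml
# models/schema.yml (hypothetical model and column names)
version: 2

models:
  - name: orders
    tests:
      - dbt_utils.expression_is_true:
          expression: "subtotal + tax = total"
          config:
            # Only rows matching this filter are tested
            where: "created_at > '2018-12-31'"
```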

insert_by_period removed

The insert_by_period materialization has been moved to the experimental-features repo. To continue using it, add the following to your packages.yml file:

packages:
  - git: https://github.com/dbt-labs/dbt-labs-experimental-features
    subdirectory: insert_by_period
    revision: XXXX #optional but highly recommended. Provide a full git sha hash, e.g. 1c0bfacc49551b2e67d8579cf8ed459d68546e00. If not provided, uses the current HEAD.

New features

  • get_single_value(): An easy way to pull a single value from a SQL query, instead of having to access the [0][0]th element of a run_query result.
  • safe_divide(): Returns null when the denominator is 0, instead of throwing a divide-by-zero error.
  • New not_empty_string test: A simpler alternative to using expression_is_true to check the length of a column.
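
For example, the new test could be attached to a column roughly as follows. The model and column names are hypothetical, and the test is shown with no optional arguments.

```yaml
# models/schema.yml (hypothetical model and column names)
version: 2

models:
  - name: customers
    columns:
      - name: email
        tests:
          # Checks that the column does not contain empty strings
          - dbt_utils.not_empty_string
```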

Enhancements

  • Many tests are more meaningful when you run them against subgroups of a table. For example, you may need to validate that recent data exists for every turnstile instead of a single data source being sufficient. Add the new group_by_columns argument to your tests to do so. Review this article by the test's author for more information.
  • With the addition of an on-by-default quote_identifiers flag in the star() macro, you can now disable quoting if necessary.
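
To make the turnstile example concrete, a grouped recency test might look something like this. The model and column names are hypothetical, and the one-day interval is only an illustration.

```yaml
# models/schema.yml (hypothetical model and column names)
version: 2

models:
  - name: turnstile_events
    tests:
      - dbt_utils.recency:
          datepart: day
          field: recorded_at
          interval: 1
          # Evaluate recency separately for each turnstile rather than for the table as a whole
          group_by_columns: ['turnstile_id']
```

Without group_by_columns, one active turnstile would be enough for the test to pass even if the others had stopped reporting.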

Fixes

  • union() now includes/excludes columns case-insensitively
  • slugify() prefixes an underscore when the first char is a digit
  • The expression_is_true test doesn’t output * unless storing failures, a cost improvement for BigQuery.

Full Changelog: 0.9.6...1.0.0

1.0.0-rc1

29 Nov 05:51
Pre-release

This is the first release candidate for dbt utils 1.0. A full migration guide will accompany the final release, but here is the changelog:

New

  • New macro: get_single_value()
  • New macro: safe_divide()
  • New test: not_empty_string

Enhancements

  • group_by_columns in some tests
  • Able to prevent quote_identifiers in star()

Fixes

  • union now includes/excludes columns case-insensitively
  • slugify prefixes an underscore when the first char is a digit
  • expression_is_true doesn’t output * unless storing failures
  • star() only returns a * when no columns are found if running under the dbt compile command or in execute=false mode. Returning * in these contexts allows SQLFluff's linting to work, but returning nothing under standard conditions ensures that you don't accidentally pull all your columns if you meant to exclude them all.
  • recency test can now be configured to ignore the time component, useful when your tested column is a date instead of a datetime.
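
A sketch of the recency change on a date column. The argument name ignore_time_component is an assumption here, since the note above does not name it; the model and column names are hypothetical.

```yaml
# models/schema.yml (hypothetical model and column names)
version: 2

models:
  - name: daily_sales
    tests:
      - dbt_utils.recency:
          datepart: day
          field: sale_date
          interval: 1
          # Assumed argument name: compare dates only, ignoring the time of day
          ignore_time_component: true
```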

Deprecated

Changing the calculation method for surrogate keys, even for the better, could have significant consequences in downstream uses (such as snapshots and incremental models which use this column as their unique_key). Because of this, it is possible to opt in to the legacy behaviour by setting the following variable in your dbt project:

#dbt_project.yml
vars:
  surrogate_key_treat_nulls_as_empty_strings: true #turn on legacy behaviour

By creating a new macro instead of updating the behaviour of the old one, we are requiring all projects that use this macro to make an explicit decision about which approach is better for their context. We recommend that existing users opt into the legacy behaviour unless they are confident that either a) their surrogate keys never contained nulls, or b) their surrogate keys are not used for incremental models, snapshots or other stateful artifacts, and so can be regenerated with new values without issue.

Warning to package maintainers: you cannot assume one behaviour or the other, as it can be enabled or disabled by the end user.

  • Deprecation: current_timestamp() replaced by dbt.current_timestamp(). Note that the Postgres and Snowflake implementations of current_timestamp() differ between the old dbt_utils version and the new dbt version (full details here). For those adapters, use dbt.current_timestamp_backcompat() if you need identical, backwards-compatible behaviour. This discrepancy will hopefully be reconciled in some future version of dbt Core (maybe 2.0?)
  • All other cross-db macros have moved to dbt namespace with no changes necessary other than swapping out dbt_utils. for dbt.; see https://docs.getdbt.com/reference/dbt-jinja-functions/cross-database-macros
  • insert_by_period materialization moved to experimental repo
  • safe_add() now only works with a list of arguments
  • deduplicate() changes:
    • group_by argument is removed, replaced by partition_by
    • relation_alias is removed, replaced by passing the alias directly to the relation argument
    • order_by is now mandatory. Pass a static value like 1 if you don’t care how they are deduplicated
  • table argument has been removed from unpivot. Use relation instead

Full Changelog: 0.9.6...1.0.0-rc1

0.9.6

28 Nov 22:43

Replace reference to dbt_utils.escape_single_quotes with dbt.escape_single_quotes in pivot macro

0.9.5

22 Nov 19:18
7c11123

What's Changed

  • Stop showing cross-db deprecation warnings in already-migrated macros by @joellabes in #725

NB: 0.9.3 and 0.9.4 were removed due to compatibility issues.

Full Changelog: 0.9.2...0.9.5

1.0.0-b2

04 Oct 05:40
Pre-release

Small tweaks

Breaking Changes

  • Limit columns selected in expression_is_true when not storing failures to the database, to improve BQ resource consumption. by @elyobo in #686
  • Remove dbt_utils.current_timestamp(), and replace internal usages of dbt_utils.current_timestamp() with dbt.current_timestamp_backcompat() from dbt Core. This keeps behaviour consistent with old versions of dbt utils, but brings all of the weirdness into one place (dbt Core). By @colin-rogers-dbt in #694
  • 🚨 🚨 🚨 Replace surrogate_key with generate_surrogate_key, which treats null and '' differently by @dave-connors-3 in #685. Opt into the legacy behaviour by creating a variable like this in dbt_project.yml:
# dbt_project.yml
...
vars:
  dbt_utils:
    surrogate_key_treat_nulls_as_empty_strings: True

Full Changelog: 1.0.0-b1...1.0.0-b2