add docs for format: sql unit testing #5281

Merged
merged 9 commits into from
Apr 17, 2024
22 changes: 20 additions & 2 deletions website/docs/docs/build/unit-tests.md
@@ -116,9 +116,9 @@ unit_tests:
```
</file>

The previous example defines the mock data using the inline `dict` format, but you can also use `csv` either inline or in a separate fixture file.
The previous example defines the mock data using the inline `dict` format, but you can also use `csv` or `sql` either inline or in a separate fixture file.

You only have to define the mock data for the columns you care about. This enables you to write succinct and _specific_ unit tests.
When using the `dict` or `csv` format, you only have to define the mock data for the columns relevant to you. This enables you to write succinct and _specific_ unit tests.
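
As a minimal sketch of the point above, the following test mocks only the columns it needs with the `dict` format. The test name, the `my_model` and `my_model_a` names, and the assumption that the model passes `id` and `name` through unchanged are hypothetical, chosen only for illustration:

```yml
unit_tests:
  - name: test_my_model_partial_columns  # hypothetical test name
    model: my_model                       # hypothetical model
    given:
      - input: ref('my_model_a')
        # dict format: other columns of my_model_a (for example loaded_at)
        # are left out because this test does not depend on them
        rows:
          - {id: 1, name: gerda}
          - {id: 2, name: michelle}
    expect:
      # assumed output: my_model passes id and name through unchanged
      rows:
        - {id: 1, name: gerda}
        - {id: 2, name: michelle}
```

Keeping the mock rows this narrow makes it clear which columns the logic under test actually reads.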

:::note

@@ -279,6 +279,24 @@ unit_tests:

There is currently no way to unit test whether the dbt framework inserted/merged the records into your existing model correctly, but [we're investigating support for this in the future](https://github.com/dbt-labs/dbt-core/issues/8664).

## Unit testing a model that depends on ephemeral model(s)

If you want to unit test a model that depends on an ephemeral model, you must use `format: sql` for that input.

```yml
unit_tests:
  - name: my_unit_test
    model: dim_customers
    given:
      - input: ref('ephemeral_model')
        format: sql
        rows: |
          select 1 as id, 'emily' as name
    expect:
      rows:
        - {id: 1, first_name: emily}
```

## Additional resources

- [Unit testing reference page](/reference/resource-properties/unit-tests)
49 changes: 46 additions & 3 deletions website/docs/reference/resource-properties/data-formats.md
@@ -3,12 +3,13 @@ title: "Supported data formats for unit tests"
sidebar_label: "Data formats"
---

Currently, mock data for unit testing in dbt supports two formats:
Currently, mock data for unit testing in dbt supports three formats:

- `dict` (default): Inline dictionary values.
- `csv`: Inline CSV values or a CSV file.
- `sql`: Inline SQL query or a SQL file. Note: For this format you must supply mock data for _all columns_.

We will support more in the future, so watch our [upgrade guides](/docs/dbt-versions/core-upgrade) and this page for updates.
## dict

The `dict` data format is the default if no `format` is defined.

@@ -28,6 +29,8 @@ unit_tests:

```

## csv

When using the `csv` format, you can use either an inline CSV string for `rows`:

```yml
@@ -57,4 +60,44 @@ unit_tests:
format: csv
fixture: my_model_a_fixture

```

## sql

Using this format:
- Provides more flexibility for the types of data you can unit test
- Allows you to unit test a model that depends on an ephemeral model

However, when using `format: sql` you must supply mock data for _all columns_.

When using the `sql` format, you can use either an inline SQL query for `rows`:

```yml

unit_tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: sql
        rows: |
          select 1 as id, 'gerda' as name, null as loaded_at union all
          select 2 as id, 'michelle' as name, null as loaded_at

```

Or, you can provide the name of a SQL file in the `tests/fixtures` directory (or the configured `test-paths` location) of your project for `fixture`:

```yml

unit_tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: sql
        fixture: my_model_a_fixture

```

**Note:** Jinja is unsupported in SQL fixtures for unit tests.
44 changes: 33 additions & 11 deletions website/docs/reference/resource-properties/unit-tests.md
@@ -17,6 +17,7 @@ To run only your unit tests, use the command:
- We currently only support adding unit tests to models in your _current_ project.
- If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](#unit-testing-versioned-models) for more information.
- Unit tests must be defined in a YML file in your `models/` directory.
- If you want to unit test a model that depends on an ephemeral model, you must use `format: sql` for that input.

<file name='dbt_project.yml'>

@@ -33,22 +34,20 @@ unit_tests:
tags: <string> | [<string>]
given:
- input: <ref_or_source_call> # optional for seeds
format: dict | csv
# if format csv, either define dictionary of rows or name of fixture
rows:
- {dictionary}
fixture: <fixture-name>
format: dict | csv | sql
# either define rows inline or name of fixture
rows: {dictionary} | <string>
fixture: <fixture-name> # sql or csv
- input: ... # declare additional inputs
expect:
format: dict | csv
# if format csv, either define dictionary of rows or name of fixture
rows:
- {dictionary}
fixture: <fixture-name>
format: dict | csv | sql
# either define rows inline or name of fixture
rows: {dictionary} | <string>
fixture: <fixture-name> # sql or csv
overrides: # optional: configuration for the dbt execution environment
macros:
is_incremental: true | false
dbt_utils.current_timestamp: str
dbt_utils.current_timestamp: <string>
# ... any other jinja function from https://docs.getdbt.com/reference/dbt-jinja-functions
# ... any other context property
vars: {dictionary}
@@ -109,3 +108,26 @@ unit_tests:
fixture: valid_email_address_fixture_output

```

```yml

unit_tests:
  - name: test_is_valid_email_address # this is the unique name of the test
    model: dim_customers # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: ref('stg_customers')
        rows:
          - {email: [email protected], email_top_level_domain: example.com}
          - {email: [email protected], email_top_level_domain: unknown.com}
          - {email: badgmail.com, email_top_level_domain: gmail.com}
          - {email: missingdot@gmailcom, email_top_level_domain: gmail.com}
      - input: ref('top_level_email_domains')
        format: sql
        rows: |
          select 'example.com' as tld union all
          select 'gmail.com' as tld
    expect: # the expected output given the inputs above
      format: sql
      fixture: valid_email_address_fixture_output

```