diff --git a/website/docs/docs/build/snapshots.md b/website/docs/docs/build/snapshots.md index f1fbc3046d1..f72f1eb75de 100644 --- a/website/docs/docs/build/snapshots.md +++ b/website/docs/docs/build/snapshots.md @@ -10,8 +10,7 @@ id: "snapshots" * [Snapshot properties](/reference/snapshot-properties) * [`snapshot` command](/reference/commands/snapshot) - -### What are snapshots? +## What are snapshots? Analysts often need to "look back in time" at previous data states in their mutable tables. While some source data systems are built in a way that makes accessing historical data possible, this is not always the case. dbt provides a mechanism, **snapshots**, which records changes to a mutable table over time. Snapshots implement [type-2 Slowly Changing Dimensions](https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row) over mutable source tables. These Slowly Changing Dimensions (or SCDs) identify how a row in a table changes over time. Imagine you have an `orders` table where the `status` field can be overwritten as the order is processed. 
@@ -64,9 +63,9 @@ snapshots: [unique_key](/reference/resource-configs/unique_key): column_name_or_expression [check_cols](/reference/resource-configs/check_cols): [column_name] | all [updated_at](/reference/resource-configs/updated_at): column_name - [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes): true | false [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): dictionary [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): string + [hard_deletes](/reference/resource-configs/hard-deletes): ignore | invalidate | new_record ``` @@ -82,9 +81,9 @@ The following table outlines the configurations available for snapshots: | [unique_key](/reference/resource-configs/unique_key) | A column(s) (string or array) or expression for the record | Yes | `id` or `[order_id, product_id]` | | [check_cols](/reference/resource-configs/check_cols) | If using the `check` strategy, then the columns to check | Only if using the `check` strategy | ["status"] | | [updated_at](/reference/resource-configs/updated_at) | If using the `timestamp` strategy, the timestamp column to compare | Only if using the `timestamp` strategy | updated_at | -| [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) | Find hard deleted records in source and set `dbt_valid_to` to current time if the record no longer exists | No | True | | [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current) | Set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. 
When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.| No | string | | [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names) | Customize the names of the snapshot meta fields | No | dictionary | +| [hard_deletes](/reference/resource-configs/hard-deletes) | Specify how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`.| No | string | - In versions prior to v1.9, the `target_schema` (required) and `target_database` (optional) configurations defined a single schema or database to build a snapshot across users and environment. This created problems when testing or developing a snapshot, as there was no clear separation between development and production environments. In v1.9, `target_schema` became optional, allowing snapshots to be environment-aware. By default, without `target_schema` or `target_database` defined, snapshots now use the `generate_schema_name` or `generate_database_name` macros to determine where to build. Developers can still set a custom location with [`schema`](/reference/resource-configs/schema) and [`database`](/reference/resource-configs/database) configs, consistent with other resource types. @@ -216,10 +215,14 @@ When you run the [`dbt snapshot` command](/reference/commands/snapshot): - The `dbt_valid_to` column will be updated for any existing records that have changed. - The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null` or the value configured in `dbt_valid_to_current` (available in dbt Core v1.9+). + + #### Note - These column names can be customized to your team or organizational conventions using the [snapshot_meta_column_names](#snapshot-meta-fields) config. 
- Use the `dbt_valid_to_current` config to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. - +- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to control how rows deleted from the source are handled, for example by adding a new record when a row becomes "deleted" in the source. Supported options are `ignore`, `invalidate`, and `new_record`. + + Snapshots can be referenced in downstream models the same way as referencing models — by using the [ref](/reference/dbt-jinja-functions/ref) function. ## Detecting row changes @@ -295,7 +298,7 @@ The `check` snapshot strategy can be configured to track changes to _all_ column ::: -**Example Usage** +**Example usage** @@ -345,15 +348,64 @@ snapshots: ### Hard deletes (opt-in) + + +In dbt v1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config to give you more control over how to handle deleted rows from the source. The `hard_deletes` config is not a separate strategy but an additional opt-in feature that can be used with any snapshot strategy. + +The `hard_deletes` config has three options: +| Option | Description | +| --------- | ----------- | +| `ignore` (default) | No action for deleted records. | +| `invalidate` | Behaves the same as the existing `invalidate_hard_deletes=true`, where deleted records are invalidated by setting `dbt_valid_to`. 
| +| `new_record` | Tracks deleted records as new rows using the `dbt_is_deleted` [meta field](#snapshot-meta-fields) when records are deleted.| + +import HardDeletes from '/snippets/_hard-deletes.md'; + + + +#### Example usage + + + +```yaml +snapshots: + - name: orders_snapshot_hard_delete + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: timestamp + updated_at: updated_at + hard_deletes: new_record # options are: 'ignore', 'invalidate', or 'new_record' +``` + + + +In this example, the `hard_deletes: new_record` config will add a new row for deleted records with the `dbt_is_deleted` column set to `True`. +Any restored records are added as new rows with the `dbt_is_deleted` field set to `False`. + +The resulting table will look like this: + +| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_is_deleted | +| -- | ------ | ---------- | -------------- | ------------ | -------------- | +| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | False | +| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | False | +| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | 2024-01-01 12:00 | True | +| 1 | restored | 2024-01-01 12:00 | 2024-01-01 12:00 | | False | + + + + + Rows that are deleted from the source query are not invalidated by default. With the config option `invalidate_hard_deletes`, dbt can track rows that no longer exist. This is done by left joining the snapshot table with the source table, and filtering the rows that are still valid at that point, but no longer can be found in the source table. `dbt_valid_to` will be set to the current snapshot time. This configuration is not a different strategy as described above, but is an additional opt-in feature. It is not enabled by default since it alters the previous behavior. For this configuration to work with the `timestamp` strategy, the configured `updated_at` column must be of timestamp type. 
Otherwise, queries will fail due to mixing data types. -**Example Usage** +Note: In v1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config for better control over how to handle deleted rows from the source. - +#### Example usage @@ -379,31 +431,16 @@ For this configuration to work with the `timestamp` strategy, the configured `up - - - - -```yaml -snapshots: - - name: orders_snapshot_hard_delete - relation: source('jaffle_shop', 'orders') - config: - schema: snapshots - unique_key: id - strategy: timestamp - updated_at: updated_at - invalidate_hard_deletes: true -``` - - - - - ## Snapshot meta-fields Snapshot tables will be created as a clone of your source dataset, plus some additional meta-fields*. -In dbt Core v1.9+ (or available sooner in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks)), these column names can be customized to your team or organizational conventions via the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config. +In dbt Core v1.9+ (or available sooner in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks)): +- These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config. +- Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. 
+- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to track deleted records as new rows with the `dbt_is_deleted` meta field by setting `hard_deletes='new_record'`. + | Field | Meaning | Usage | | -------------- | ------- | ----- | @@ -411,6 +448,7 @@ In dbt Core v1.9+ (or available sooner in [the "Latest" release track in dbt Clo | dbt_valid_to | The timestamp when this row became invalidated.
For current records, this is `NULL` by default or the value specified in `dbt_valid_to_current`. | The most recent snapshot record will have `dbt_valid_to` set to `NULL` or the specified value. | | dbt_scd_id | A unique key generated for each snapshotted record. | This is used internally by dbt | | dbt_updated_at | The updated_at timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt | +| dbt_is_deleted | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured. This is used internally by dbt | *The timestamps used for each column are subtly different depending on the strategy you use: @@ -444,6 +482,15 @@ Snapshot results (note that `11:30` is not used anywhere): | 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | | 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | | 2024-01-01 11:05 | +Snapshot results with `hard_deletes='new_record'`: + +| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | +|----|---------|------------------|------------------|------------------|------------------|----------------| +| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | False | +| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | 2024-01-01 11:05 | False | +| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | | 2024-01-01 11:20 | True | + +
@@ -478,6 +525,14 @@ Snapshot results: | 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | | 1 | shipped | 2024-01-01 11:30 | | 2024-01-01 11:30 | +Snapshot results with `hard_deletes='new_record'`: + +| id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | +|----|---------|------------------|------------------|------------------|----------------| +| 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | False | +| 1 | shipped | 2024-01-01 11:30 | 2024-01-01 11:40 | 2024-01-01 11:30 | False | +| 1 | deleted | 2024-01-01 11:40 | | 2024-01-01 11:40 | True | + ## Configure snapshots in versions 1.8 and earlier @@ -486,7 +541,7 @@ Snapshot results: For information about configuring snapshots in dbt versions 1.8 and earlier, select **1.8** from the documentation version picker, and it will appear in this section. -To configure snapshots in versions 1.9 and later, refer to [Configuring snapshots](#configuring-snapshots). The latest versions use a more ergonomic snapshot configuration syntax that also speeds up parsing and compilation. +To configure snapshots in versions 1.9 and later, refer to [Configuring snapshots](#configuring-snapshots). The latest versions use an updated snapshot configuration syntax that optimizes performance.
diff --git a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md index c45598bd6cc..6ade3d5013f 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md +++ b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md @@ -68,6 +68,7 @@ Beginning in dbt Core 1.9, we've streamlined snapshot configuration and added a - Standard `schema` and `database` configs supported: Snapshots will now be consistent with other dbt resource types. You can specify where environment-aware snapshots should be stored. - Warning for incorrect `updated_at` data type: To ensure data integrity, you'll see a warning if the `updated_at` field specified in the snapshot configuration is not the proper data type or timestamp. - Set a custom current indicator for the value of `dbt_valid_to`: Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. +- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) configuration to get more control over how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Setting `hard_deletes='new_record'` allows you to track hard deletes by adding a new record when a row becomes "deleted" in the source. Read more about [Snapshot meta-fields](/docs/build/snapshots#snapshot-meta-fields). 
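As a rough illustration of the `hard_deletes` bullet above, the sketch below shows how the legacy flag maps onto the new config. It reuses the `orders_snapshot_hard_delete` example from the snapshots page. Treat it as a sketch rather than a prescribed migration path: existing snapshot tables may need to be updated before switching, and the two keys can't be set together.

```yaml
snapshots:
  - name: orders_snapshot_hard_delete
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      # Legacy form (v1.8 and earlier), shown for comparison only:
      # invalidate_hard_deletes: true
      # v1.9 and higher: same invalidation behavior under the new key.
      hard_deletes: invalidate
```

Swapping `invalidate` for `new_record` changes the behavior described above: deletions are recorded as additional rows flagged by `dbt_is_deleted` instead of only closing out `dbt_valid_to`.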
diff --git a/website/docs/docs/dbt-versions/release-notes.md b/website/docs/docs/dbt-versions/release-notes.md index 5b023addc18..47c86ea34dd 100644 --- a/website/docs/docs/dbt-versions/release-notes.md +++ b/website/docs/docs/dbt-versions/release-notes.md @@ -18,6 +18,10 @@ Release notes are grouped by month for both multi-tenant and virtual private clo \* The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability. +## December 2024 + +- **New**: The [`hard_deletes`](/reference/resource-configs/hard-deletes) config gives you more control on how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Note that `new_record` will create a new metadata column in the snapshot table. + ## November 2024 - **Enhancement**: Trust signal icons in dbt Explorer are now available for Exposures, providing a quick view of data health while browsing resources. To view trust signal icons, go to dbt Explorer and click **Exposures** under the **Resource** tab. Refer to [Trust signal for resources](/docs/collaborate/explore-projects#trust-signals-for-resources) for more info. - **Bug**: Identified and fixed an error with Semantic Layer queries that take longer than 10 minutes to complete. diff --git a/website/docs/reference/resource-configs/hard-deletes.md b/website/docs/reference/resource-configs/hard-deletes.md new file mode 100644 index 00000000000..ef6d70f3e6f --- /dev/null +++ b/website/docs/reference/resource-configs/hard-deletes.md @@ -0,0 +1,111 @@ +--- +title: hard_deletes +resource_types: [snapshots] +description: "Use the `hard_deletes` config to control how deleted rows are tracked in your snapshot table." 
+datatype: "string" +default_value: {ignore} +id: "hard-deletes" +sidebar_label: "hard_deletes" +--- + +Available from dbt v1.9 or with [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud. + + + + +```yaml +snapshots: + - name: + config: + hard_deletes: 'ignore' | 'invalidate' | 'new_record' +``` + + + + +```yml +snapshots: + [](/reference/resource-configs/resource-path): + +hard_deletes: "ignore" | "invalidate" | "new_record" +``` + + + + + +```sql +{{ + config( + unique_key='id', + strategy='timestamp', + updated_at='updated_at', + hard_deletes='ignore' | 'invalidate' | 'new_record' + ) +}} +``` + + + + +## Description + +The `hard_deletes` config gives you more control over how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Note that `new_record` adds a new metadata column, `dbt_is_deleted`, to the snapshot table. + +import HardDeletes from '/snippets/_hard-deletes.md'; + + + +:::warning + +If you're updating an existing snapshot to use the `hard_deletes` config, dbt _will not_ handle migrations automatically. We recommend either only using these settings for net-new snapshots, or [arranging an update](/reference/snapshot-configs#snapshot-configuration-migration) of pre-existing tables before enabling this setting. +::: + +## Default + +If you don't specify `hard_deletes`, it defaults to `ignore`. Deleted rows will not be tracked and their `dbt_valid_to` column remains `NULL`. + +The `hard_deletes` config has three methods: + +| Methods | Description | +| --------- | ----------- | +| `ignore` (default) | No action for deleted records. | +| `invalidate` | Behaves the same as the existing `invalidate_hard_deletes=true`, where deleted records are invalidated by setting `dbt_valid_to` to the current time. 
This method replaces the `invalidate_hard_deletes` config to give you more control on how to handle deleted rows from the source. | +| `new_record` | Tracks deleted records as new rows using the `dbt_is_deleted` meta field when records are deleted.| + +## Considerations +- **Backward compatibility**: The `invalidate_hard_deletes` config is still supported for existing snapshots but can't be used alongside `hard_deletes`. +- **New snapshots**: For new snapshots, we recommend using `hard_deletes` instead of `invalidate_hard_deletes`. +- **Migration**: If you switch an existing snapshot to use `hard_deletes` without migrating your data, you may encounter inconsistent or incorrect results, such as a mix of old and new data formats. + +## Example + + + +```yaml +snapshots: + - name: my_snapshot + config: + hard_deletes: new_record # options are: 'ignore', 'invalidate', or 'new_record' + strategy: timestamp + updated_at: updated_at + columns: + - name: dbt_valid_from + description: Timestamp when the record became valid. + - name: dbt_valid_to + description: Timestamp when the record stopped being valid. + - name: dbt_is_deleted + description: Indicates whether the record was deleted. +``` + + + +The resulting snapshot table contains the `hard_deletes: new_record` configuration. If a record is deleted and later restored, the resulting snapshot table might look like this: + +| id | dbt_scd_id | Status | dbt_updated_at | dbt_valid_from | dbt_valid_to | dbt_is_deleted | +| -- | -------------------- | ----- | -------------------- | --------------------| -------------------- | ----------- | +| 1 | 60a1f1dbdf899a4dd... | pending | 2024-10-02 ... | 2024-05-19... | 2024-05-20 ... | False | +| 1 | b1885d098f8bcff51... | pending | 2024-10-02 ... | 2024-05-20 ... | 2024-06-03 ... | True | +| 1 | b1885d098f8bcff53... | shipped | 2024-10-02 ... | 2024-06-03 ... | | False | +| 2 | b1885d098f8bcff55... | active | 2024-10-02 ... | 2024-05-19 ... 
| | False | + +In this example, the `dbt_is_deleted` column is set to `True` when the record is deleted. When the record is restored, the `dbt_is_deleted` column is set to `False`. diff --git a/website/docs/reference/resource-configs/invalidate_hard_deletes.md b/website/docs/reference/resource-configs/invalidate_hard_deletes.md index bdaec7e33a9..67123487fa1 100644 --- a/website/docs/reference/resource-configs/invalidate_hard_deletes.md +++ b/website/docs/reference/resource-configs/invalidate_hard_deletes.md @@ -1,9 +1,17 @@ --- +title: invalidate_hard_deletes (legacy) resource_types: [snapshots] description: "Invalidate_hard_deletes - Read this in-depth guide to learn about configurations in dbt." datatype: column_name +sidebar_label: invalidate_hard_deletes (legacy) --- +:::warning This is a legacy config — Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config instead. + +In Versionless and dbt Core 1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config for better control over how to handle deleted rows from the source. + +For new snapshots, set the config to `hard_deletes='invalidate'` instead of `invalidate_hard_deletes=true`. For existing snapshots, [arrange an update](/reference/snapshot-configs#snapshot-configuration-migration) of pre-existing tables before enabling this setting. 
Refer to +::: diff --git a/website/docs/reference/resource-configs/snapshot_meta_column_names.md b/website/docs/reference/resource-configs/snapshot_meta_column_names.md index 1230799f780..f1d29ba8bee 100644 --- a/website/docs/reference/resource-configs/snapshot_meta_column_names.md +++ b/website/docs/reference/resource-configs/snapshot_meta_column_names.md @@ -19,6 +19,7 @@ snapshots: dbt_valid_to: dbt_scd_id: dbt_updated_at: + dbt_is_deleted: ``` @@ -34,6 +35,7 @@ snapshots: "dbt_valid_to": "", "dbt_scd_id": "", "dbt_updated_at": "", + "dbt_is_deleted": "", } ) }} @@ -52,7 +54,7 @@ snapshots: dbt_valid_to: dbt_scd_id: dbt_updated_at: - + dbt_is_deleted: ``` @@ -71,6 +73,7 @@ By default, dbt snapshots use the following column names to track change history | `dbt_valid_to` | The timestamp when this row is no longer valid. | | | `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. | | `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. | +| `dbt_is_deleted` | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured. | However, these column names can be customized using the `snapshot_meta_column_names` config. @@ -92,18 +95,21 @@ snapshots: unique_key: id strategy: check check_cols: all + hard_deletes: new_record snapshot_meta_column_names: dbt_valid_from: start_date dbt_valid_to: end_date dbt_scd_id: scd_id dbt_updated_at: modified_date + dbt_is_deleted: is_deleted ``` The resulting snapshot table contains the configured meta column names: -| id | scd_id | modified_date | start_date | end_date | -| -- | -------------------- | -------------------- | -------------------- | -------------------- | -| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 2024-10-02 ... | -| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... 
| | +| id | scd_id | modified_date | start_date | end_date | is_deleted | +| -- | -------------------- | -------------------- | -------------------- | -------------------- | ---------- | +| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 2024-10-03 ... | False | +| 1 | 60a1f1dbdf899a4dd... | 2024-10-03 ... | 2024-10-03 ... | | True | +| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... | | False | diff --git a/website/docs/reference/snapshot-configs.md b/website/docs/reference/snapshot-configs.md index e6ee2eab4e8..018988a4934 100644 --- a/website/docs/reference/snapshot-configs.md +++ b/website/docs/reference/snapshot-configs.md @@ -64,9 +64,9 @@ snapshots: [+](/reference/resource-configs/plus-prefix)[strategy](/reference/resource-configs/strategy): timestamp | check [+](/reference/resource-configs/plus-prefix)[updated_at](/reference/resource-configs/updated_at): [+](/reference/resource-configs/plus-prefix)[check_cols](/reference/resource-configs/check_cols): [] | all - [+](/reference/resource-configs/plus-prefix)[invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false [+](/reference/resource-configs/plus-prefix)[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {} [+](/reference/resource-configs/plus-prefix)[dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): + [+](/reference/resource-configs/plus-prefix)[hard_deletes](/reference/resource-configs/hard-deletes): string ``` @@ -99,8 +99,8 @@ snapshots: [strategy](/reference/resource-configs/strategy): timestamp | check [updated_at](/reference/resource-configs/updated_at): [check_cols](/reference/resource-configs/check_cols): [] | all - [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {} + [hard_deletes](/reference/resource-configs/hard-deletes): string 
[dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): ``` diff --git a/website/sidebars.js b/website/sidebars.js index 8225df22bad..65a4584acde 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -970,17 +970,18 @@ const sidebarSettings = { label: "For snapshots", items: [ "reference/snapshot-properties", - "reference/resource-configs/snapshot_name", "reference/snapshot-configs", "reference/resource-configs/check_cols", + "reference/resource-configs/dbt_valid_to_current", + "reference/resource-configs/hard-deletes", + "reference/resource-configs/invalidate_hard_deletes", + "reference/resource-configs/snapshot_meta_column_names", + "reference/resource-configs/snapshot_name", "reference/resource-configs/strategy", "reference/resource-configs/target_database", "reference/resource-configs/target_schema", "reference/resource-configs/unique_key", "reference/resource-configs/updated_at", - "reference/resource-configs/invalidate_hard_deletes", - "reference/resource-configs/snapshot_meta_column_names", - "reference/resource-configs/dbt_valid_to_current", ], }, { diff --git a/website/snippets/_hard-deletes.md b/website/snippets/_hard-deletes.md new file mode 100644 index 00000000000..59c2e3af99e --- /dev/null +++ b/website/snippets/_hard-deletes.md @@ -0,0 +1,13 @@ + + +**Use `invalidate_hard_deletes` (v1.8 and earlier) if:** +- Gaps in the snapshot history (missing records for deleted rows) are acceptable. +- You want to invalidate deleted rows by setting their `dbt_valid_to` timestamp to the current time (implicit delete). +- You are working with smaller datasets where tracking deletions as a separate state is unnecessary. + +**Use `hard_deletes: new_record` (v1.9 and higher) if:** +- You want to maintain continuous snapshot history without gaps. +- You want to explicitly track deletions by adding new rows with a `dbt_is_deleted` column (explicit delete). 
+- You are working with larger datasets where explicitly tracking deleted records improves data lineage clarity. + +
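If the `new_record` criteria above apply to every snapshot in a project, one option is to set the config once at the project level with the `+` prefix, following the `dbt_project.yml` form shown in the `hard_deletes` reference. This is a minimal sketch: the project name `jaffle_shop` is a hypothetical resource path, and the migration caveat for existing snapshot tables still applies.

```yaml
# dbt_project.yml
snapshots:
  jaffle_shop:                  # hypothetical project name used as the resource path
    # Applies to every snapshot under this path unless one overrides it.
    +hard_deletes: new_record   # adds the dbt_is_deleted meta field
```

Setting `hard_deletes` in an individual snapshot's `config` block is the narrower alternative when only some snapshots need explicit delete tracking.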