[ADAP-1077] [Drops Table In Separate Transaction] - Tables are dropped then the _tmp table renamed, causing other queries referencing the table to fail #693
This is a regression that requires some design.
@VersusFacit I added a backport label!
@gtmehrer Hey there! Thanks for the submission. Just an upfront disclaimer that autocommit is on by default. I recreated your logs essentially as you described. My model:
**The Logs**

You gave some log output, but I'm curious whether that's a red herring of sorts. My thinking: if you dig into core, you'll notice that a new statement comes into play there. Maybe Mike is right that some special interaction of the new replace macros (added in this PR, by the way) could be responsible, but I actually think it's a higher-level configuration problem. In dbt land, autocommit is on by default due to the needs of analytics engineers paired with the inertia of the existing transaction semantics.

**A possible builtin solution to your issue**
As in, you are seeing the model get dropped while other threads attempt to pull from it, whereas you want the rename and drop to happen in the same transaction? Could you try with `autocommit: false`?

**Request for more info**

Also, if you continue to have troubles, could you say more about your use case? It's not immediately clear from this description what your error is or how it's being triggered, but I'd love to know more. I want to be able to reproduce this. Without your use case in front of me, my best attempts to replicate failed regardless of whether I left autocommit on or off.
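For a rough illustration of why the autocommit setting matters here, the sketch below uses redshift_connector directly; the table names, schema, and connection details are hypothetical, and this is not the exact SQL or sequencing dbt emits.

```python
import redshift_connector


def refresh_with_autocommit(conn: redshift_connector.Connection) -> None:
    """Each statement commits as soon as it runs, so there is a window between
    the DROP and the RENAME in which other sessions see no table at all."""
    conn.rollback()            # autocommit must be toggled outside a transaction
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute("drop table if exists my_schema.my_table")
    cur.execute("alter table my_schema.my_table__tmp rename to my_table")


def refresh_in_one_transaction(conn: redshift_connector.Connection) -> None:
    """Both statements sit in one transaction and become visible to other
    sessions atomically when the transaction commits."""
    conn.rollback()
    conn.autocommit = False
    cur = conn.cursor()
    cur.execute("drop table if exists my_schema.my_table")
    cur.execute("alter table my_schema.my_table__tmp rename to my_table")
    conn.commit()


if __name__ == "__main__":
    # Hypothetical connection details.
    conn = redshift_connector.connect(
        host="my-cluster.example.redshift.amazonaws.com",
        database="analytics",
        user="dbt_user",
        password="...",
    )
    refresh_in_one_transaction(conn)
```

The point of the sketch is only that, with autocommit on, the drop becomes visible to other sessions before the rename lands.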
There is an alternate possibility the team is exploring: it could be we need to remove some BEGIN/COMMIT barriers in special edge cases like this one you may have found. I have a PR up for that just in case. We'd still very much appreciate you trying the autocommit adjustment as a sanity check and providing more insight into your use case 🖖
Mila and I looked at the BEGIN/COMMIT barriers in 1.6 vs. 1.7.
@VersusFacit and I talked through this quite a bit and we're pretty sure we found the issue. The root cause is that we are setting a class variable on `RedshiftRelation`.

**Diagnosis**

We set `renameable_relations` and attempt to override it with the correct configuration for `RedshiftRelation`.
Unfortunately, the table materialization was updated to include this conditional:

dbt-redshift/dbt/include/redshift/macros/materializations/table.sql, lines 40 to 44 (at d77f5ee)
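For orientation, the gate in question works roughly like the following paraphrase; the names are generic, this is not the actual macro or adapter source, and it assumes the relation checks its own type against `renameable_relations`.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class Relation:
    """Simplified stand-in for a dbt relation object."""

    type: str = "table"
    renameable_relations: FrozenSet[str] = field(default_factory=frozenset)

    @property
    def can_be_renamed(self) -> bool:
        # The staged-rename path is only taken for relation types listed here.
        # If the set is empty, the materialization falls back to drop-and-recreate.
        return self.type in self.renameable_relations


print(Relation().can_be_renamed)                                               # False
print(Relation(renameable_relations=frozenset({"table", "view"})).can_be_renamed)  # True
```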
The issue is that the override never takes effect.

**Demonstrating Test**

The following test demonstrates the problem:

```python
from dbt.adapters.redshift.relation import RedshiftRelation
from dbt.contracts.relation import RelationType


def test_renameable_relation():
    relation = RedshiftRelation.create(
        database="my_db",
        schema="my_schema",
        identifier="my_table",
        type=RelationType.Table,
    )
    assert relation.renameable_relations == frozenset({
        RelationType.View,
        RelationType.Table
    })
```

This test currently fails because `renameable_relations` falls back to the empty default.

**Fix**

We need to update this line:

```python
renameable_relations: SerializableIterable = field(default_factory=tuple)
```

While we're here, the same mistake was made in a related area on the next line of code, and the fix needs to be implemented there as well.

**Risks**

We've been operating under the assumption that we've been using the overridden configuration.
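To make the class-variable pitfall concrete, here is a self-contained sketch with generic names; these are not the actual dbt classes, and the real fix may differ in detail.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class BaseRelation:
    # The inherited field defaults to an empty collection via a factory.
    renameable_relations: FrozenSet[str] = field(default_factory=frozenset)


@dataclass(frozen=True)
class BrokenRelation(BaseRelation):
    # A plain class attribute with no annotation: the dataclass machinery
    # ignores it, and __init__ still assigns the inherited (empty) default,
    # which shadows this value on every instance.
    renameable_relations = frozenset({"table", "view"})


@dataclass(frozen=True)
class FixedRelation(BaseRelation):
    # Redeclaring the field with an annotation and a default_factory makes
    # the override the actual instance value.
    renameable_relations: FrozenSet[str] = field(
        default_factory=lambda: frozenset({"table", "view"})
    )


print(BrokenRelation().renameable_relations)  # frozenset(): override silently ignored
print(FixedRelation().renameable_relations)   # frozenset with 'table' and 'view'
```

The broken variant is the shape described in the diagnosis; the fixed variant shows the kind of change the fix section is pointing at.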
That is some great info @mikealfare and @VersusFacit 🧠 While we're in this area of code, here's one other thing that @VersusFacit and I noticed in #693 (comment):
It doesn't appear to ever get created, so it's a no-op. But it looks like it may be an unnecessary statement. Maybe it will be naturally handled along with the other fixes?
I don't think it will be, since it's an unconditional drop. I think it's just a leftover artifact. I can make a ticket to track dealing with this at some point :)
Thanks so much for digging into this! @VersusFacit I did try with autocommit set to both True and False, with the same results either way. Our use case is that we refresh many models frequently throughout the day, around every 30 minutes. Outside processes querying dbt-managed tables will fail when they attempt to reference a table that has been dropped (when the transaction is immediately committed by dbt). The table is generally only gone for a few seconds before it's recreated by another transaction, so it's a little difficult to replicate without high query volume. Also, it just occurred to me that we are using Redshift's snapshot isolation, which may result in a different error than serializable isolation would. https://docs.aws.amazon.com/redshift/latest/dg/r_STV_DB_ISOLATION_LEVEL.html Let me know if I can provide more info or help with testing!
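One way to approximate the high query volume needed to catch the window is a polling reader, sketched below with threads and redshift_connector; the table name, schema, and connection details are hypothetical, and any exception in the reader loop simply indicates it hit the gap between the drop and the recreate.

```python
import threading
import time

import redshift_connector


def make_connection() -> redshift_connector.Connection:
    # Hypothetical connection details.
    return redshift_connector.connect(
        host="my-cluster.example.redshift.amazonaws.com",
        database="analytics",
        user="dbt_user",
        password="...",
    )


def reader_loop(stop: threading.Event) -> None:
    """Poll the model the way an outside process would while dbt refreshes it."""
    conn = make_connection()
    cur = conn.cursor()
    while not stop.is_set():
        try:
            cur.execute("select count(*) from my_schema.my_table")
            cur.fetchone()
            conn.rollback()  # end the read transaction so each poll sees fresh state
        except Exception as exc:
            print(f"reader failed, likely while the table was missing: {exc}")
            conn.rollback()
        time.sleep(0.1)


if __name__ == "__main__":
    stop = threading.Event()
    readers = [threading.Thread(target=reader_loop, args=(stop,)) for _ in range(4)]
    for t in readers:
        t.start()
    # Run `dbt run` against the table model in another terminal while this polls.
    time.sleep(120)
    stop.set()
    for t in readers:
        t.join()
```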
We merged the fix today. We'll need to do a patch release before it's available on PyPI, which should happen in the next week. Feel free to install from the branch. Closing as resolved.
@mikealfare has the fix been included in the patch release?
It is. We published a few hours ago. |
Thank you @mikealfare 🙌
Is this a new bug in dbt-redshift?
Current Behavior
Models with table materialization are dropped and the transaction is committed immediately. This causes queries that reference the refreshing model to fail if they run at the same time, since the table briefly doesn't exist.
I've tried setting `autocommit: true` in profiles.yml to no effect.

Expected Behavior
The old behavior was to drop/rename in one transaction.
Steps To Reproduce
Run the `dbt run` command in dbt-redshift >=1.7.1 (core >=1.7.3) against a table model. dbt-redshift 1.6.5 (core 1.6.9) runs in a single transaction.
Relevant log output
Environment
Additional Context
No response