fix(ingestion): Handle Redshift string length limit in Serverless mode #10051

Merged
merged 1 commit into from
Mar 19, 2024
@@ -822,7 +822,7 @@ def stl_scan_based_lineage_query(
 WHERE
 qs.step_name = 'scan' AND
 qs.source = 'Redshift(local)' AND
-qt.sequence < 320 AND -- See https://stackoverflow.com/questions/72770890/redshift-result-size-exceeds-listagg-limit-on-svl-statementtext
+qt.sequence < 16 AND -- See https://stackoverflow.com/questions/72770890/redshift-result-size-exceeds-listagg-limit-on-svl-statementtext
 sti.database = '{db_name}' AND -- this was required to not retrieve some internal redshift tables, try removing to see what happens
 sui.user_name <> 'rdsdb' -- not entirely sure about this filter
 GROUP BY sti.schema, sti.table, qs.table_id, qs.query_id, sui.user_name
@@ -909,7 +909,7 @@ def list_insert_create_queries_sql(
 cluster = '{db_name}' AND
 qd.start_time >= '{start_time}' AND
 qd.start_time < '{end_time}' AND
-qt.sequence < 320 AND -- See https://stackoverflow.com/questions/72770890/redshift-result-size-exceeds-listagg-limit-on-svl-statementtext
+qt.sequence < 16 AND -- See https://stackoverflow.com/questions/72770890/redshift-result-size-exceeds-listagg-limit-on-svl-statementtext
 ld.query_id IS NULL -- filter out queries which are also stored in SYS_LOAD_DETAIL
 ORDER BY target_table ASC
 )
@@ -994,7 +994,7 @@ def temp_table_ddl_query(start_time: datetime, end_time: datetime) -> str:
 query_type IN ('DDL', 'CTAS', 'OTHER', 'COMMAND')
 AND qh.start_time >= '{start_time_str}'
 AND qh.start_time < '{end_time_str}'
-AND qt.sequence < 320
+AND qt.sequence < 16
 GROUP BY qh.start_time, qh.session_id, qh.transaction_id, qh.user_id
 ORDER BY qh.start_time, qh.session_id, qh.transaction_id, qh.user_id ASC
 )
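
The arithmetic behind the new bound can be sketched as follows. This is an illustration only, not code from this PR: the helper name `max_sequence_bound` is hypothetical, and the numbers rest on three assumptions that should be checked against the Redshift documentation: that LISTAGG results are capped at 65535 bytes, that the provisioned query-text views store 200-character segments, and that the serverless query-text view stores 4000-character segments. It also assumes one byte per character.

```python
# Hypothetical helper for illustration; not part of the PR.
# Assumed ceiling on a Redshift LISTAGG result (VARCHAR max size), in bytes.
LISTAGG_LIMIT_BYTES = 65535


def max_sequence_bound(segment_chars: int, limit_bytes: int = LISTAGG_LIMIT_BYTES) -> int:
    """Return the largest N such that N full segments fit within limit_bytes.

    Because `sequence` starts at 0, the filter `sequence < N` admits exactly
    N segments. Assumes one byte per character.
    """
    if segment_chars <= 0:
        raise ValueError("segment_chars must be positive")
    return limit_bytes // segment_chars


# Assumed segment widths: 200 characters (provisioned), 4000 characters (serverless).
provisioned_bound = max_sequence_bound(200)
serverless_bound = max_sequence_bound(4000)

# The old filter applied to 4000-character segments would allow far more text
# than the assumed limit; the new filter stays under it.
old_filter_chars = 320 * 4000
new_filter_chars = 16 * 4000
```

Under these assumptions `sequence < 320` is safe for 200-character segments (320 * 200 = 64000) but not for 4000-character segments, while `sequence < 16` restores the same 64000-character budget. Queries longer than that are truncated rather than failing the aggregation.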