Add always false filter to queries to improve speed (particularly on Snowflake) #129

Merged 4 commits on Nov 15, 2019

Conversation

DylanBaker (Collaborator)

We run LIMIT 0 queries to test the validity of the SQL that Looker generates. On many warehouses (notably BigQuery and Redshift), such a query processes no data and is very fast.

On Snowflake, a LIMIT 0 query still processes a lot of data, and there is no EXPLAIN functionality we can use right now. The proposed solution is to add an always-false 1=2 filter to every query, which does stop Snowflake from processing data.
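
The diff itself is not shown in this thread, so the following is only a minimal sketch of the idea, assuming the validation query is created through the Looker API with a JSON body. The helper name `build_validation_query`, its parameters, and the example model and field names are hypothetical; `limit` and `filter_expression` are fields of the Looker API query object.

```python
from typing import Dict, List, Union

QueryBody = Dict[str, Union[str, List[str]]]


def build_validation_query(model: str, explore: str, dimensions: List[str]) -> QueryBody:
    """Build a query body meant to validate SQL without scanning rows.

    Hypothetical helper: a sketch of the approach described above,
    not the code merged in this PR.
    """
    return {
        "model": model,
        "view": explore,
        "fields": dimensions,
        # LIMIT 0 alone is cheap on BigQuery and Redshift.
        "limit": "0",
        # Always-false filter so Snowflake also skips reading data.
        "filter_expression": "1=2",
    }


# Example values are made up for illustration.
body = build_validation_query("ecommerce", "orders", ["orders.id", "orders.status"])
```

Keeping the limit alongside the filter means warehouses that already short-circuit on LIMIT 0 behave as before, while Snowflake gets the extra always-false predicate.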

joshtemple added this to the 0.1.1 milestone Nov 15, 2019

JarredLHumphrey commented Nov 15, 2019

Just an FYI: you can see the query planner output in the Snowflake instance's UI at:
https://<snowflakehost>.com/console#/monitoring/queries

Once on that page, click the query id you want the plan for, and then click the Profile tab.

I took the liberty of running two queries to show the difference in plan output between LIMIT 0 and WHERE 1=2.

Query: SELECT * FROM INFORMATION_SCHEMA.COLUMNS LIMIT 0;
Plan: [screenshot: Screen Shot 2019-11-15 at 1 59 52 PM]

Query: SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE 1=2 LIMIT 0;
Plan: [screenshot: Screen Shot 2019-11-15 at 1 59 34 PM]

This demonstrates that the filter is applied as part of the initial generator step, so less data is fetched to the instance before the limit is applied.
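
To repeat this comparison against another table, a small helper can generate both query variants to paste into a Snowflake worksheet before checking each one's Profile tab. This is only a sketch: `comparison_queries` is a hypothetical name, and it only builds SQL strings without connecting to any warehouse.

```python
from typing import Dict


def comparison_queries(table: str) -> Dict[str, str]:
    """Return the LIMIT 0 query with and without the always-false filter."""
    base = f"SELECT * FROM {table}"
    return {
        "limit_only": f"{base} LIMIT 0;",
        "always_false": f"{base} WHERE 1=2 LIMIT 0;",
    }


queries = comparison_queries("INFORMATION_SCHEMA.COLUMNS")
for label, sql in queries.items():
    print(f"{label}: {sql}")
```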

DylanBaker merged commit 042dde9 into master Nov 15, 2019
DylanBaker deleted the feature/where-false branch November 15, 2019 21:36

codecov bot commented Nov 15, 2019

Codecov Report

Merging #129 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master     #129   +/-   ##
=======================================
  Coverage   66.52%   66.52%           
=======================================
  Files           9        9           
  Lines         684      684           
=======================================
  Hits          455      455           
  Misses        229      229

| Impacted Files | Coverage Δ |
| --- | --- |
| spectacles/client.py | 57.03% <100%> (ø) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7d1f5de...8ea13fd.
