
Support max_concurrency for running Looker queries #110

Merged
merged 8 commits into spectacles-ci:master on Nov 25, 2019

Conversation

@iserko (Contributor) commented Nov 11, 2019

When we ran Spectacles against our Looker instance with 128 explores in our model, it killed the instance: Looker stopped responding and eventually crashed. I diagnosed that the cause was simply the number of queries running at once; Spectacles scheduled nearly all 128 queries immediately.

This PR adds a --max-concurrency flag you can use to tune how much load you want to put on your Looker instance.

I've tried a value of 10 against our staging server and it works nicely without crashing anything.

Summary of the changes:

  • made the SQL validation function async and moved the loop code into the parent function
  • removed the explicit creation of an asyncio event loop, which I don't believe is needed (get_event_loop creates one if it does not exist)
  • when you specify max_concurrency, the code creates at most <max_concurrency> queries and will not queue up more until the previous ones have finished
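The gating described in the last bullet can be sketched with an asyncio.Semaphore. This is an illustrative sketch, not the PR's actual code: run_query and the sleep call stand in for the real Looker query submission.

```python
import asyncio

async def run_query(semaphore: asyncio.Semaphore, query_id: int) -> int:
    # Hold a slot for the duration of the "query"; at most
    # max_concurrency of these run at any one time.
    async with semaphore:
        await asyncio.sleep(0.001)  # placeholder for the Looker API call
        return query_id

async def validate_all(query_ids, max_concurrency: int):
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [run_query(semaphore, qid) for qid in query_ids]
    return await asyncio.gather(*tasks)

results = asyncio.run(validate_all(range(128), 10))
print(len(results))  # 128
```

With 128 explores and a limit of 10, only 10 coroutines ever hold the semaphore at once; the rest wait instead of hammering the instance.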

@iserko iserko force-pushed the max_concurrency_queries branch 2 times, most recently from f0f31b5 to 07c4691 Compare November 12, 2019 14:41
@codecov

codecov bot commented Nov 12, 2019

Codecov Report

Merging #110 into master will decrease coverage by 0.97%.
The diff coverage is 31.74%.


@@            Coverage Diff             @@
##           master     #110      +/-   ##
==========================================
- Coverage   66.89%   65.92%   -0.98%     
==========================================
  Files           9        9              
  Lines         716      719       +3     
==========================================
- Hits          479      474       -5     
- Misses        237      245       +8
Impacted Files Coverage Δ
spectacles/client.py 58.33% <25%> (+1.29%) ⬆️
spectacles/validators.py 56.04% <30.9%> (-1.59%) ⬇️
spectacles/runner.py 47.36% <50%> (ø) ⬆️
spectacles/cli.py 77.39% <50%> (+0.19%) ⬆️
spectacles/exceptions.py 52.17% <0%> (-26.09%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e875b3c...bd289f7.

@iserko iserko force-pushed the max_concurrency_queries branch 3 times, most recently from d15410c to e29c39d Compare November 14, 2019 12:08
@iserko iserko force-pushed the max_concurrency_queries branch from e29c39d to 106fedd Compare November 14, 2019 12:09
@joshtemple (Collaborator) left a comment

Thanks @iserko, this is really awesome and will be extremely important so people don't overload their instances. This is my first time writing async code in Python, so would love it if you could elaborate on the general approach here and if/why it's better so I can learn.

I want to benchmark this a bit because I have a hunch it will be faster, but I'm getting an error with --max-concurrency 0. I've added a few more comments as well.

spectacles/cli.py
@@ -352,6 +353,14 @@ def _build_sql_subparser(
user's branch to the revision of the branch that is on the remote. \
WARNING: This will delete any uncommited changes in the user's workspace.",
)
subparser.add_argument(
"--max-concurrency",
default=0,
@joshtemple (Collaborator):
It would be good to set the default to a higher number (10? 20?) so that a user who doesn't read the documentation avoids maxing out their instance accidentally. Better to provide a performance benefit for an informed user than a performance risk for an uninformed one. Open to suggestions on what that number should be.

  sql_validator = SqlValidator(self.client, self.project)
  sql_validator.build_project(selectors)
- errors = sql_validator.validate(mode)
+ errors = sql_validator.validate(mode, max_concurrent)
@joshtemple (Collaborator):
Let's keep the variable name consistent throughout.

len(task_calls),
max_concurrency,
)
while task_calls:
@joshtemple (Collaborator):
When I run this with max_concurrency = 0, I get stuck in this loop forever and queries are never launched. query_task_ids is empty as defined above, so it continually waits for non-existent query task IDs to complete. Any idea what's going on here?

I wonder if you need to change if len(query_task_ids) >= max_concurrency: to a >?
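The hang can be reduced to the gating predicate itself. A hypothetical reconstruction, not the PR's code: with the >= comparison, a max_concurrency of 0 means the launch condition can never be satisfied, even with nothing running.

```python
def can_launch(running: int, max_concurrency: int) -> bool:
    # Reconstruction of the check under discussion: we wait (don't launch)
    # whenever the number of running queries is >= max_concurrency.
    return not (running >= max_concurrency)

print(can_launch(0, 0))    # False: 0 >= 0, so we wait forever
print(can_launch(0, 10))   # True: a slot is free, launch
print(can_launch(10, 10))  # False: at the limit, wait
```

Switching >= to > would make can_launch(0, 0) return True, though treating 0 as "unlimited" (or rejecting it) might be the clearer fix.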

@DylanBaker (Collaborator)
One thing I realised here is that I think we lose the 250 limit on fetching queries? I can test this later on a big enough explore, but I think we'll get errors when fetching the queries if it goes over 293 or whatever that number is.
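If that per-request cap is real, one way to stay under it is to chunk the query task IDs before each fetch. A hypothetical sketch: the chunks helper and the task_id naming are illustrative, only the 250 figure comes from this comment.

```python
def chunks(items, size=250):
    # Yield successive slices of at most `size` items, so each
    # results-fetch request stays under the per-request limit.
    for i in range(0, len(items), size):
        yield items[i:i + size]

task_ids = [f"task_{n}" for n in range(600)]
batch_sizes = [len(batch) for batch in chunks(task_ids)]
print(batch_sizes)  # [250, 250, 100]
```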

@joshtemple (Collaborator)
@iserko, inspired by your PR, I did a bunch of digging on asyncio and found some ways to improve this approach.

A few major changes:

  • I've added an asyncio.BoundedSemaphore object to the validator, which is used to block when the maximum number of queries is running against the data warehouse (releasing a slot once we get the result back)
  • I've used an asyncio.Queue object to track the running queries. This lets us periodically check for query results as a coroutine as well, pulling all Looker query task IDs, extracting results, and pushing the running or pending queries back onto the Queue.
  • I've renamed --max-concurrency to --query-slots, which I think is a more intuitive name for the flag, and set the default to 10. Interested to hear your and @DylanBaker's thoughts on this, though; the name is definitely open for discussion.

Let me know what you think of these changes.
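The semaphore-plus-queue design described above can be sketched roughly as follows. This is a simplified illustration, not the merged code: run_query, get_results, and the zero-length sleep are placeholders for query submission and result polling against Looker.

```python
import asyncio

async def run_query(slots: asyncio.BoundedSemaphore, queue: asyncio.Queue, query_id):
    await slots.acquire()       # block while all query slots are in use
    queue.put_nowait(query_id)  # track the now-running query task

async def get_results(slots, queue, results, total):
    while len(results) < total:
        query_id = await queue.get()
        await asyncio.sleep(0)  # placeholder for polling the query task
        results.append(query_id)
        slots.release()         # free a slot once the result is back

async def run_validation(query_ids, num_slots):
    slots = asyncio.BoundedSemaphore(num_slots)
    queue = asyncio.Queue()
    results = []
    producers = [run_query(slots, queue, q) for q in query_ids]
    await asyncio.gather(
        *producers, get_results(slots, queue, results, len(query_ids))
    )
    return results

print(len(asyncio.run(run_validation(range(25), 10))))  # 25
```

A BoundedSemaphore (rather than a plain Semaphore) has the nice property of raising if a slot is released more times than it was acquired, which catches accounting bugs in the release path.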

I've adjusted the approach to use an asyncio Queue (to preserve state 
for running query task IDs) and a BoundedSemaphore (to limit the number 
of queries running concurrently). I've also renamed the flag to 
query_slots, which is certainly open to discussion.
@joshtemple joshtemple force-pushed the max_concurrency_queries branch from cc3d447 to 68ebf6b Compare November 23, 2019 01:48
@joshtemple joshtemple added this to the 0.1.1 milestone Nov 23, 2019
@joshtemple joshtemple merged commit 14cb800 into spectacles-ci:master Nov 25, 2019