Enable APM profiling for edxapp #749
Comments
We should roll out to Stage, then Edge, then Prod.
DD support ticket for latency issues we encountered during the most recent rollout attempt: https://help.datadoghq.com/hc/requests/1909564
It seems like the newer version might be more efficient, so we should switch to using it. edx/edx-arch-experiments#749
I think I've managed to repro slow gunicorn startup on a sandbox instance.

Profiling setup

Added to
And then:
(Can also restart workers with

To get DD profiling data on both sides, pushed buttons in instructor dashboard and made calls to

Gunicorn repro

In a dev terminal, make short HTTP calls to the LMS 1-2 times per second:

For each config:
nginx output will look something like this:
The initial transition of

For comparison, here's
In this sample, it appears that the calls recorded as 499s did eventually reach the LMS and were all processed in a burst about 10 seconds after the workers actually started.

Evaluation

After the 503s end: Find the number of seconds from the first 499 to the first 200. This is the "startup period".

Profiling off

With profiling off, the startup period lasts 12 seconds.

Profiling on

With the below profiling config, the startup period lasts 20 seconds.
Additional configurations to establish a baseline:
19 seconds (with one 499 a few seconds after the first 200s); 18; 18
21; 22; 21

Pretty consistent. I'll keep this disabled for now since it's not needed for repro, and since we'll probably only want to use it when we want to actually look at the generated profiles:
To experiment with:
With a baseline of
On to the toggles... Turning every profiling feature off (except for profiling itself) gets to the "good" situation:
11, 9, 11
19, 19
12, 11, 11
18, 17, 17
16, 16, 16
14, 13, 13
13, 15, 15
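The "startup period" measurement described under Evaluation (seconds from the first 499 to the first 200, counted after the 503s end) could be automated rather than read off the nginx output by hand. The following is only a minimal sketch: the function name `startup_period` and the `(timestamp, status)` input shape are assumptions made for illustration, and parsing the actual nginx log lines into that shape is left out because the log format is not shown in this thread.

```python
from datetime import datetime, timedelta


def startup_period(entries):
    """Return the startup period in seconds, or None if it cannot be measured.

    `entries` is a chronologically ordered iterable of (datetime, status)
    pairs (a hypothetical pre-parsed form of the nginx access log).
    """
    entries = list(entries)

    # Only look at requests logged after the 503s have ended.
    last_503 = max(
        (i for i, (_, status) in enumerate(entries) if status == 503),
        default=-1,
    )
    tail = entries[last_503 + 1:]

    # First 499 after the 503s: start of the startup period.
    first_499 = next((ts for ts, status in tail if status == 499), None)
    if first_499 is None:
        return None

    # First 200 at or after that 499: end of the startup period.
    # Later stray 499s (as seen in one trial) do not affect the result.
    first_200 = next(
        (ts for ts, status in tail if status == 200 and ts >= first_499),
        None,
    )
    if first_200 is None:
        return None

    return (first_200 - first_499).total_seconds()


# Made-up sample data, not taken from a real run.
base = datetime(2024, 1, 1, 12, 0, 0)
sample = [
    (base, 200),
    (base + timedelta(seconds=1), 503),
    (base + timedelta(seconds=2), 503),
    (base + timedelta(seconds=3), 499),
    (base + timedelta(seconds=4), 499),
    (base + timedelta(seconds=15), 200),
    (base + timedelta(seconds=16), 499),
    (base + timedelta(seconds=17), 200),
]
print(startup_period(sample))
```

With the made-up sample above, the first 499 is at 3 seconds and the first 200 at 15 seconds, so the function reports a 12-second startup period.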
Ultimately, we want to enable APM profiling for edxapp, when we think it is safe.
Notes: