[AIRFLOW-4311] Remove sleep in LocalExecutor #5096
This looks interesting. I recently noticed that LocalExecutor is very slow. When I added tests for several DAGs, they took over 50 minutes, so I hit a timeout error on Travis. I am definitely in favor of this PR, but Travis reports an error.
I think it's flaky CI: the other jobs ran successfully, and the test also passes locally and on my own Travis setup. I'll rerun.
Codecov Report
@@            Coverage Diff            @@
##           master    #5096     +/-   ##
=========================================
+ Coverage   76.86%   76.86%   +<.01%
=========================================
  Files         463      463
  Lines       29799    29795       -4
=========================================
- Hits        22905    22903       -2
+ Misses       6894     6892       -2
@BasPH good findings.
If it is really that much of a difference (
Everyone needs a break even the
👍 Awesome job @BasPH
A sleep(0) might still be useful. I think from my C days that it would give other processes a better chance to be scheduled without actually sleeping for any time. But it's late and I need sleep, so I could be making stuff up.
Given we exec a process, it doesn't matter anyway. Nm.
You mean the Java
The LocalExecutor currently sleeps after running each task. This is unnecessary; the time is better spent running the next task. I wrote a performance test (not added to the codebase, see below) in which a set of dummy tasks ran much faster once the executor no longer slept between tasks: with parallelism=2, 1000 dummy tasks completed in 8 seconds instead of 505 seconds.
I'm confused about the reason behind the sleeps. The very first Airflow commit added them, I'm guessing for debugging, and they haven't been revisited since.
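As a rough sanity check on the timings above: if every worker sleeps for a fixed time after each task, the sleeps alone cost roughly (tasks per worker) × (sleep duration). The sketch below assumes a 1-second sleep, which is not stated in this thread, and the function name is hypothetical.

```python
import math


def estimated_sleep_overhead(n_tasks, parallelism, sleep_s):
    """Seconds each worker spends sleeping if it sleeps once after every task."""
    # Tasks are spread over the workers, so each worker handles about
    # n_tasks / parallelism of them (rounded up).
    tasks_per_worker = math.ceil(n_tasks / parallelism)
    return tasks_per_worker * sleep_s


# Assumption: a 1-second sleep per task (not confirmed in this thread).
overhead = estimated_sleep_overhead(n_tasks=1000, parallelism=2, sleep_s=1.0)
print(overhead)  # 500.0
```

Under that assumption the sleeps account for about 500 seconds, which is in line with the gap between the 505-second and 8-second runs.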
Jira
Description
Tests
I did not add tests, but I wrote one to measure performance with and without the change:
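The performance test itself is not reproduced here. As a stand-in, the following is a minimal sketch of the same kind of comparison, using only the standard library rather than Airflow. Everything in it is an assumption for illustration: the `run_tasks` helper is hypothetical, worker threads stand in for the executor's worker processes, and the task count and sleep duration are scaled down so it finishes quickly.

```python
import queue
import threading
import time


def run_tasks(n_tasks, parallelism, sleep_after_task):
    """Run n_tasks no-op tasks on `parallelism` worker threads; return elapsed seconds."""
    tasks = queue.Queue()
    for i in range(n_tasks):
        tasks.put(i)
    # One sentinel per worker signals that no more work is coming.
    for _ in range(parallelism):
        tasks.put(None)

    def worker():
        while True:
            item = tasks.get()
            if item is None:
                break
            # The dummy task does nothing; only the optional sleep costs time.
            if sleep_after_task > 0:
                time.sleep(sleep_after_task)

    start = time.perf_counter()
    threads = [threading.Thread(target=worker) for _ in range(parallelism)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


# Scaled-down, hypothetical parameters: 20 tasks, 2 workers, 50 ms sleep.
with_sleep = run_tasks(n_tasks=20, parallelism=2, sleep_after_task=0.05)
without_sleep = run_tasks(n_tasks=20, parallelism=2, sleep_after_task=0.0)
print(f"with sleep: {with_sleep:.3f}s, without sleep: {without_sleep:.3f}s")
```

Because the tasks are no-ops, nearly all of the elapsed time in the first run comes from the per-task sleep, which is the effect this PR removes.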
Commits
Documentation
Code Quality
flake8