Wait failed pieces greatly increase image download time #1083
Comments
Use the latest Helm chart version (0.5.46) with optimized image scheduling.
Can you provide all logs of
The dfdaemon log:
I tried Helm chart 0.5.46, which installed Dragonfly v2.0.2-rc.9. It was amazing: the delay while waiting for failed pieces was no longer that high, normally below 0.5s, which gave a stable image download time of about 6.3s. I also noticed that all dfdaemons download the image from a CDN instance, unlike the previous version, v2.0.2-rc.4, which always tried to download the image from another dfdaemon. I guess that's part of the optimized image scheduling.
In this log:
But the piece downloaded slowly from other peers, ending at:
Do you limit upload or download to a low speed?
I'm using the default values.yaml from Helm chart 0.5.38:
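For context, upload/download rate limits in the Dragonfly v2 daemon config look roughly like the fragment below. The field names follow the dfdaemon config schema; the values shown are illustrative assumptions, not the chart defaults:

```yaml
# dfdaemon config sketch (illustrative values, not chart defaults)
download:
  totalRateLimit: 200Mi    # cap on total download bandwidth for this daemon
  perPeerRateLimit: 100Mi  # cap on bandwidth per peer connection
upload:
  rateLimit: 100Mi         # cap on bandwidth served to other peers
```

If these limits are left unset or set high, the daemon should not be bandwidth-constrained, which is what the default values.yaml implies here.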
So, although I got an ideal result from Dragonfly v2.0.2-rc.9, I think that's because all of these dfdaemons download the image from the CDN instance, so the wait-failed-pieces delay stays small. When the number of nodes increases to hundreds, a lot of downloading will happen between dfdaemons, and the wait-failed-pieces delay may increase again. How can we decrease this delay?
Please try the latest version.
Bug report:
I deployed Dragonfly v2.0.2-rc.4 through Helm chart 0.5.38. While distributing an image to 16 nodes, the image download time was higher than before.
Previously, we ran Dragonfly v2.0.2-alpha.6 and the average image download time was 7.06s; now it is 8.98s.
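That regression works out to roughly a 27% increase in average download time, which can be checked quickly:

```python
# Compare the two reported average image download times.
before = 7.06  # seconds, Dragonfly v2.0.2-alpha.6
after = 8.98   # seconds, Dragonfly v2.0.2-rc.4

increase_pct = (after - before) / before * 100
print(f"average download time increased by {increase_pct:.1f}%")  # 27.2%
```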
I checked the logs and found that several dfdaemon instances had more than 2s of delay while waiting for failed pieces:
From 21:36:00.667 to 21:36:03.272, there was a 2.6s delay.
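The gap between those two log timestamps can be verified directly:

```python
from datetime import datetime

# Timestamps taken from the dfdaemon log lines quoted above.
fmt = "%H:%M:%S.%f"
start = datetime.strptime("21:36:00.667", fmt)
end = datetime.strptime("21:36:03.272", fmt)

delay = (end - start).total_seconds()
print(f"wait-failed-pieces delay: {delay:.3f}s")  # 2.605s
```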
Not all dfdaemon instances had the 2s delay; only 3 or 4 instances did. But most of the other instances downloaded the image from those delayed instances, so the average download time increased a lot.
Expected behavior:
Avoid this kind of delay and decrease the image download time.
How to reproduce it:
Use Helm chart 0.5.38 to deploy Dragonfly v2.0.2-rc.4, then distribute an image to several nodes.
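A deployment along these lines reproduces the setup (the repo URL is the Dragonfly Helm charts repo; the release name and namespace are assumptions, adjust for your cluster):

```shell
# Add the Dragonfly Helm repo and install chart version 0.5.38,
# which ships Dragonfly v2.0.2-rc.4.
helm repo add dragonfly https://dragonflyoss.github.io/helm-charts/
helm repo update
helm install dragonfly dragonfly/dragonfly \
  --version 0.5.38 \
  --namespace dragonfly-system --create-namespace
```

Then pull the same image on several nodes (e.g. via a DaemonSet) and compare download times in the dfdaemon logs.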
Environment:
OS (`uname -a`): Linux 3.10.0-1160.31.1.el7.x86_64