Randomize requeue time #312
Comments
Definitely makes sense to at least jitter the poll interval a bit. I wonder whether we should jitter it once for each controller at startup (such that they all poll at a predictable interval, but offset from each other), or just jitter each poll interval slightly.
May I pick this up?
Sure @eugercek, if you want to take a stab at this one, thank you for your willingness!
Sorry for the late response 😞 Which approach do we want to implement? Could you elaborate on that?
I'm stepping down due to lack of information. For anyone else brave enough to pick this up: I think the maintainers need to elaborate on the preferred solution and give some pointers into the codebase. 👋🏻
Hello everybody, I just saw this issue labeled as a good first issue. After reading up on the issue and the proposed solution, I wonder whether it would already be sufficient to introduce jitter here:
crossplane-runtime/pkg/reconciler/managed/reconciler.go Lines 516 to 520 in 492ad7c
Then it would be predictable for each individual controller. The alternative could be to kind of mirror the behaviour of the I think both options could be done, with the first one being slightly more convenient to implement. What would be a good value for the jitter? And if you think my proposal is fine, then I am also happy to quickly open a PR for that!
What problem are you facing?
With the current `PollInterval` system (at least with the defaults), every controller within a provider will get the same interval. This can be problematic as the number of controllers in a provider increases. At least after a controller restart, all resources will essentially be enqueued at the same time.
This is an example of what I'm talking about (interval is set to 5 minutes for most controllers):
rate(container_cpu_usage_seconds_total{container="provider-nine"}[1m])
This is our own internal provider, but I'm pretty sure a similar picture could be replicated with other Crossplane providers if all controllers get the same `PollInterval`.
How could Crossplane help solve your problem?
We could solve this problem outside of crossplane-runtime by randomizing the `PollInterval` a bit for every controller in our provider.
But I would prefer this to be handled by crossplane-runtime internally, as others might also run into this. One idea would be to add a (configurable) random factor to `RequeueAfter: r.pollInterval`. That random factor could also be zero by default in order to not break the current expectations.