Support scaleDownDelaySeconds & fast rollbacks with canary strategy #557
Comments
Here is a proposed syntax:

    spec:
      strategy:
        canary:
          # Duration after a completed upgrade, in which a fast-tracked rollback to the older ReplicaSet will occur.
          # If omitted, it will look to the value in scaleDownDelaySeconds, and then zero.
          rollbackWindow: 24h
          # Duration in seconds that an older ReplicaSet will remain scaled up after a completed upgrade.
          # Only applicable when trafficRouting is enabled.
          scaleDownDelaySeconds: 3600
          scaleDownDelayRevisionLimit: 2
          trafficRouting:
            smi: {}

Note that the above syntax would allow for independent control of the rollback window and scaleDownDelay. The rollbackWindow allows a fast-tracked rollback to occur even when the older ReplicaSet has been scaled down. In the above example, the rollout would keep two older ReplicaSets fully scaled (100%), each for a total of 1 hour. However, if the Rollout moved to the older ReplicaSet within 24 hours, it would skip analysis and steps, but would still have to take some time to scale up the older ReplicaSet.
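The fallback order described in the comments of the proposed syntax (rollbackWindow, then scaleDownDelaySeconds, then zero) could be sketched roughly as follows. This is only an illustration in Python, not the controller's code: the function names, the dict-shaped `canary` argument, the simplified duration parser, and the choice of a strict `<` comparison at the window boundary are all assumptions.

```python
from datetime import datetime, timedelta

# Seconds per unit for the simplified duration parser (assumption: only s/m/h suffixes).
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(value: str) -> timedelta:
    """Parse a simple duration such as '24h', '30m' or '45s' (hypothetical helper)."""
    return timedelta(seconds=int(value[:-1]) * _UNITS[value[-1]])


def effective_rollback_window(canary: dict) -> timedelta:
    """Resolve the window: rollbackWindow, else scaleDownDelaySeconds, else zero."""
    if canary.get("rollbackWindow") is not None:
        return parse_duration(canary["rollbackWindow"])
    if canary.get("scaleDownDelaySeconds") is not None:
        return timedelta(seconds=canary["scaleDownDelaySeconds"])
    return timedelta(0)


def is_fast_tracked(canary: dict, upgrade_completed_at: datetime, now: datetime) -> bool:
    """True if a move back to the older ReplicaSet falls inside the window."""
    return now - upgrade_completed_at < effective_rollback_window(canary)
```

With the values from the example (`rollbackWindow: 24h`, `scaleDownDelaySeconds: 3600`), a rollback two hours after the upgrade would be fast-tracked under this sketch even though the older ReplicaSet would already have been scaled down after one hour.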
I spawned #574 from this bug to introduce "rollback windows" to enable more control over fast rollbacks for both blue-green and canary. This bug will remain open to support
It's been more than 1.5 years. Any update on this? Thanks.
Praise for supporting this.
This issue is stale because it has been open 60 days with no activity.
unstale this please bot
Hello, is this still something that the Argo project plans on supporting?
The blue-green strategy has a neat feature in that when the rollout is moving to an older ReplicaSet which is still in its scaleDownDelaySeconds, we perform what is called a "fast-tracked rollback". With a fast-tracked rollback, the rollout will skip all steps, analysis, etc... This feature allows multiple, older versions of the blue-green stack to exist and still run, allowing a rollout to quickly update to a previous stack.
However, when using the canary strategy, the only time we perform a fast-tracked rollback is if the user re-applies a manifest which is equal to the stable pod spec and the rollout has not yet completed its upgrade. If an older pod spec is applied that is equal to a previous, scaled-down ReplicaSet, then the rollout will still go through its normal cycle of steps, analysis, etc...
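To make the difference between the two strategies concrete, the two fast-track conditions described so far can be written as a pair of predicates. This is a minimal sketch of the behavior as described in this issue, not the actual controller logic; the function names and parameters (pod template hashes, a `superseded_at` timestamp) are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def blue_green_fast_track(
    target_superseded_at: Optional[datetime],
    scale_down_delay_seconds: int,
    now: datetime,
) -> bool:
    """Blue-green: fast-track if the target ReplicaSet is still inside its scale-down delay."""
    if target_superseded_at is None:
        # The target was never superseded, so there is nothing to roll back to.
        return False
    return now < target_superseded_at + timedelta(seconds=scale_down_delay_seconds)


def canary_fast_track(desired_hash: str, stable_hash: str, upgrade_completed: bool) -> bool:
    """Canary (current behavior): fast-track only back to stable, and only mid-upgrade."""
    return desired_hash == stable_hash and not upgrade_completed
```

Under this sketch, re-applying an older pod spec after a completed canary upgrade always returns `False`, which is the gap this issue asks to close.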
I think the canary strategy should have the same functionality as blue-green, in that multiple older ReplicaSets could continue to run fully scaled for some user-defined period of time. If a Rollout spec is re-applied which is equal to one of the scaled ReplicaSets, we would also perform a fast-tracked rollback. Note that leaving older ReplicaSets scaled up would only work for mesh- and ingress-enabled canary, and not the weighted replica count canary, because allowing older stacks to remain up with the normal canary would mean that traffic would reach the older ReplicaSets.
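The restriction to mesh- and ingress-enabled canaries can be expressed as a small guard. The sketch below assumes the canary strategy is available as a plain dict with `trafficRouting` and `scaleDownDelaySeconds` keys; the function name is hypothetical and the real controller may decide this differently.

```python
def effective_scale_down_delay(canary: dict) -> int:
    """Return how long (in seconds) older ReplicaSets may stay scaled up.

    Without traffic routing, pods of older ReplicaSets would receive live
    traffic, so the delay is treated as zero (assumption for this sketch).
    """
    traffic_routing = canary.get("trafficRouting") or {}
    if not traffic_routing:
        return 0
    return int(canary.get("scaleDownDelaySeconds") or 0)
```

For a weighted replica count canary, this returns `0` regardless of the configured delay, so older ReplicaSets are scaled down immediately.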
In order to support fast-tracked rollback for the normal, weighted replica count canary, we could annotate a deadline on the older ReplicaSets, and if we find that we are moving to that older ReplicaSet within the deadline, we would skip all steps, analysis, etc...
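One way the deadline annotation idea might look is sketched below. The annotation key is purely hypothetical (no such key is defined in this issue), timestamps are stored as ISO 8601 strings, and the ReplicaSet's annotations are modeled as a plain dict rather than a Kubernetes object.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical annotation key, chosen only for this illustration.
DEADLINE_ANNOTATION = "example.com/fast-rollback-deadline"


def annotate_deadline(annotations: dict, completed_at: datetime, window: timedelta) -> dict:
    """Return a copy of the annotations with the fast-rollback deadline recorded."""
    updated = dict(annotations)
    updated[DEADLINE_ANNOTATION] = (completed_at + window).isoformat()
    return updated


def within_deadline(annotations: dict, now: datetime) -> bool:
    """True if the older ReplicaSet is still eligible for a fast-tracked rollback."""
    raw = annotations.get(DEADLINE_ANNOTATION)
    if raw is None:
        # No deadline recorded: go through the normal steps and analysis.
        return False
    return now <= datetime.fromisoformat(raw)
```

Because the deadline lives on the ReplicaSet itself, this check would not depend on the older ReplicaSet still being scaled up, which is what makes it usable for the weighted replica count canary.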