Support scaleDownDelaySeconds & fast rollbacks with canary strategy #557

jessesuen · 2020-06-26T09:16:22Z

The blue-green strategy has a neat feature: when the rollout moves to an older ReplicaSet that is still within its scaleDownDelaySeconds, we perform what is called a "fast-tracked rollback". With a fast-tracked rollback, the rollout skips all steps, analysis, etc. This feature allows multiple older versions of the blue-green stack to exist and keep running, so a rollout can quickly switch back to a previous stack.

However, when using the canary strategy, the only time we perform a fast-tracked rollback is when the user re-applies a manifest that is equal to the stable pod spec and the rollout has not yet completed its upgrade. If an older pod spec is applied that is equal to a previous, scaled-down ReplicaSet, the rollout still goes through its normal cycle of steps, analysis, etc.

I think the canary strategy should have the same functionality as blue-green: multiple older ReplicaSets could continue to run fully scaled for some user-defined period of time, and if a Rollout spec is re-applied that is equal to one of the still-scaled ReplicaSets, we would also perform a fast-tracked rollback. Note that leaving older ReplicaSets scaled up would only work for mesh- and ingress-enabled canary, not the weighted replica count canary, because allowing older stacks to remain up with the normal canary would mean that traffic would reach the older ReplicaSets.
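To make the last point concrete, here is a rough sketch of why the replica-count canary cannot keep old stacks up. It assumes traffic is spread evenly across all ready pods behind the Service, so each ReplicaSet's share is simply its pod count over the total. The function name and the ReplicaSet names are made up for illustration.

```python
from typing import Dict


def traffic_share_by_replicas(replicas: Dict[str, int]) -> Dict[str, float]:
    """Approximate per-ReplicaSet traffic share for a replica-count canary.

    Assumption: every pod behind the Service receives an equal share of
    requests, so a ReplicaSet's share is its pod count over the total.
    """
    total = sum(replicas.values())
    if total == 0:
        # Nothing is running, so nothing receives traffic.
        return {name: 0.0 for name in replicas}
    return {name: count / total for name, count in replicas.items()}


# An older ReplicaSet left fully scaled next to the stable one would
# take roughly half of the traffic under this assumption.
shares = traffic_share_by_replicas({"stable-v3": 10, "old-v2": 10})
print(shares)
```

With a mesh or ingress splitting traffic explicitly, the old pods can stay up while receiving no traffic, which is why the delay only makes sense with trafficRouting.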

To support fast-tracked rollback for the normal, weighted replica count canary, we could annotate a deadline on the older ReplicaSets. If we find that we are moving to that older ReplicaSet within the deadline, we would skip all steps, analysis, etc.
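A minimal sketch of the deadline-annotation idea, only to illustrate the check. The annotation key, the function names, and the choice to store the deadline as an ISO 8601 timestamp are all assumptions made here, not something the controller does today.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

# Hypothetical annotation key, invented for this sketch.
ROLLBACK_DEADLINE_ANNOTATION = "example.com/fast-rollback-deadline"


def stamp_deadline(annotations: Dict[str, str], completed_at: datetime,
                   window: timedelta) -> Dict[str, str]:
    """Return a copy of the annotations with the rollback deadline recorded."""
    updated = dict(annotations)
    updated[ROLLBACK_DEADLINE_ANNOTATION] = (completed_at + window).isoformat()
    return updated


def is_fast_tracked(annotations: Dict[str, str], now: datetime) -> bool:
    """True if the target ReplicaSet is still inside its rollback deadline."""
    raw = annotations.get(ROLLBACK_DEADLINE_ANNOTATION)
    if raw is None:
        # No deadline recorded: go through the normal steps and analysis.
        return False
    return now <= datetime.fromisoformat(raw)


completed = datetime(2020, 6, 26, 9, 0, tzinfo=timezone.utc)
old_rs_annotations = stamp_deadline({}, completed, timedelta(hours=24))
print(is_fast_tracked(old_rs_annotations, completed + timedelta(hours=23)))
```

Storing an absolute deadline, rather than a duration, means the check still works after a controller restart, since nothing has to remember when the upgrade completed.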

jessesuen · 2020-06-26T09:49:41Z

Here is a proposed syntax:

spec:
  strategy:
    canary:
      # Duration after a completed upgrade, in which a fast-tracked rollback to the older ReplicaSet will occur.
      # If omitted, it will look to the value in scaleDownDelaySeconds, and then zero.
      rollbackWindow: 24h

      # Duration in seconds that an older ReplicaSet will remain scaled up after a completed upgrade.
      # Only applicable when trafficRouting is enabled.
      scaleDownDelaySeconds: 3600
      scaleDownDelayRevisionLimit: 2
      trafficRouting:
        smi: {}

Note that the above syntax would allow independent control of the rollback window and the scale-down delay. The rollbackWindow allows a fast-tracked rollback to occur even when the older ReplicaSet has already been scaled down.

In the above example, the rollout would keep two older ReplicaSets fully scaled (100%), each for a total of 1 hour. If the Rollout moved back to an older ReplicaSet after that hour but within 24 hours, it would skip analysis and steps, but would still need some time to scale the older ReplicaSet back up.
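The interaction of the two settings can be sketched roughly as follows. This is an illustration of the proposed semantics only: the class and function names are invented, the window boundaries are treated as inclusive by assumption, and scaleDownDelayRevisionLimit is ignored.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class CanarySettings:
    """Hypothetical mirror of the proposed fields; None means omitted."""
    rollback_window: Optional[timedelta] = None
    scale_down_delay_seconds: Optional[int] = None


def effective_rollback_window(settings: CanarySettings) -> timedelta:
    """rollbackWindow if set, else scaleDownDelaySeconds, else zero."""
    if settings.rollback_window is not None:
        return settings.rollback_window
    if settings.scale_down_delay_seconds is not None:
        return timedelta(seconds=settings.scale_down_delay_seconds)
    return timedelta(0)


def classify_rollback(settings: CanarySettings, elapsed: timedelta) -> str:
    """Describe a move to an older ReplicaSet `elapsed` after the upgrade."""
    if elapsed > effective_rollback_window(settings):
        return "normal rollout"
    delay = timedelta(seconds=settings.scale_down_delay_seconds or 0)
    if elapsed <= delay:
        return "fast-tracked, already scaled up"
    return "fast-tracked, scale-up required"


example = CanarySettings(rollback_window=timedelta(hours=24),
                         scale_down_delay_seconds=3600)
for hours in (0.5, 2, 25):
    print(hours, classify_rollback(example, timedelta(hours=hours)))
```

With the values from the example, a rollback after 30 minutes hits a ReplicaSet that is still up, one after 2 hours skips steps but waits for scale-up, and one after 25 hours runs the full canary.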

jessesuen · 2020-07-08T00:36:25Z

I spawned #574 from this bug to introduce "rollback windows" to enable more control over fast rollbacks for both blue-green and canary. This bug will remain open to support scaleDownDelaySeconds in the canary strategy.

lapwingcloud · 2022-03-17T02:12:48Z

It's been more than 1.5 years. Any update on this? Thanks.

awx-fuyuanchu · 2022-09-16T09:34:37Z

Praise for supporting this.

github-actions · 2022-12-07T02:41:19Z

This issue is stale because it has been open 60 days with no activity.

lapwingcloud · 2022-12-07T09:36:59Z

unstale this please bot

nebojsa-prodana · 2024-05-13T08:30:49Z

Hello, is this still something that the Argo project plans on supporting?

jessesuen added the enhancement (New feature or request), canary (Canary related issue), and traffic-routing labels Jun 26, 2020

jessesuen changed the title ~~scaleDownDelay for canary strategy (with traffic routing)~~ Fast-tracked rollbacks with canary strategy Jun 26, 2020

jessesuen mentioned this issue Jul 8, 2020

Option to skip rollout on helm rollback #537

Closed

jessesuen changed the title ~~Fast-tracked rollbacks with canary strategy~~ More controls on fast-tracked rollbacks Jul 8, 2020

jessesuen mentioned this issue Jul 8, 2020

Rollback windows for fast-tracked rollbacks #574

Closed

jessesuen changed the title ~~More controls on fast-tracked rollbacks~~ Support scaleDownDelaySeconds & fast rollbacks with canary strategy Jul 8, 2020

dthomson25 added this to the v0.10 milestone Jul 20, 2020

jessesuen modified the milestones: v0.10, v0.11 Nov 3, 2020

jessesuen modified the milestones: v1.0, v1.1 Jan 4, 2021

jessesuen removed this from the v1.1 milestone May 14, 2021

github-actions bot added the no-issue-activity label Dec 7, 2022

github-actions bot removed the no-issue-activity label Dec 8, 2022
