Add mention about achieving zero-downtime rolling updates #912

Closed
wants to merge 1 commit into from

tyranron

This PR adds a mention of how to achieve zero-downtime rolling updates to the Nginx Ingress Controller docs.

Based on the investigation in #322.

@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please email the CNCF helpdesk: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Jun 27, 2017
@k8s-reviewable

This change is Reviewable

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jun 27, 2017
@coveralls

Coverage Status

Coverage remained the same at 44.239% when pulling e16efc0 on tyranron:nginx-rolling-restart-docs into 1468fcb on kubernetes:master.

@aledbf
Member

aledbf commented Jun 28, 2017

@tyranron I am not sure this is good advice. If you use liveness and readiness probes plus a rolling update strategy in the Deployment, you don't need to use the hook:

spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
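
For context, a fuller sketch of how probes and this strategy might fit together in one Deployment. This is an illustration under assumptions, not the manifest from the PR: the `web` name, the `example/web:1.0` image, port `8080`, the `/healthz` path, and all probe timings are hypothetical, and `apps/v1` is assumed as the API version.

```yaml
# Sketch only: names, image, port, probe path, and timings are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # allow one extra Pod above the replica count during a rollout
      maxUnavailable: 0  # never drop below the desired replica count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example/web:1.0
          ports:
            - containerPort: 8080
          readinessProbe:   # gates whether the Pod is listed as a ready endpoint
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
          livenessProbe:    # restarts the container if it stops responding
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
```

With `maxUnavailable: 0`, an old Pod is only taken down once a replacement has passed its readiness probe. This controls when new Pods start receiving traffic; it does not by itself control how quickly a terminating Pod stops receiving it.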

@kfox1111

As I understand it, the concern is how long it takes nginx-ingress to notice when a pod becomes unready. How does kube-proxy handle it? A pod can go unready for reasons other than deletion.

@aledbf
Member

aledbf commented Jun 28, 2017

@kfox1111 good question. Please check this comment kubernetes-retired/contrib#1140 (comment) (and the next ones from thockin)

@tyranron
Author

@aledbf that just does not work in most cases. It often works for simple single-container Pods, but for multi-container Pods it rarely works. The reasons are discussed in the very issue you referred to above.

To recap: a Pod can still receive new traffic after receiving SIGTERM, because in Kubernetes SIGTERM means not just "terminate gracefully", but "wait a sec, and then terminate gracefully". Most applications do not behave that way out of the box.

I understand that this solution is not ideal and looks quite strange. But this simple hack, proposed by @foxylion in #322, just works in almost all cases. Even @thockin agreed with that to some extent.
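
To make the multi-container case concrete, here is a minimal sketch of a Pod template fragment with the hook on each container. The container names and images are hypothetical, and the 15-second delay is taken from the snippet under review rather than being a recommended value. Lifecycle hooks are defined per container, so a container without its own `preStop` gets SIGTERM without any delay.

```yaml
# Sketch only: container names and images are hypothetical.
spec:
  template:
    spec:
      containers:
        - name: app
          image: example/app:1.0
          lifecycle:
            preStop:
              exec:
                # Delay SIGTERM so endpoint removal has time to propagate first.
                command: ["sleep", "15"]
        - name: sidecar
          image: example/sidecar:1.0
          lifecycle:
            preStop:
              exec:
                # Same delay, so the sidecar keeps serving as long as the app does.
                command: ["sleep", "15"]
```

This assumes each image ships a `sleep` binary; an `exec` hook runs inside the container, so it fails if the command is missing.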

@tyranron
Author

Yet another investigation/confirmation of the situation in a related thread.

lifecycle:
preStop:
exec:
command: ["sleep, "15"]

There's a typo here: the closing " after sleep is missing.
This should be:

        lifecycle:
          preStop:
            exec:
              command: ["sleep", "15"]

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2018
@rikatz
Contributor

rikatz commented Feb 14, 2018

@aledbf We use the same strategy mentioned here, as there are some (ugly) Java processes that are pretty damn slow to stop.

So using a preStop command worked fine for us, as the Pod is marked as 'stopping' and is removed from the Service before it fully stops.

IMHO it's good advice :)
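
For slow-stopping processes like these, the grace period matters as well, because the termination grace period countdown includes the time spent in the preStop hook. A minimal sketch, assuming a hypothetical `java-app` container that needs roughly 40 seconds to shut down after SIGTERM; the name, image, and all timings are illustrative only.

```yaml
# Sketch only: the name, image, and timings are hypothetical.
spec:
  template:
    spec:
      # The grace period covers the preStop hook plus the shutdown itself:
      # 15s of sleep + roughly 40s for the JVM to stop, with a little headroom.
      terminationGracePeriodSeconds: 60
      containers:
        - name: java-app
          image: example/java-app:1.0
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]
```

If the grace period is left at the default of 30 seconds, the sleep uses up half of it and a slow process may be killed before it finishes shutting down.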

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 17, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@tyranron tyranron deleted the nginx-rolling-restart-docs branch April 16, 2018 07:26
@stealthybox
Member

Did we ever update the docs with this information?
It's good advice.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

10 participants