Renewal check causes excessive CPU usage #110

Closed
gohai opened this issue Dec 14, 2017 · 9 comments

@gohai
Contributor

gohai commented Dec 14, 2017

We're using lua-resty-auto-ssl to provide HTTPS for a sizable number of domains.

In production we're seeing patterns of elevated CPU utilization on our server instances, roughly every 24 hours, which correlates with a large number of accesses (GET and SET) into the Redis key-value store used as backend for storing the certificates.

Any idea what is causing this? Is nginx perhaps recycling workers after a while, at which point the shdict needs to be rebuilt? (But why the SETs?)

@luto
Collaborator

luto commented Dec 14, 2017

By default, the renewal check runs every 24 hours, which is probably what causes your high load. To run it less often, adjust the renew_check_interval setting (default: 86400 seconds). That said, I am not sure whether the load you are seeing is expected. How many different certificates/domains do you serve?

@gohai
Contributor Author

gohai commented Dec 14, 2017

@luto I see 19645 keys in my Redis, so roughly 9822 domains? (assuming one key with :timestamp and one with :latest per domain)

I'll have a look at the renewal code tomorrow, thanks for pointing this out. The spike we're seeing is roughly 50 minutes long, where CPU utilization goes from ca. 2 to 45 percent. Redis accesses go up from ca. 50 to 1500 per minute. (We have two instances connected to the same Redis store.) Do those numbers sound reasonable?

@luto
Collaborator

luto commented Dec 14, 2017

Those numbers look a little high to me. But the instances we're currently running are in the 50-300 domains range. @GUI (the maintainer of this project) probably has a larger setup to compare some numbers.

Note that I have never worked with the renewal code myself. But since this is, afaik, the only thing running daily, it may very well be the culprit. Doing some back-of-the-envelope calculations with your numbers, I suspect that there is still potential for optimization.
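
The back-of-the-envelope calculation mentioned above can be sketched roughly as follows. This is only an illustration using the figures quoted in this thread. The two-keys-per-domain layout is gohai's own assumption, and it is also assumed here that the quoted Redis access rates are totals across both instances. All function and variable names are made up for this example.

```python
# Figures quoted in the thread (not measured by this script).
REDIS_KEYS = 19645          # keys gohai sees in Redis
KEYS_PER_DOMAIN = 2         # assumption: one :timestamp and one :latest key per domain
SPIKE_MINUTES = 50          # approximate length of the daily spike
BASELINE_PER_MINUTE = 50    # approximate Redis accesses per minute outside the spike
SPIKE_PER_MINUTE = 1500     # approximate Redis accesses per minute during the spike


def estimate_domains(keys, keys_per_domain):
    """Estimate the number of domains from the Redis key count."""
    return keys / keys_per_domain


def extra_accesses(spike_rate, baseline_rate, minutes):
    """Redis accesses attributable to the spike, above the normal baseline."""
    return (spike_rate - baseline_rate) * minutes


domains = estimate_domains(REDIS_KEYS, KEYS_PER_DOMAIN)
extra = extra_accesses(SPIKE_PER_MINUTE, BASELINE_PER_MINUTE, SPIKE_MINUTES)
per_domain = extra / domains
domains_per_minute = domains / SPIKE_MINUTES

print(f"estimated domains:             {domains:.0f}")
print(f"extra Redis accesses in spike: {extra}")
print(f"extra accesses per domain:     {per_domain:.2f}")
print(f"domains checked per minute:    {domains_per_minute:.0f}")
```

Under these assumptions, the spike works out to a handful of Redis accesses per domain, spread over roughly 50 minutes.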

@brianlund
Contributor

If you want to confirm it's the renewals, it's pretty easy to just set the check to run every second day or so and see if the spikes move with it. I ran some tests with 30,000 domains (we're planning to eventually run it with all of our 100,000+ domains), and 50 minutes sounds reasonable for 10,000 domains. We ended up with some forked code (https://github.com/simplesite/lua-resty-auto-ssl, thanks to https://github.com/ryokdy) that limits the renewal checks, as we saw they would take well over 10 hours in our case.
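
To put the domain counts above in perspective, here is a naive linear extrapolation from the one data point in this thread (about 50 minutes for about 10,000 domains). The linear scaling is an assumption made purely for illustration, and the helper name is hypothetical. The thread reports well over 10 hours for the larger setup, so the linear figure should be read as a rough estimate only.

```python
def linear_check_minutes(domains, ref_domains=10_000, ref_minutes=50.0):
    """Scale the reference duration linearly with the number of domains.

    Assumption: check time grows in proportion to the domain count,
    which the thread does not confirm.
    """
    return domains * ref_minutes / ref_domains


for count in (10_000, 30_000, 100_000):
    minutes = linear_check_minutes(count)
    print(f"{count:>7} domains: ~{minutes:.0f} min (~{minutes / 60:.1f} h)")
```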

@gohai
Contributor Author

gohai commented Dec 17, 2017

It is indeed the renewal check, thank you @luto @brianlund. I will have a look at the fork mentioned.

@gohai changed the title from "Elevated CPU usage every ~ 24h" to "Renewal check causes excessive CPU usage" Dec 17, 2017
@luto
Collaborator

luto commented Dec 17, 2017

Extracting the code out of that fork into a PR would be great 😅 seems quite useful.

Cookies for anyone who does it ;)

@brianlund
Contributor

Can't say no to cookies :) Picked out the code to store certificates and paired it with my own changes to the renewal job.

There is also code for deleting expired certificates; if this gets included, I'll create a PR for that.

@gohai
Contributor Author

gohai commented Dec 22, 2017

@brianlund Thank you for the PR! I'll try to test this in the new year.

@luto added the bug label Dec 25, 2017
brianlund added a commit to brianlund/lua-resty-auto-ssl that referenced this issue Jan 2, 2018
GUI added a commit that referenced this issue Jan 29, 2018
@GUI added this to the v0.12.0 milestone Jan 29, 2018
@GUI
Collaborator

GUI commented Feb 5, 2018

Should be fixed in the v0.12.0 release thanks to @brianlund's #111 PR. Thanks everyone!

@GUI closed this as completed Feb 5, 2018
4 participants