Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fall back to expired cached results when datasources error #19414

Closed
rarkins opened this issue Dec 15, 2022 · 11 comments · Fixed by #20795
Closed

Fall back to expired cached results when datasources error #19414

rarkins opened this issue Dec 15, 2022 · 11 comments · Fixed by #20795
Assignees
Labels
priority-2-high Bugs impacting wide number of users or very important features type:feature Feature (new functionality)

Comments

@rarkins
Copy link
Collaborator

rarkins commented Dec 15, 2022

What would you like Renovate to be able to do?

Reuse cached results instead of aborting runs when external datasource hosts fail

If you have any ideas on how this should be implemented, please tell us here.

  • Do not immediately expire cached datasource results. e.g. expire them in Redis at 10x the TTL we set
  • If the fresh lookup (e.g. to npmjs) succeeds, use it and update Redis
  • If results have passed their TTL, but then the "fresh" lookup fails, reuse the cached results. e.g. instead of an ExternalHostError for npm or Docker Hub, return the old results

If this needs to be implemented manager-by-manager then focusing on Docker Hub and npmjs would be best.

Is this a feature you are interested in implementing yourself?

Maybe

@rarkins rarkins added type:feature Feature (new functionality) priority-2-high Bugs impacting wide number of users or very important features status:ready labels Dec 15, 2022
@zharinov
Copy link
Collaborator

zharinov commented Dec 15, 2022

Can be implemented in the decorator level, but shouldn't be a default behavior since we're also using decorators for caching intermediate results.

@rarkins
Copy link
Collaborator Author

rarkins commented Dec 16, 2022

This should apply to any form of package cache, i.e. also including file-based.

We need to work out terminology for the concepts of:

  • The date at which result is stale and should be rechecked, but can be reused if necessary
  • The date at which result should be expunged and not reused

Redis uses the term TTL for the latter, so maybe we could use ttl for the hard limit and recheckAfter for the first.

Redis will take care of expunging records itself, but file cache does not. So file cache should delete the record and return empty result if the ttl has been reached.

@rarkins
Copy link
Collaborator Author

rarkins commented Dec 20, 2022

A related performance improvement: some datasources could be made more efficient if they were allowed to return metadata about the result, which is then passed back to it after soft expiry.

For example with npm, it could be as simple as returning the etag and/or last-modified headers. Then, these could be used for the requery to (hopefully) get back a 304 response instead of needing the full one.

❯ curl -I -H 'If-None-Match: "e7edea410bbe0a818a4c0dbf7b96484d"' https://registry.npmjs.org/@renovate/pep440
HTTP/2 304
date: Tue, 20 Dec 2022 08:37:49 GMT
cf-ray: 77c70ee96888991a-ARN
age: 287
cache-control: public, max-age=300
etag: "e7edea410bbe0a818a4c0dbf7b96484d"
last-modified: Wed, 06 Apr 2022 19:21:55 GMT
vary: Accept-Encoding
cf-cache-status: HIT
x-amz-replication-status: COMPLETED
server: cloudflare

@rarkins
Copy link
Collaborator Author

rarkins commented Dec 20, 2022

In theory each datasource could cache the metadata separately/independently however it's better to combine it so that:

  • one cache return trip instead of two
  • you need the cached results to return anyway, if you get a 304

Also, there may be some paginated datasources where having the existing results can mean reduced data fetching even when something has changed (e.g. filtering by last updated more recently than the last fetch)

@rarkins rarkins changed the title Fall back to expired cached datasource results instead of ExternalHostError Fall back to expired cached results when datasources error Jan 13, 2023
@rarkins
Copy link
Collaborator Author

rarkins commented Jan 13, 2023

I've implemented this for npm, and would now like to add this partially for the cache decorator. To close this issue:

  • The current TTL used should be considered a "soft" expire time, and the actual cache expire should be set to 24h
  • If the decorator gets back a soft expired result, it should still call the datasource
  • If the datasource returns an error, but a soft expired result exists, then the decorator should log a warning and return the soft expired result

There might be some cases where we should also replace a result of null but for now let's just stick with when the datasource throws.

@viceice
Copy link
Member

viceice commented Jan 13, 2023

Idea for decorator:

  • check cache
  • if has cache and no etag, return cache (old cache or not yet hard expired)
  • if cache has etag and not soft-expired, return cache
  • if cache has etag and is soft-expired, call fetcher with etag
  • if fetcher returns etag update cache soft-expire and return cached value
  • call fetcher
  • if result has etag, do softexpire and store long term, else use current short hard expire for cache
  • return result

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 13, 2023

So if there's an etag, never consider the cache "hot" and return data without checking?

@viceice
Copy link
Member

viceice commented Jan 13, 2023

no, updated idea.

@zharinov @JamieMagee Any idea how to pass the etag to the callback inside the cache decorator?

@JamieMagee
Copy link
Contributor

You'd have to refactor out the etag and make it available from a parameter or static function call. Those are the only things accessible to a decorator.

@zharinov zharinov assigned zharinov and unassigned zharinov Jan 15, 2023
@rarkins
Copy link
Collaborator Author

rarkins commented Feb 21, 2023

Let's set aside the performance improvement related to etag and just get the error fallback working with the decorator if possible

@renovate-release
Copy link
Collaborator

🎉 This issue has been resolved in version 35.19.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
priority-2-high Bugs impacting wide number of users or very important features type:feature Feature (new functionality)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants