Thanks to @shawnbot and @ultrasaurus for bringing this to our attention on College Scorecard. After debugging with them, we unfortunately uncovered a rather nasty bug in how our caching layer interacts with certain API setups. Here's what's happening:
Let's say you have 2 API backends with the following setup:
Staging API Backend:
Frontend Host: api.data.gov
Frontend URL Prefix: /staging/
Backend Host: my-staging-server.example.com
Backend URL Prefix: /
Production API Backend:
Frontend Host: api.data.gov
Frontend URL Prefix: /production/
Backend Host: my-prod-server.example.com
Backend URL Prefix: /
If your APIs responded with cache control headers instructing api.data.gov to cache responses, then it's possible that otherwise identical requests to staging and production would get mixed up (for example, https://api.data.gov/staging/foo and https://api.data.gov/production/foo).
The problem is that by the time we perform the caching, we've already rewritten the URL with the backend URL prefix (so /staging/foo is rewritten to /foo). However, we are not taking the backend host into account at this phase, which is what causes collisions (since /production/foo is also rewritten to the exact same /foo). So this could cause unexpected caching collisions for any API backends that allow caching and use the exact same URL structure as other API backends.
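To make the collision concrete, here is a minimal sketch of the rewrite step in Python. It is only an illustration under stated assumptions: the `BACKENDS` list, `rewrite`, and `buggy_cache_key` are hypothetical names that mirror the example setup above, not the actual api.data.gov code.

```python
# Hypothetical backend configs mirroring the staging/production example.
BACKENDS = [
    {
        "frontend_prefix": "/staging/",
        "backend_host": "my-staging-server.example.com",
        "backend_prefix": "/",
    },
    {
        "frontend_prefix": "/production/",
        "backend_host": "my-prod-server.example.com",
        "backend_prefix": "/",
    },
]


def rewrite(path):
    """Match a frontend path to a backend and swap in the backend URL prefix."""
    for backend in BACKENDS:
        prefix = backend["frontend_prefix"]
        if path.startswith(prefix):
            return backend, backend["backend_prefix"] + path[len(prefix):]
    raise LookupError("no backend matches " + path)


def buggy_cache_key(path):
    """Illustrates the bug: the key is the rewritten path only, ignoring the backend host."""
    _backend, rewritten = rewrite(path)
    return rewritten


# Both frontend URLs collapse to the same key, so their cached responses collide.
print(buggy_cache_key("/staging/foo"))
print(buggy_cache_key("/production/foo"))
```

Both calls produce the key `/foo`, even though the requests are destined for different backend hosts.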
This was definitely an unfortunate and glaring oversight on our part. I believe this bug has been around for as long as we've been offering caching, so I'm baffled we hadn't run into it before. Not all API setups would be affected by this (it hinges on multiple backends sharing the same URL structure), but it's definitely possible people have been encountering this without realizing it (for example, staging and production results might have been mixed up, which might not be noticeable if the API and data are the same, but could prove much more problematic in other situations).
In addition to ensuring hosts are treated separately, this update also ensures all API backends are kept completely separate in the cache even if the backend host is the same (I don't think we had any instances of this, but it could technically happen if two backends shared the same hostname but were served from different HTTP ports).
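One way to picture this kind of separation is a cache key that carries a per-backend identifier along with the host, port, and rewritten path. The sketch below is an assumption made for illustration: the `cache_key` function, its parameters, and the key format are hypothetical and are not taken from the actual update.

```python
def cache_key(backend_id, backend_host, backend_port, rewritten_path):
    """Build a cache key scoped to a single API backend (hypothetical format).

    Including backend_id keeps two backends apart even when they share
    the same host and port; host and port are kept for readability.
    """
    return f"{backend_id}|{backend_host}:{backend_port}|{rewritten_path}"


# Different hosts, same rewritten path: the keys no longer collide.
staging = cache_key("staging", "my-staging-server.example.com", 80, "/foo")
production = cache_key("production", "my-prod-server.example.com", 80, "/foo")
print(staging != production)

# Same hostname on different ports: still distinct keys.
port_a = cache_key("backend-a", "shared.example.com", 8080, "/foo")
port_b = cache_key("backend-b", "shared.example.com", 8081, "/foo")
print(port_a != port_b)
```

The design choice here is to scope the key by backend rather than by host alone, so that any two backends stay separate regardless of how similar their host settings are.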