Handling Partial Content / 206 #144
Some discussion at TPAC. There are many use cases for range requests; an initial approach might be to identify the bare minimum and support only those. E.g.,
(0) and (1) are a straw-man target for now. Thoughts? |
So from those:
Here are some things I'm wondering about:
I mean the HTTP cache, not the SW cache. Or do you mean it's effectively an implementation decision? |
Ah I see. Maybe we should talk about that too. |
Any hope that this will be implemented, in terms of being able to do it from a SW cache? It certainly would help with a problem I currently have with the Audio element and range headers. |
I need to do some research to see what browsers do today vs the spec, and decide whether the service worker spec needs changes or browsers need to fix bugs. In the meantime, check out https://samdutton.github.io/samples/service-worker/prefetch-video/ - it's a bit hacky, but it works. It constructs ranged responses on the fly. |
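For readers wondering what constructing a ranged response on the fly roughly involves, here is a minimal TypeScript sketch. It is not the code from the linked sample: `sliceForRange` and `RangedResult` are hypothetical names, and the sketch assumes the full resource is already available as bytes. It only handles a single `bytes=start-end` or `bytes=start-` range; anything else returns `null` so the caller can fall back to the full 200 response.

```typescript
// Hypothetical shape for the pieces of a ranged reply (not a platform type).
interface RangedResult {
  status: 206 | 416;
  headers: Record<string, string>;
  body: Uint8Array;
}

// Build the status, headers and body for a single-range request against
// a resource that is fully available in memory.
function sliceForRange(full: Uint8Array, rangeHeader: string): RangedResult | null {
  const match = /^bytes=(\d+)-(\d*)$/.exec(rangeHeader.trim());
  // Unsupported or malformed header: ignore it and let the caller send a 200.
  if (match === null) return null;

  const size = full.byteLength;
  const start = Number(match[1]);
  const hasEnd = match[2] !== "";
  // An omitted end position means "up to the last byte".
  const requestedEnd = hasEnd ? Number(match[2]) : size - 1;

  // An end before the start makes the range invalid, so it is ignored too.
  if (hasEnd && requestedEnd < start) return null;

  // A start beyond the resource cannot be satisfied.
  if (start >= size) {
    return {
      status: 416,
      headers: { "Content-Range": `bytes */${size}` },
      body: new Uint8Array(0),
    };
  }

  // Clamp the end to the last available byte.
  const end = Math.min(requestedEnd, size - 1);
  const body = full.slice(start, end + 1);
  return {
    status: 206,
    headers: {
      "Content-Range": `bytes ${start}-${end}/${size}`,
      "Content-Length": String(body.byteLength),
    },
    body,
  };
}
```

In a service worker, the returned status, headers and body could be handed to the `Response` constructor inside a `fetch` handler.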
Thanks for the link, the workaround worked for my audio problem too. |
So, I did the research on range requests. It's a mess. https://docs.google.com/document/d/1SphP-WNxqzZrSv_6ApC9_FpM-m_tLzm57oL3SNGg-40/edit#heading=h.1k8r6xdc6vfo I think fetch should say something about the minimum browsers are required to support here. |
Also, unless I'm missing something, fetch should also create a network error if a 206 is returned when no range was requested. |
@jakearchibald if you have time to work on this (or can find someone else) that would be great. I'm kinda swamped. |
I'll talk to our media teams and see what their expectations are and get as far as I can. |
FWIW, I've written some basic tests to see if the HTTP (!SW) cache that Fetch uses supports partial content (i.e., range requests). AFAICT, the only browser that does anything is Safari TP, which will store a partial response (i.e., one whose request had a It will not serve a subset of an already-cached response to a request containing Firefox and Chrome don't even handle the simple case (again, according to my test; I talked to @mcmanus about this this AM, and he was a bit surprised to hear that). Edge doesn't support the HTTP cache from Fetch, AFAICT. |
In particular:
* Be more specific about terminology
* Detail more clearly how requests are to be modified
Tests: web-platform-tests/wpt#5137. During review we decided to postpone #144 (poorly implemented if at all) and #307 (also poorly implemented despite security implications). Fixes #336 and fixes #373.
I've been thinking about this again, here's a brain dump:
I don't think we do range requests for anything but media elements, though that might have changed recently for image elements now that I think of it? |
There's downloads, which I guess are outside of spec land. We want to use them in background fetch. |
Downloads are somewhat defined, as they interact with navigation and |
I pitched the following to our security team:
My assumption is that range requests are only used by media elements and downloads (which I'll need to verify). This means you can't interpret a portion of a resource as script/css/etc. Additionally, APIs consuming ranged responses should ensure all parts of a range have the same first entry in the response url list for a given resource.
However, there's still a worry that this new capability carries significant risk, and that we should look for another way forward.
The alternative solution is to find a way to mark a request as "allowed privileged headers", and allow the Range header in that case. Modifying the request in any way would remove the "allowed privileged headers" flag, meaning you couldn't take an internally-created Range request & change the URL & make the request, but you could do fetch(fetchEvent.request) if it had a Range header. This means
In addition to this, we should still:
Additionally, APIs consuming ranged responses should ensure all parts of a range have the same first entry in the response url list for a given resource. |
Just a note here on
We (Facebook) are actually using this currently on XHR as a feature to circumvent CORS preflight request requirements for some cross-origin requests. We add the range as query string parameters and have the cross-origin server interpret them as equivalent to Range headers. We are currently transitioning some of these things to fetch, and breaking this would be a big problem for us. Adding Range to the header safe-list, however, would probably meet our requirements and in fact would likely be the superior solution anyway, since it will allow us to fully remain within HTTP semantics and allow for better caching at various levels. |
@DanielBaulig thanks! If we shipped what I proposed it'd have broken XHR too, since XHR uses fetch. Is the server sending a 206 status in this case? I guess you're using this in a super-safe way that doesn't allow the original content to be interpreted as script? |
@jakearchibald I thought it did, but I just double checked and it actually doesn't return a 206, but a 200, so there shouldn't be any acute breakage. We are only applying this to media content fetched through XHR/fetch and currently always have well-aligned requests to the same byte ranges, meaning the byte ranges of two requests will always either be equal or exclude each other. There will never be partial overlaps, so we shouldn't have any cacheability regressions. That said, the reason I stumbled on this issue thread in the first place is that we are looking into changing this and using less fixed and well-aligned byte ranges that could end up (partially) overlapping. Since that would break caching (the query string parameters are included in the browser's cache keys; if they are not identical, there won't be a cache hit), we were looking into implementing caching for these requests in SW by breaking the byte range query parameters out of the cache key for these requests, which then led me to this thread and also to my reply on this W3C SW thread regarding Cache API supporting Range requests and responses. Since we are not actually returning 206 though, we should not be seeing any problems from this proposal being implemented. My bad. |
Still, I think we should test what browsers actually do when the server returns a 206. If browsers just return it as-is I don't think we should change the behavior into returning a network error. It's not worth the risk. |
@DanielBaulig would you mind showing an example of such an artificial range parameter? How could I test the behavior you described? |
Seconded, it'd be interesting to know which Facebook URLs support this. |
All browsers seem to allow a 206 partial response for script elements. As in, it will execute the body of the response, as if it was a 200. Chrome security were a little worried about this, and were keen on making this an error. But yeah, we'd need to do a lot of testing. |
@sirdarckcat We use bytestart, byteend query string parameters for video playback on Facebook to deal with CORS restrictions. If we added normal Range headers we would have to do preflight requests to resources served from our CDN origin. |
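To make the cache-key idea discussed above concrete, here is a minimal TypeScript sketch of breaking the range parameters out of a URL. Only the `bytestart` and `byteend` parameter names come from this thread; `splitRangeParams`, `RangeKey` and the example URLs are hypothetical, and this is not Facebook's implementation.

```typescript
// Hypothetical result type: a range-free cache key plus the requested range.
interface RangeKey {
  cacheKey: string;
  start: number | null;
  end: number | null;
}

// Remove the range query parameters so that requests for different byte
// ranges of the same resource share one cache key.
function splitRangeParams(rawUrl: string): RangeKey {
  const url = new URL(rawUrl);

  // Read a non-negative integer parameter, or null if absent or malformed.
  const readInt = (name: string): number | null => {
    const value = url.searchParams.get(name);
    return value !== null && /^\d+$/.test(value) ? Number(value) : null;
  };

  const start = readInt("bytestart");
  const end = readInt("byteend");

  // All other query parameters stay part of the key.
  url.searchParams.delete("bytestart");
  url.searchParams.delete("byteend");

  return { cacheKey: url.toString(), start, end };
}
```

A service worker could then look up the stored full response under `cacheKey` and serve the bytes between `start` and `end` from it, so overlapping ranges no longer miss the cache.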
I'm going to take another swing at this. These cases should work:
Since the service worker can rewrite fetches, it opens up the following attacks:
Attack 1
A media element makes two requests:
Request: Resource A. No-cors. Cross-origin. Byte range 0-5000.
Request: Resource A. No-cors. Cross-origin. Byte range 200-5000.
In this case, resource A isn't a valid media resource, but its 200th byte is now leaked as
Solution: The media element must not allow a mixture of opaque and non-opaque responses for a given piece of media.
Attack 2
A media element makes two requests:
Request: Resource A. No-cors. Cross-origin. Byte range 0-5000.
Request: Resource A. No-cors. Cross-origin. Byte range 200-5000.
Again, the 200th byte is leaked.
Solution: If the media element receives opaque data, the last URL in each response's URL list must be identical. Looking at #145, it seems Chrome is fine as long as the responses are all the same origin. However, I'm worried about origin A redirecting/rewriting to lots of different places in origin B. I need to spec where a media element goes for the second part of some media, if the first part results in a redirect. Browsers behave differently here.
Attack 3
A media element makes two requests:
Request: Resource A. No-cors. Cross-origin. Byte range 0-5000.
Request: Resource A. No-cors. Cross-origin. Byte range 200-5000.
Again, the 200th byte is leaked.
Solution: In the first fetch, the start of the byte range returned does not match the start of the byte range requested. This should be rejected. We could reject this in the fetch spec, but I don't think we should block it for manual This should be blocked by the media element, but we could have a helper in the fetch spec to make this easier.
Attack 4
A script element makes a request:
Request: Resource A. No-cors. Cross-origin.
In this case resource A is an HTML resource like:
<p>Foo</p>
<script>const gender = 'female';</script>
<p>Bar</p>
…and the browser has been previously tricked (perhaps using a media element) into making a request for the range that contains
Given that the script element accepts partial responses, this is a tricky one. It seems that we want to continue to support the case where the server has, unprompted by the Range header, returned a partial response. The difference in this case is that the server was prompted.
Solution: The response needs to know if its associated request had a Range header. Fetch should reject if the original request did not have a Range header, but the service worker provides a response that is opaque, partial, and was requested with a Range header.
@annevk I'm interested in your thoughts on the solution for attack 4. |
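The three media element solutions (Attacks 1 to 3) can be read as checks over the parts received for one piece of media. The TypeScript sketch below is a toy model, not spec text: `MediaPart`, `lastUrl` and `findRejection` are hypothetical names, and the fields are plain data standing in for what a browser would know about each response. The comment above describes the Attack 3 mismatch for the first fetch; applying the same check to every part is an assumption made here.

```typescript
// Hypothetical summary of one ranged response received by a media element.
interface MediaPart {
  opaque: boolean;        // true for a no-cors cross-origin response
  urlList: string[];      // response URL list, in redirect order
  requestedStart: number; // first byte asked for in the Range header
  returnedStart: number;  // first byte reported by Content-Range
}

function lastUrl(part: MediaPart): string | undefined {
  return part.urlList[part.urlList.length - 1];
}

// Returns a reason to reject the media, or null if all parts are consistent.
function findRejection(parts: MediaPart[]): string | null {
  const first = parts[0];
  if (first === undefined) return null;

  for (const part of parts) {
    // Attack 1: never mix opaque and non-opaque responses.
    if (part.opaque !== first.opaque) {
      return "mixed opaque and non-opaque responses";
    }
    // Attack 2: opaque parts must share the same last URL.
    if (part.opaque && lastUrl(part) !== lastUrl(first)) {
      return "opaque responses came from different URLs";
    }
    // Attack 3: the returned range must start where the request asked.
    if (part.returnedStart !== part.requestedStart) {
      return "returned range start does not match requested start";
    }
  }
  return null;
}
```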
(I chatted about this with @annevk on IRC & he's happy with the attack 4 solution) |
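As a reading aid, the Attack 4 rule can be written as a single predicate. This TypeScript sketch uses hypothetical names (`ServiceWorkerResponseFacts`, `violatesAttack4Rule`) and treats "partial" as status 206, which is an assumption; it models the condition described in the comment above rather than the eventual Fetch algorithm.

```typescript
// Hypothetical facts about a response handed back by a service worker.
interface ServiceWorkerResponseFacts {
  opaque: boolean;             // no-cors cross-origin response
  status: number;              // HTTP status of the response
  requestedWithRange: boolean; // the request that produced it had a Range header
}

// True when fetch should reject: the page did not ask for a range, but the
// service worker supplied an opaque partial response that was obtained by
// asking for one.
function violatesAttack4Rule(
  originalRequestHadRange: boolean,
  response: ServiceWorkerResponseFacts,
): boolean {
  const partial = response.status === 206; // assumption: partial means 206
  return (
    !originalRequestHadRange &&
    response.opaque &&
    partial &&
    response.requestedWithRange
  );
}
```

Note that an opaque 206 the server sent unprompted (no Range header on the request that produced it) still passes, which matches the wish to keep supporting that case.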
I'm a little concerned with
from Attack 2. In particular if that's what we want to do in the face of redirects. If anything we probably want to compare the last URL. (Safer would be the whole list, but unfortunately no-cors cross-origin to same-origin is already non-opaque.) |
My intent is to ensure that all the requests have gone to the same place, as in, the service worker hasn't tried to combine multiple sources. If the server wants to redirect each request, that seems weird but fine. Although I'll test what browsers actually do in this case. |
@annevk ahh yes, I get it now. I've updated the solution to Attack 2 to be last url in the list. |
@jakearchibald Would it be possible to copy-and-paste your security analysis in #144 (comment) to a more official-sounding URL? It would be nice if it was linked from the Fetch standard to make it discoverable. It would be nice if it said something about the threat model, but that might be infeasible if it came down to "the threat model of the whole web platform". |
@ricea I was going to leave notes in the spec next to the solutions. Does that work? The solution for Attack 4 will be in the fetch spec, but the others will be in the HTML spec. Is this what you meant? If not, I'm happy to put the text somewhere else, I'm just not sure where. |
@jakearchibald SGTM. My concern is just to make sure it is discoverable and gets the widest possible review. |
This is part of #144. The aim is to allow APIs to use the range header for no-cors requests, and allow them to pass through a service worker, but disallow modification of these requests, and disallow developers creating their own no-cors ranged requests. Tests: web-platform-tests/wpt#10348.
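The aim stated above can be illustrated with a small toy model in TypeScript. This is not the Fetch algorithm: `NoCorsModelRequest`, `withoutRange`, `createdByDeveloper` and `withUrl` are hypothetical names, headers are a plain object, and silently dropping the Range header (rather than throwing) is a choice made only for this sketch.

```typescript
// Toy model of a "no-cors" request (not a platform type).
interface NoCorsModelRequest {
  url: string;
  headers: Record<string, string>;
  browserCreated: boolean; // true only for requests made internally, e.g. by a media element
}

// Copy the headers, leaving out Range (matched case-insensitively).
function withoutRange(headers: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (value !== undefined && name.toLowerCase() !== "range") {
      result[name] = value;
    }
  }
  return result;
}

// Developers cannot create their own "no-cors" ranged requests.
function createdByDeveloper(url: string, headers: Record<string, string>): NoCorsModelRequest {
  return { url, headers: withoutRange(headers), browserCreated: false };
}

// Modifying a request (here: changing its URL) loses the Range header,
// even if the original was created by the browser.
function withUrl(request: NoCorsModelRequest, url: string): NoCorsModelRequest {
  return { url, headers: withoutRange(request.headers), browserCreated: false };
}
```

Passing a browser-created request through a service worker unchanged, as in `fetch(event.request)`, corresponds to using the original object as-is, so its Range header survives.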
See new PR: whatwg/html#7655. Needs some more work and lots of WPT but ready for initial feedback. |
whatwg/html#7655 was merged but this is still open. Should this be closed? |
I think that did solve what browsers do with media, but overall range support is still bad so we should probably keep this open to keep us honest. |
Discussed in #97 - breaking out into a separate issue.