preload, destinations, and module scripts #486
Comments
I strongly prefer (1) as I think (2) can cause serious confusion, even if we'd find a way to cleanly spec it. One example where I think it won't be trivial to explain why |
I disagree on the explanatory issues, but setting that aside for now, why is type used and encouraged there when the preload spec does not use type="" at all? Are implementations doing nonstandard processing of the type="" attribute that is not in the spec? |
Hmm, you're right that type is missing from the "obtain the preload resource" processing model (It is present in "appropriated times" processing.) |
How would the (The real problem here is that JavaScript modules don't have their own MIME type. If you want different parsing, you should communicate that there.) |
@annevk - I think that varies based on the option we would choose to take. Current preload implementation (and spec's spirit, despite processing model omissions that @domenic pointed out) treats In the (2) option @domenic is proposing, |
Somewhat related: We're investigating caching parsed versions of scripts when they're added to the cache API & we've hit a similar problem. Our current plan is to parse as a script, then try parsing as a module if it bails. We wouldn't have to do this if the info was on the request object. But meh, we're just playing with ideas right now. |
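As a rough illustration of the "parse as a script, then try parsing as a module if it bails" plan above, here is a minimal TypeScript sketch. The `Parser` type and both parser arguments are hypothetical stand-ins, not a real engine API; the assumption is only that a parser throws when the source is invalid for its parse goal.

```typescript
// Hypothetical parser signature: returns some parsed representation,
// or throws when the source is not valid for that parse goal.
type Parser<T> = (source: string) => T;

interface ParsedScript<T> {
  kind: "classic" | "module";
  result: T;
}

// Try the classic-script goal first; fall back to the module goal on failure.
// Caveat: a source that is valid under both goals is always reported as
// "classic" here, which is exactly the ambiguity that putting the kind on
// the request object would remove.
function parseWithFallback<T>(
  source: string,
  parseClassic: Parser<T>,
  parseModule: Parser<T>,
): ParsedScript<T> {
  try {
    return { kind: "classic", result: parseClassic(source) };
  } catch {
    // The classic parse bailed (for example on import/export syntax).
    // If the module parse also throws, that error propagates to the caller.
    return { kind: "module", result: parseModule(source) };
  }
}
```

The cost of this approach is a wasted parse attempt for every module, which is the work that knowing the kind up front would avoid.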
@jakearchibald - IIUC, defining a |
@yoavweiss only if there was a way of creating a request object with that destination. |
Right... That seems like something we'd want to do anyway. Talking to @slightlyoff last week, he mentioned that currently there's no way for SW to fetch a script as a script. So in terms of network priority, an altered script request will have a significantly lower priority than an equivalent unaltered request, which isn't great. |
From what I remember the problem with that was actually making sure it is indeed used for the stated purpose. It's not entirely clear to me preload guarantees that at the moment, probably in part because the preload cache is not defined... Because if it's not used for the stated purpose, you can circumvent CSP to some extent, which uses these tags. |
Doesn't an empty destination let you circumvent CSP in similar ways? Preload's implementation in Blink/WebKit currently prevents that by using the same request pipelines that prevent such "type-mismatch reuse" on regular resource requests. |
Empty should result in the ultimate fallback of CSP, so no, I don't think so. And yeah, preload not being defined has been a known problem for a while, I wish it were fixed. (See also #354.) |
Oh, OK. I thought empty is subject to |
OK, so it sounds like there's more appetite for (1), a new destination. I can work on the appropriate spec PRs after we settle the remaining questions:
|
It still seems cleaner to give module scripts their own MIME type, especially for other parts of the ecosystem (e.g., what if you put them in the Cache API or generate them dynamically), but I guess we're not going to go over that again? |
Indeed. But it's kind of crucial for usefully preloading modules, which are expected to be deep graphs. If we don't add this to preload, then preload is largely useless for modules, and we'll need to invent a new way of preloading module graphs. Maybe that would be for the best though, because then we can avoid this whole "new destination" business?
How can this be true? Why do we have separate destinations for script, serviceworker, sharedworker, and worker if they can all be processed the same way?
No, I don't think so. |
@domenic I forgot that request destination and type are distinct for a reason. Ugh. However, if you need to preload a graph, perhaps something like rel=modulepreload is better. That still doesn't allow e.g. a service worker to just fetch a module script and the browser to start lazily compiling it though, so I still think we should consider the MIME type strongly. It's the only thing that's tightly coupled with the thing we care about, the response. |
I don't think it's feasible to change the MIME type; it means people won't be able to use JavaScript modules until they upgrade their server. In any case, let's keep that as a separate thread, if you want to continue pursuing it. rel=modulepreload sounds pretty good to me. It does lack an imperative API counterpart (in non-DOM contexts), but then, so does rel=preload, right? Since fetch() doesn't have the ability to set the destination? So there's no way with current technology to fetch a module script (or module script graph) and have the browser start lazily compiling. Same for any non-"" destination, really. I guess speccing rel=modulepreload is fairly simple: it just performs "fetch a module script graph" given a URL. If people are on board with that, I can do that pretty simply. Would love to hear @yoavweiss's thoughts. |
Wouldn't you also need to store the result of that fetch someplace? (The part of preload that isn't really defined.) |
I assumed the HTTP cache would suffice, but I guess there is some history here where it does not? |
Well, OK, I guess even if the HTTP cache doesn't suffice, the module map suffices. So it should be fine. |
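To make the "module map suffices" point concrete, here is a toy TypeScript model of a graph preload that stores its results in a map keyed by URL. It is a sketch only, not the spec's "fetch a module script graph" algorithm: `ModuleRecord`, `ModuleMap`, `Loader`, and `preloadModuleGraph` are hypothetical names, and the loader stands in for a network fetch plus a parse.

```typescript
// Toy stand-in for a fetched module: its URL plus the URLs it imports.
interface ModuleRecord {
  url: string;
  imports: string[];
}

// Toy module map: URL -> record. A plain object keeps the sketch dependency-free.
type ModuleMap = { [url: string]: ModuleRecord };

// Hypothetical loader; in a browser this would be a fetch followed by a parse.
type Loader = (url: string) => ModuleRecord;

// Load `url` and everything it transitively imports, recording each module once.
function preloadModuleGraph(url: string, moduleMap: ModuleMap, load: Loader): void {
  if (url in moduleMap) {
    return; // already present: the map itself acts as the cache
  }
  const record = load(url);
  // Store before recursing so that import cycles terminate.
  moduleMap[url] = record;
  for (const dep of record.imports) {
    preloadModuleGraph(dep, moduleMap, load);
  }
}
```

In this model a later module load that consults the same map finds every entry already present, so no separate preload cache is needed to hold the result.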
A couple more thoughts:
This leads me to the following proposal:
In this world, I'll work on modulepreload now, but still would love to hear more... |
This allows preloading module script graphs. The processing model for this turns out to be different enough that simply extending rel="preload" is not a good option. Closes whatwg/fetch#486.
Do we need the destination override given that https://github.com/dherman/esprit supposedly makes it possible to tell whether something is a module script or not? |
It seems arbitrary to add a new rel type just for modules. Why can't rel=preload as=modulescript work? Are we going to add rel=prefetchmodule as well? |
@annevk if that is the thing dherman was referencing in his email, it is a custom parser that is attempting to implement a proposal that has been rejected by TC39 a couple times (where you add
@rniwa the above discussion goes into some fairly extensive detail on why rel=preload as=modulescript can't work, but I guess it was not summarized nicely for newcomers. Let me try. Here are the main reasons why rel=preload as=modulescript doesn't work.
You might then say, why don't we just patch the preload spec to do an "if
We could. However, since prefetch is only about hinting to the browser to populate the HTTP cache, a better solution might be something that expands the hint to ask for all subresources, e.g. HTML documents with images and CSS files with |
Not expressing an opinion on this thread (yet anyway) but just a clarification:
No, the point is that it's an implementation technique that allows you to cache the work of parsing such that you can reuse the result regardless of whether the source ends up being used as a classic script or module script. IOW, it's an implementation technique that demonstrates the feasibility of a preload mechanism where the author doesn't specify what kind of JS payload is being parsed, but that gets all the same performance benefits since you can have a single pre-parser that works for any kind of JS payload. |
I guess it's a very silly question, but may I ask why this one is a requirement?
It seems to me, that in context of modules, preload is a remedy for waterfall problem, so ideal solution would be creating a flat list of all modules used in the app (as |
And especially once custom resolvers are considered - having the resolver called randomly and multiple times by the cache will break a lot of nice expectations one could otherwise have for these representing a direct tree load. |
To clarify further, |
+1. This seems like a prime example of perfect as the enemy of good. Now that browser support for modules is actually rolling out, it's becoming more important to be able to actually load them efficiently. Closure Library is currently working on integrating modules into our debug loader (in which case we already have a flat list of all recursive dependencies), but with zero support for preloading modules, there seems to be no way to interleave non-module execution between two modules without waiting for a bunch of serial fetches. Given that |
I expect it'd actually be easier to implement recursive semantics since then you could reuse the code already in place for Regardless, recursive semantics vs. not are a tiny, tiny detail in the overall tapestry of problems with naively integrating |
@domenic the argument there seems to jump to some conclusions already.
I don't see it as obvious that this is a necessity. Can a preload cache not be used by the module loading algorithm to populate the module map?
As discussed above, a simple solution may be to simply skip this. |
The problems in "There are three issues with that:", especially the first two, are IMO the most fatal. As for your suggestion to populate the preload cache and then later populate the module map, it's interesting, but I don't think we should keep modules in two separate in-memory caches. |
I see. My knowledge of the internals isn't great, but I assumed that if it was a straightforward preload cache using the same algorithm that would avoid those three issues you refer to? Multi-layering of caches happens naturally anytime caches are used. When hinging arguments on points like this it would help to flesh them out a little more I think. |
No; those issues are independent of the preload cache, and are about fetch's architecture (destinations). |
Ok then, moving to the next blocker it sounds like you're specifically referring to the messiness of varying Thanks for taking the time to explain - and please do point me to where this is discussed elsewhere if it is just rehashing. |
as=script does not preload classic worker scripts, indeed. You have to use as=worker. (Or as=sharedworker, as=serviceworker.) I don't think this is being discussed elsewhere; we're just assuming a decent amount of shared context on what is in the fetch spec (e.g. familiarity with the fetch destination concept), since this is the fetch spec repo. I'm OK taking the time to extract info from the spec into this issue thread as required. |
As far as I can tell from a quick glance at the spec it sounds like the distinction between |
Sure. It's still a problem though, even if you think it's a lower-priority problem. |
Certainly, but from the perspective of being messy I guess while it is from a spec perspective, it doesn't pass on as much cognitive overhead to the average user if |
It's just another argument for introducing a different rel, where we can reuse symmetric as= values instead of having a (small) combinatorial explosion. I'm unclear why you think that reusing link rel=preload is a good idea. Do you think it will be faster if we do? Definitely not; it means we'd have to do a lot more work to reconcile the two processing models. |
@domenic do you see the combinatorial possibilities extending beyond these four? Or can we agree that a unified I'm advocating rel=preload simply because the naive user-facing conceptual models of the web should embrace outward simplicity and reusability as far as possible to remain understandable to all, but I'm more than happy to step back on that if there are untenable spec differences here. Note also that I'm not trying to throw out the concept of a full graph preload entirely - simply to say that what is needed urgently is a spec that will allow a low-level network-layer caching of module resources. A separate rel for a graph-based preload could potentially be addressed later on - perhaps then more generally too in a way that applies similarly for css, picking up a truly unique semantic meaning then in the process. |
Yes; we just added service worker in the last year or so. The future of the web is long.
I'll try to be more clear. Breaking the mapping of destinations to as="" values is pretty bad no matter what. We have a serious conflict here between destinations as they're used in fetch (both classic and module scripts go through as="script"; both classic and module workers go through as="worker"; etc.), and as="" as it's used in preload (we need to distinguish between classic and module scripts/workers/etc.). We need some way to signal "destination=script but treat it as a module script". as=modulescript is a pretty bad way to do that because it breaks the correspondence between as="" and destination.
Kind of the point of my post above is that there are untenable spec, implementation, and mental model differences here. It seems you didn't find it convincing, so I think it's a good exercise to go through and help me expand on the points; maybe you can provide your own summary if I manage to convey enough information, and that will be more convincing to others. But it's my strong feeling that based on available evidence, the differences are too large.
Well, we already have that in HTTP/2. So I'm not sure how urgent it is. But yes, it'd be nice. Anyway, as I said above, I don't think recursive vs. not is a very interesting aspect of this whole discussion. I think it'll be about as easy to implement either one; maybe a bit easier to implement the recursive one. The hard parts are unrelated to recursive vs. not, but about fetch destinations, credentials modes, preload cache vs. module map, effect of document mutations, etc. Those are the parts that convinced me we need a separate rel="", for the separate processing model. |
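The as=""/destination conflict described above can be sketched in TypeScript. This is an illustrative model, not spec text: the four script-like destinations are the ones named in this thread, while the "module"-prefixed as="" values other than modulescript and moduleworker are hypothetical names made up for the enumeration.

```typescript
type Destination = "script" | "worker" | "sharedworker" | "serviceworker";
type ScriptKind = "classic" | "module";

const destinations: Destination[] = ["script", "worker", "sharedworker", "serviceworker"];
const kinds: ScriptKind[] = ["classic", "module"];

// Option A: encode the kind in as="". This needs one as="" value per
// (destination, kind) pair, so as="" no longer maps 1:1 to a destination.
function asValuesIfKindEncoded(): string[] {
  const values: string[] = [];
  for (const d of destinations) {
    for (const k of kinds) {
      // Hypothetical naming scheme: prefix the destination with "module".
      values.push(k === "classic" ? d : "module" + d);
    }
  }
  return values;
}

// Option B: keep as="" equal to the destination and carry the kind in rel="".
function relAndAs(d: Destination, k: ScriptKind): { rel: string; as: string } {
  return { rel: k === "classic" ? "preload" : "modulepreload", as: d };
}
```

Under option A, every new script-like destination adds two as="" values; under option B it adds one, and the as="" value always equals the destination.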
Any new top-level execution goal will likely use the same principles of a module graph at this point. If it is possible to unify on a module parse goal that distinguishes binary formats by header bytes (like wasm and ast binaries, and any future specs), then combinations can be avoided where the binary header space would become the new "version space" of web-based parse goals. If this route isn't taken, then any spec work should be designed to prepare for a much much larger combinatorial explosion here. So I'd argue this doesn't have to be the case if we can follow the first path above.
Ah, I wasn't aware of this, but of course it seems a result of the process. Would it be too late at this point to alter the destination names of module scripts to go through "modulescript" and "moduleworker" etc? Or would there be other concerns with such a change?
I will try to understand the fetch spec concerns a little better here. Perhaps it might help to start considering what issues might arise here to do with credentials modes that are unique to modules and not to As you know I'm arguing for preload cache over module map. Document mutation details seem to be fleshed out in the preload spec as well, so naively I would just assume reusing the terminology for things such as these would be beneficial. |
No, that's not correct. The preloading concerns for service worker vs. page vs. worker are orthogonal to module graph vs. classic script. They're largely about implementation-level things like prioritization and not really related to moduleness.
My understanding is that would break their integration into other aspects of the system which process scripts in the same way. Maybe it is changeable, but if the motivation is simply avoiding another rel="" value by introducing N new as="" values, it doesn't seem worth it.
Yes, that also is not a good decision, I believe. Compared to HTTP/2, the benefits are tiny, whereas if we can create something module-specific that takes care of all the module-related pieces for us and avoids loading the bytes into two separate caches, we're in much better shape. |
The cancellation time of HTTP/2 PUSH is a huge problem for avoiding cache redundancy. The best shape for the web is one where a module graph doesn't even have to make a request to the network - which is the goal I'm after here. We don't need network-based preload for this - we need a flat hinting scheme which preload can offer us. I don't actually care about the rel to be perfectly honest - the above is all I'm interested in. But please think ahead to the new parsing formats of the web if designing a new rel - that's the combinatorial explosion to worry about. |
To follow up on the above, I just wanted to illustrate here very briefly my exact argument, to be sure that nothing is being missed. The ideal optimization workflow I'd like to see for modules would be simply inlining the flat preloading and integrity information as a production optimization step:
<link rel="preload" href="/module-dep.js" integrity="..." />
<link rel="preload" href="/module-deep-dep.js" integrity="..." />
<script type="module" src="/module.js" integrity="..."></script>
I understand there are a lot of spec concerns here to do with the exact mechanics, but the overall workflow in the above is what I'd really love to see. |
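A build step producing the flat list in the workflow above might look roughly like the following TypeScript sketch. Everything here is hypothetical: the `DependencyGraph` shape, both function names, and the integrity lookup are assumptions for illustration, and the emitted markup simply mirrors the rel="preload" form used in the example above rather than any agreed syntax.

```typescript
// Hypothetical dependency graph: module URL -> URLs it imports directly.
type DependencyGraph = { [url: string]: string[] };

// Flatten the graph reachable from `entry` into a list of dependency URLs,
// nearest dependencies first, each URL once, excluding the entry itself.
function flattenDependencies(graph: DependencyGraph, entry: string): string[] {
  const seen: { [url: string]: boolean } = {};
  const order: string[] = [];
  const visit = (url: string): void => {
    if (seen[url]) {
      return; // already listed, or part of an import cycle
    }
    seen[url] = true;
    if (url !== entry) {
      order.push(url);
    }
    for (const dep of graph[url] || []) {
      visit(dep);
    }
  };
  visit(entry);
  return order;
}

// Render one <link> per URL. Assumes URLs and hashes are already safe to
// place inside an HTML attribute; a real tool would escape them.
function renderPreloadLinks(urls: string[], integrity: { [url: string]: string }): string[] {
  return urls.map((url) => {
    const hash = integrity[url];
    const integrityAttr = hash ? ` integrity="${hash}"` : "";
    return `<link rel="preload" href="${url}"${integrityAttr} />`;
  });
}
```

Because the list is computed ahead of time, the browser would not need to discover dependencies by parsing each module before requesting the next one.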
This allows preloading module scripts, and optionally their descendants. The processing model for this turns out to be different enough that simply extending rel="preload" is not a good option. Closes whatwg/fetch#486.
This allows preloading module scripts, and optionally their descendants. The processing model for this turns out to be different enough that simply extending rel="preload" is not a good option. Closes whatwg/fetch#486. Tests: https://github.com/w3c/web-platform-tests/blob/master/preload/modulepreload.html
Just want to say that this whole discussion thread was super helpful. I’ve been figuring out how to use a But it did take me a while to find this thread and all the other resources I found before this were lacking. A summary of this as a blog post or MDN article would probably be quite useful for the next person who has to figure this out ;) |
@whatwg/documentation interested in documenting some of the above on MDN? |
The problem
Given <link rel="preload" href="foo.js" as="script">, we don't know whether foo.js will be a classic script or module script. This means that we can't parse it ahead of time. Additionally, for module scripts I'd expect preloading the module script to also preload its dependencies.
Potential solutions
We could solve this in one of two ways, as far as I can see:
(1) <link rel="preload" href="foo.js" as="module"> or some other similar destination (e.g. modulescript)
(2) <link rel="preload" href="foo.js" as="script" type="module">
(2) was initially rather attractive to me, for matching the syntax used by the <script> element. Note that there's no conflict with the existing semantic of type="" being a MIME type, since in practice that is not used by any part of the HTML or preload specs.
But now I am less sure. It seems hard to spec it in a clean way. We'd either need to thread the type="" metadata through all the fetch locations (including e.g. service worker, not just the preload spec, right?), or we'd need to create some sort of "shadow destination", so breaking the 1:1 mapping of as="" and request destinations. That sounds not great.
I'd love thoughts on the best way to go here. If (1) is the way to go then we can just Bikeshed the destination name and update HTML to use it when fetching module scripts.
/cc @addyosmani @yoavweiss @whatwg/modules