-
Notifications
You must be signed in to change notification settings - Fork 648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: serialize data as strings, reduce bytes per flush #1551
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1551 +/- ##
==========================================
+ Coverage 90.46% 90.54% +0.07%
==========================================
Files 352 355 +3
Lines 12662 12751 +89
==========================================
+ Hits 11455 11545 +90
+ Misses 1207 1206 -1
Continue to review full report at Codecov.
|
case "\\": | ||
return "\\\\"; | ||
case "</script": | ||
return "\\u003C/script"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think like my PR in fluurt
, this could be just <\/script>
— or did you already try that and it didn’t work for JSON-related reasons?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some cross browser testing for spaces in the closing script and all are pretty good about it. For some reason I didn’t test spaces before the closing angle bracket. I’ll test that and update this if there is no issue there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double-checked the JSON spec and it looks like </script
can be escaped as the slightly shorter <\\/script
, instead of \\u003C/script
— it also looks like trailing garbage before the final >
doesn't change how it closes the parent opening script tag, so that's not a concern either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is great! Thanks for doing some research on this one. To be clear are you saying </script abc>
does not get parsed in any browser as the closing script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the opposite — it gets parsed everywhere according to spec as the closing script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lame, okay then. We'll leave it as is, but incorporate your <\\/script
suggestion.
renderedComponents.forEach(function(renderedComponent) { | ||
fakeArray.concat(renderedComponent); | ||
if (renderedComponents) { | ||
JSON.parse("[" + renderedComponents.slice(10) + "]").forEach(function( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this — it looks like in the future it could even throttle itself based on how much distress the user’s CPU is in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that was one of the things I liked as well, we can lazily parse the JSON.
case "</script": | ||
return "\\u003C/script"; | ||
default: | ||
return "\\u" + match.charCodeAt(0).toString(16); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this output leading zeroes if the charCode
is less than 4 digits? Or do you not need to bother, since the characters it catches will only end up at those higher codepoints anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only escape two characters both with 4 digits \u2028|\u2029. Having said that I think this would be clearer to just include the full escaped sequence, ie:
case "\u2028":
return "\\u2028";
case "\u2029":
return "\\u2029";
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes more sense to me upon reading it. Might even be faster too — possibility of a Map
instead of a switch
statement too, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't benched Map
vs switch
, but I could only assume switch
is faster? Am I wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm maybe I'm wrong: https://jsperf.com/if-switch-lookup-table/113 (tested in Chrome).
Will need to do a more specific bench than this, but if it is similar perf it would make the code a bit nicer I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, at this point, I’ve been burnt too many times before on my guesses for what’s faster inside JavaScript engines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the micro benchmarks lie quite often. Always have to do a real bench. At least for SSR we can get away with just benching a few node versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
case "\\": | ||
return "\\\\"; | ||
case "</script": | ||
return "\\u003C/script"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double-checked the JSON spec and it looks like </script
can be escaped as the slightly shorter <\\/script
, instead of \\u003C/script
— it also looks like trailing garbage before the final >
doesn't change how it closes the parent opening script tag, so that's not a concern either.
if (type === "string") { | ||
runtimeId = renderedComponents; | ||
globalKey += runtimeId + "_C"; | ||
globalKey = runtimeId; | ||
} else { | ||
globalKey += (runtimeId = DEFAULT_RUNTIME_ID) + "C"; | ||
globalKey = runtimeId = DEFAULT_RUNTIME_ID; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about shortening this to:
if (type === "string") {
runtimeId = renderedComponents;
} else {
runtimeId = DEFAULT_RUNTIME_ID;
}
globalKey = runtimeId;
Or maybe even:
globalKey = runtimeId = (
type === "string" ? renderedComponents : DEFAULT_RUNTIME_ID
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good catch. I realized that one as well. With the changes I've made globalKey
and runtimeId
are one and the same. I think I'll just remove globalKey
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Realized I could shorten it further, since really the type
here would either be string
or undefined
. I just made it runtimeId = renderedComponents || DEFAULT_RUNTIME_ID
.
The API for init
should be one of init()
, init("runtimeID")
or init(serializedComponentDataObject)
.
b8dd120
to
4a05d75
Compare
Circling back to this after some discussions in the Marko Discord about using SubResource Integrity for Content Security Policies. This PR currently serializes script data as: <script>JSON.M+=',{"l":1,"w":[["s0-0-5-5",0,{},{"f":1}]],"t":["/marko-test$1.0.0/render/fixtures/components-await-title/components/hello/index.marko"]}'</script> For statically-generated hashes, it could look more like: <script
data-s=',{"l":1,"w":[["s0-0-5-5",0,{},{"f":1}]],"t":["/marko-test$1.0.0/render/fixtures/components-await-title/components/hello/index.marko"]}'
integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo="
>JSON.M+=document.currentScript.dataset.s</script> |
When I take a look at this for Marko 6 I need to remember your solution here. In theory, by using an attribute to encode the JSON we should have better server side perf since escaping HTML quotes is less expensive than escaping script content. |
Description
This PR makes two main optimizations:
JSON
instead ofwindow
as our global of choice 🤭.fixes #1457
Details
Why
JSON
instead ofwindow
? It is fewer bytes and has less chance of conflicting with other global variables. We sorted the available global keys with browser support of at least IE10 andJSON
is one of the shortest available namespaces and it somewhat aligns with what is happening. 🤷♂️The global added is now a setter as well which still allows for the immediate initialization of components. Previously we returned a fake array with a
concat
method.Checklist: