Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: serialize data as strings, reduce bytes per flush #1551

Closed
wants to merge 2 commits into from

Conversation

DylanPiercey
Copy link
Contributor

@DylanPiercey DylanPiercey commented Apr 17, 2020

Description

This PR makes two main optimizations:

  1. Instead of serializing component initialization data as JavaScript, it instead puts it inside a JavaScript string.
  2. Reduces the number of bytes per flush by using string concat and JSON instead of window as our global of choice 🤭.

fixes #1457

Details

Why JSON instead of window? It is fewer bytes and has less chance of conflicting with other global variables. We sorted the available global keys with browser support of at least IE10 and JSON is one of the shortest available namespaces and it somewhat aligns with what is happening. 🤷‍♂️

The global added is now a setter as well which still allows for the immediate initialization of components. Previously we returned a fake array with a concat method.

Checklist:

  • I have read the CONTRIBUTING document and have signed (or will sign) the CLA.
  • I have updated/added documentation affected by my changes.
  • I have added tests to cover my changes.

@codecov
Copy link

codecov bot commented Apr 17, 2020

Codecov Report

Merging #1551 (4a05d75) into master (fe172b1) will increase coverage by 0.07%.
The diff coverage is 89.73%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1551      +/-   ##
==========================================
+ Coverage   90.46%   90.54%   +0.07%     
==========================================
  Files         352      355       +3     
  Lines       12662    12751      +89     
==========================================
+ Hits        11455    11545      +90     
+ Misses       1207     1206       -1     
Impacted Files Coverage Δ
commitlint.config.js 0.00% <0.00%> (ø)
packages/marko/browser-refresh.js 0.00% <0.00%> (ø)
packages/marko/express.js 0.00% <0.00%> (ø)
packages/marko/hot-reload.js 0.00% <0.00%> (ø)
...ges/marko/src/compiler/ast/AttributePlaceholder.js 0.00% <0.00%> (ø)
packages/marko/src/compiler/util/deresolve.js 100.00% <ø> (ø)
...omponents/TransformHelper/handleComponentEvents.js 93.93% <ø> (ø)
...ponents/TransformHelper/handleComponentPreserve.js 91.30% <ø> (ø)
...ts/TransformHelper/handleComponentPreserveAttrs.js 100.00% <ø> (ø)
...ags/components/TransformHelper/handleLegacyBind.js 93.33% <ø> (ø)
... and 871 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe172b1...4a05d75. Read the comment docs.

case "\\":
return "\\\\";
case "</script":
return "\\u003C/script";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think like my PR in fluurt, this could be just <\/script> — or did you already try that and it didn’t work for JSON-related reasons?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some cross browser testing for spaces in the closing script and all are pretty good about it. For some reason I didn’t test spaces before the closing angle bracket. I’ll test that and update this if there is no issue there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double-checked the JSON spec and it looks like </script can be escaped as the slightly shorter <\\/script, instead of \\u003C/script — it also looks like trailing garbage before the final > doesn't change how it closes the parent opening script tag, so that's not a concern either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this is great! Thanks for doing some research on this one. To be clear are you saying </script abc> does not get parsed in any browser as the closing script?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the opposite — it gets parsed everywhere according to spec as the closing script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lame, okay then. We'll leave it as is, but incorporate your <\\/script suggestion.

renderedComponents.forEach(function(renderedComponent) {
fakeArray.concat(renderedComponent);
if (renderedComponents) {
JSON.parse("[" + renderedComponents.slice(10) + "]").forEach(function(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this — it looks like in the future it could even throttle itself based on how much distress the user’s CPU is in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was one of the things I liked as well, we can lazily parse the JSON.

case "</script":
return "\\u003C/script";
default:
return "\\u" + match.charCodeAt(0).toString(16);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this output leading zeroes if the charCode is less than 4 digits? Or do you not need to bother, since the characters it catches will only end up at those higher codepoints anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only escape two characters both with 4 digits \u2028|\u2029. Having said that I think this would be clearer to just include the full escaped sequence, ie:

      case "\u2028":
        return "\\u2028";
      case "\u2029":
        return "\\u2029";

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes more sense to me upon reading it. Might even be faster too — possibility of a Map instead of a switch statement too, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't benched Map vs switch, but I could only assume switch is faster? Am I wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm maybe I'm wrong: https://jsperf.com/if-switch-lookup-table/113 (tested in Chrome).

Will need to do a more specific bench than this, but if it is similar perf it would make the code a bit nicer I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, at this point, I’ve been burnt too many times before on my guesses for what’s faster inside JavaScript engines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the micro benchmarks lie quite often. Always have to do a real bench. At least for SSR we can get away with just benching a few node versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

case "\\":
return "\\\\";
case "</script":
return "\\u003C/script";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double-checked the JSON spec and it looks like </script can be escaped as the slightly shorter <\\/script, instead of \\u003C/script — it also looks like trailing garbage before the final > doesn't change how it closes the parent opening script tag, so that's not a concern either.

Comment on lines 259 to 264
if (type === "string") {
runtimeId = renderedComponents;
globalKey += runtimeId + "_C";
globalKey = runtimeId;
} else {
globalKey += (runtimeId = DEFAULT_RUNTIME_ID) + "C";
globalKey = runtimeId = DEFAULT_RUNTIME_ID;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about shortening this to:

if (type === "string") {
  runtimeId = renderedComponents;
} else {
  runtimeId = DEFAULT_RUNTIME_ID;
}

globalKey = runtimeId;

Or maybe even:

globalKey = runtimeId = (
  type === "string" ? renderedComponents : DEFAULT_RUNTIME_ID
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good catch. I realized that one as well. With the changes I've made globalKey and runtimeId are one and the same. I think I'll just remove globalKey.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Realized I could shorten it further, since really the type here would either be string or undefined. I just made it runtimeId = renderedComponents || DEFAULT_RUNTIME_ID.

The API for init should be one of init(), init("runtimeID") or init(serializedComponentDataObject).

@DylanPiercey DylanPiercey force-pushed the what-have-i-done-to-json branch from b8dd120 to 4a05d75 Compare April 19, 2020 20:26
@tigt
Copy link
Contributor

tigt commented Oct 5, 2020

Circling back to this after some discussions in the Marko Discord about using SubResource Integrity for Content Security Policies.

This PR currently serializes script data as:

<script>JSON.M+=',{"l":1,"w":[["s0-0-5-5",0,{},{"f":1}]],"t":["/marko-test$1.0.0/render/fixtures/components-await-title/components/hello/index.marko"]}'</script>

For statically-generated hashes, it could look more like:

<script
  data-s=',{"l":1,"w":[["s0-0-5-5",0,{},{"f":1}]],"t":["/marko-test$1.0.0/render/fixtures/components-await-title/components/hello/index.marko"]}'
  integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo="
>JSON.M+=document.currentScript.dataset.s</script>

@DylanPiercey DylanPiercey changed the base branch from master to v4 February 17, 2021 02:41
@DylanPiercey
Copy link
Contributor Author

Circling back to this after some discussions in the Marko Discord about using SubResource Integrity for Content Security Policies.

This PR currently serializes script data as:

<script>JSON.M+=',{"l":1,"w":[["s0-0-5-5",0,{},{"f":1}]],"t":["/marko-test$1.0.0/render/fixtures/components-await-title/components/hello/index.marko"]}'</script>

For statically-generated hashes, it could look more like:

<script
  data-s=',{"l":1,"w":[["s0-0-5-5",0,{},{"f":1}]],"t":["/marko-test$1.0.0/render/fixtures/components-await-title/components/hello/index.marko"]}'
  integrity_no="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo="
>JSON.M+=document.currentScript.dataset.s</script>

When I take a look at this for Marko 6 I need to remember your solution here. In theory, by using an attribute to encode the JSON we should have better server side perf since escaping HTML quotes is less expensive than escaping script content.

@DylanPiercey
Copy link
Contributor Author

In our benchmarks there wasn't enough of a performance difference to justify this change at the moment.
I do still think there is a path forward here, perhaps following what @tigt had above, but I think it will be addressed in a different PR. #1457 will remain open.

@DylanPiercey DylanPiercey deleted the what-have-i-done-to-json branch October 9, 2024 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance - use JSON.parse while sending the json to browser.
2 participants