Make fast enough for untriggered use #91
Comments
I've been talking with a number of Firefox/performance engineers (Gijs, Emilio, Rob, Greg) about this and have some useful information to share.
Profiling Fathom 3.0 in Price Tracker
Most of the conversations centered around improving Fathom as-is (blocking the main thread) in Price Tracker. TL;DR: This case study helped me to develop some general performance strategies shared below.
General strategies for writing a performant ruleset (executed in the main thread)
Ways to improve the Fathom library itself
On parallelism: running Fathom off the main thread
Open questions
|
I'd note that it should be possible also to push in the CSS Working Group to get something like That should make it work everywhere. Apparently there are a few old requests for that: https://lists.w3.org/Archives/Public/www-style/2012Oct/0683.html cssom-view is unmaintained atm, but I'd be happy to help out there. |
I filed w3c/csswg-drafts#4122 to try to standardize something that would've helped here. |
Is it possible to tell what style accesses trigger a layout flush?
Per Emilio:
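// Time an API while layout is repeatedly dirtied. If the API flushes layout,
// every call pays for a reflow and the total time grows accordingly.
const start = performance.now();
for (let i = 0; i < 1000 /* insert/remove zeros as needed */; i++) {
  // Toggling display on the root element invalidates layout before each call.
  document.documentElement.style.display = i % 2 == 0 ? "none" : "";
  theApiYouWantToTest();
}
console.log(performance.now() - start);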
Note that the In reality, it's possible to use something like
Why we probably need to make Fathom async
As noted here, the original, sync implementation of This improvement would be on top of any performance improvements to
One less-than-ideal option is to add a new, async pre-processing step to Fathom that runs before the ruleset is executed. This step would only run if Fathom's
The best option, however, is for Fathom itself to be made async. Something like:
const results = await rules.against(document);
...and inside the ruleset where it uses
Making Fathom async will enable further concurrency (see item 3) as well.
@erikrose, should we break out "Make Fathom async" into a separate issue for discussion? |
Yes, please. It seems to me it should be possible to take a middle-of-the-road approach as well: call the existing synchronous Fathom routines in a requestAnimationFrame() callback, thus calling geometry-using routines like isVisible() at the optimal time without requiring a rewrite of the Fathom execution machinery. Correct? On the same subject, I do notice that requestAnimationFrame() itself probably ceases to call its callbacks on background or otherwise invisible tabs. Whether this is a problem depends on the application, but it's something to keep in mind. |
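For illustration, the requestAnimationFrame() approach might look roughly like the sketch below. It is not part of Fathom: runInNextFrame, its options, and the 1000 ms fallback are hypothetical names and values chosen for this example. The timer backstop is one possible way to cope with frame callbacks being paused in background tabs. Whether a timer-triggered run is acceptable depends on the application.

```javascript
// Hypothetical helper (not a Fathom API): defer a synchronous ruleset run
// to the next animation frame, with a timer backstop in case frame
// callbacks are paused (for example, in a background tab).
function runInNextFrame(runRuleset, onDone, options = {}) {
  const schedule = options.schedule || ((cb) => requestAnimationFrame(cb));
  const fallbackMs = options.fallbackMs ?? 1000; // assumed value, tune per application
  const setTimer = options.setTimer || setTimeout;
  const clearTimer = options.clearTimer || clearTimeout;

  let finished = false;
  let timer = null;

  const run = () => {
    if (finished) return; // whichever trigger fires first wins
    finished = true;
    if (timer !== null) clearTimer(timer);
    onDone(runRuleset());
  };

  // Arm the backstop first so it can be cancelled even if the scheduler
  // happens to invoke the callback immediately.
  timer = setTimer(run, fallbackMs);
  schedule(run);
}

// Usage in a page, assuming `rules` is an existing Fathom ruleset:
// runInNextFrame(() => rules.against(document), (facts) => { /* use facts */ });
```

The existing synchronous machinery is left untouched. Only the moment at which it is invoked changes, which is the point of the middle-of-the-road approach.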
Fathom in Firefox: Initial performance discussion with the Performance team on the Smoot project
Background:
I filed a Performance Review Request[1] outlining the high level details for the Fathom/Smoot project, which is expected to be the first Firefox application of Fathom, and Erik and I met with dothayer from the performance team last week.
Key takeaways:
Next Steps
In following the Recommended Plan[2], here are the next steps:
Appendix
References
The Performance discussion was based largely on these restricted access documents on Mozilla's Google Drive:
Annotations
|
A few other bits and pieces:
|
Here is a record of my latest notes on next steps for performance, since I was moved off the Fathom team.
References:
How do we know when Fathom is “fast enough”?
What to do next:
Profile ReaderMode/ Plan
|
80% of Fathom's time is spent calling DOM routines. Zibi was telling me that Fluent had the same problem and solved it by turning to DOM bindings, lowering its DOM accesses to direct C++ calls rather than going through the JS layer, which requires the runtime generation of reflection objects (different than X-Rays, which are for insulating content scripts from the page's monkeypatching). We could have Fathom compile rulesets to Rust. Or we could at least compile the parts which do DOM access, run them all at once up front, and ship their results back to JS. Zibi says the communication is fairly expensive. Lots of design space to explore here, obviously. |
Victor got 10x speed improvement by stubbing out getComputedStyle C++-side. He had suspicions that the time was largely going into flushes, but we lack evidence of it. Flushes don't show up in the flame graph. There is a fair amount of XRay overhead. 6% goes to |
dthayer ported the entirety of isVisible() to C++ and got a 10x speedup based on running it over every node of an Amazon page. This is on top of his using versions of routines that avoid flushes. (The lack of node pruning in this experiment might offset the fact that Pricewise rulesets were only 67% isVisible() and new-password ones only 17%.) |
See also this performance work: https://bugzilla.mozilla.org/show_bug.cgi?id=1709171. |
Because it needs access to the DOM, Fathom currently wants to run on the main thread. Unless run in response to a user action, it can create a little jank, taking upwards of 40ms to run a ruleset, so we dare not run it, for example, on every page load. Can we speed it up or find a way to run it offthread?
One approach is to make Fathom run faster. About 80% of its runtime on the Pricewise ruleset is spent in DOM routines. Those do a lot of flushing of layout and other pipeline stages, redoing calculations unnecessarily. Is this a major source of wasted time? Measure. Are there lower-level hooks we can use? (window.windowUtils.getBoundsWithoutFlushing() might be a faster way of getting element size, for example. mattn suggested it. Also see https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Performance_best_practices_for_Firefox_fe_engineers, which has, for instance, routines to get the window size and scroll without flushing things.) Other ideas?
Can we run Fathom offthread without losing access to too much signal? Reader Mode currently serializes the markup (only) and ships it offthread to parse. Could we do something like that but also apply CSS ourselves offthread? Would that preserve enough signal for most rulesets? Would it be too slow or battery-hungry on 2-core mobile devices?
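As a rough sketch of the lower-level hook idea, a ruleset helper could prefer getBoundsWithoutFlushing() where it is exposed and fall back to the standard call elsewhere. The helper name boundsPreferringNoFlush is hypothetical. The sketch assumes windowUtils is reachable only from privileged Firefox code, so ordinary web content would always take the fallback path. Note that non-flushing bounds can be stale if layout is dirty.

```javascript
// Hypothetical helper (not a Fathom API): get an element's bounds, avoiding
// a layout flush when the privileged Firefox hook is available.
function boundsPreferringNoFlush(element, win = globalThis.window) {
  const utils = win && win.windowUtils;
  if (utils && typeof utils.getBoundsWithoutFlushing === "function") {
    // May return stale geometry if layout is currently dirty.
    return utils.getBoundsWithoutFlushing(element);
  }
  // Standard DOM call; may force a synchronous layout flush.
  return element.getBoundingClientRect();
}

// Usage inside a rule callback, assuming `fnode.element` is the DOM node:
// const { width, height } = boundsPreferringNoFlush(fnode.element);
```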
This bug is done when we can blithely run a Fathom ruleset on every Firefox page load without concern for dragging down the UX.