Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: async fs calls in resolve and package handling #15211

Closed
wants to merge 9 commits into from

Conversation

patak-dev
Copy link
Member

Description

getRealPath in resolve is taking ~8% of the workload in https://github.com/yyx990803/vite-dev-server-perf, and it is calling fs.realPathSync.native. After this PR, it takes ~2% by using fsp.realPath (equivalent to the .native sync version).

I tested the speed improvement using the test in the repo instead of the cpuprofile, but it isn't stable on my machine. I think that even if we now have a ton of promises being generated, the non-blocking fs calls should still make this change worth it. We need to test in some other benchmarks.

The PR also changes a fs.readFileSync in package.ts to be async, and that triggered a whole bunch of async/await changes in the codebase too.


What is the purpose of this pull request?

  • Bug fix
  • New Feature
  • Documentation update
  • Other

Copy link

stackblitz bot commented Dec 1, 2023

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

@patak-dev patak-dev added the performance Performance related enhancement label Dec 1, 2023
@patak-dev
Copy link
Member Author

I don't know why Windows isn't happy (at least in CI) when using fsp.realPath. Using fs.realPathSync.native works, so for now we are back to sync for windows.

This makes this PR a bit more controversial, because it will mean a small regression on Windows due to the extra promises being generated. But a lot of them are needed due to the fsp.readFile change, and I ended up reducing the number of created promises at refactor: avoid creating promises in tryCleanFsResolve. So I think this may still be a good step.

If someone with more Windows knowledge finds a way to make fsp.realPath work, appreciated! But now we can also do that in another PR.

@patak-dev
Copy link
Member Author

@userquin tested in his Windows machine, and d3c8f90 is working for him with the same node version as CI. @sapphi-red, maybe you could try too when you have some time? Even if it works, I don't think we can just fallback to sync for CI 🤔

@userquin
Copy link
Contributor

userquin commented Dec 3, 2023

It works on first run after build (pnpm test-serve), second run fails with same CI errors (node 20.10.0 LTS):

imagen

@sapphi-red
Copy link
Member

sapphi-red commented Dec 3, 2023

I found a bug in Node (nodejs/node#51031) and maybe that's the culprit. I pushed a commit to use fs.realpath.native instead of fsp.realpath.

@patak-dev
Copy link
Member Author

I found a bug in Node (nodejs/node#51031) and maybe that's the culprit. I pushed a commit to use fs.realpath.native instead of fsp.realpath.

Woah! Amazing find @sapphi-red 👏🏼 ❤️

I'm curious to see if this PR improves the benchmarks on your machine. I don't know why there is so much variance when I measure on mine.

@sapphi-red
Copy link
Member

Here's the results of https://github.com/yyx990803/vite-dev-server-perf on my Windows machine. It's seems on a par.

before

1000 TS modules (50x20) loaded in: 1049ms (runs: [1274,1040,1060,1021,1049])
1000 TS modules (20x50) loaded in: 965ms (runs: [1008,934,946,966,965])
2000 TS modules (100x20) loaded in: 2237ms (runs: [2355,2252,2226,2237,2213])
5000 TS modules (250x20) loaded in: 5960ms (runs: [6592,6022,5874,5915,5960])
10000 TS modules (400x25) loaded in: 12265ms (runs: [13409,12106,12134,12265,12337])

after

1000 TS modules (50x20) loaded in: 1069ms (runs: [1257,1031,1051,1069,1071])
1000 TS modules (20x50) loaded in: 956ms (runs: [1033,956,971,941,946])
2000 TS modules (100x20) loaded in: 2249ms (runs: [2358,2249,2228,2191,2268])
5000 TS modules (250x20) loaded in: 5972ms (runs: [6610,5983,5936,5972,5927])
10000 TS modules (400x25) loaded in: 12152ms (runs: [13560,11932,11866,12162,12152])

I'm not sure why the "before" run is faster than the one in #15195 (comment) though 🤔 .

@patak-dev patak-dev mentioned this pull request Dec 9, 2023
4 tasks
packages/vite/src/node/ssr/ssrExternal.ts Outdated Show resolved Hide resolved
Comment on lines -586 to +596
if (
(res = tryResolveRealFile(
fileName + fileExt.replace('js', 'ts'),
preserveSymlinks,
))
)
return res
if ((res = tryResolveFile(fileName + fileExt.replace('js', 'ts'))))
return getRealPath(res, options.preserveSymlinks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we're changing the pattern for when the real path is resolved? I kinda prefer the old way as the function would return the finalized resolved path directly, but maybe I'm missing an optimization made here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See 0faf06c, doing it in this way avoids creating many promises in this function.

But I don't think we will be doing it this way if we merge #15279. I'm unsure whether generating all the promises is justified. readdirSync takes almost no time compared to other things, and we could expand the tree in parallel (but keeping get/set sync). There are some fsp.readFile in this PR though. I think we need to reevaluate and maybe break this one in pieces after #15279

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea of this PR is fine to merge too even if there's no surface improvement. But IIRC what dominik discovered for tsconfck was that async within graph is better for large fs, sync within graph is better for small fs.

See 0faf06c, doing it in this way avoids creating many promises in this function.

I don't think we should optimize the code to avoid creating promises, it should work fine like before as long as it's scheduled 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why won't we try to avoid creating 10+ promises for each tryCleanFsResolve 🤔
Doing an await for them in a for loop isn't free either. We have merged PRs to move from a .startsWith('/') to [0] === '/'. I would imagine this has a way bigger effect, more thinking on the garbage collector, etc. But I think we will revert this change if we want to merge this one.

About populating the fs tree cache async, ya, we need to review what works better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I can't argue that no await is definitely faster that some awaits 😄 I don't mind as much now looking more at it, but initially I didn't think it was worth the perf. startsWith was different that it was an easy change, otherwise I would keep startsWith for readability.

@patak-dev patak-dev marked this pull request as draft December 9, 2023 13:56
@patak-dev
Copy link
Member Author

Closing as this PR has a lot of conflicts now, we can review if we want to apply similar changes later on but this isn't seems to be something we need to do right now.

@patak-dev patak-dev closed this Oct 28, 2024
@patak-dev patak-dev deleted the perf/async-realpath branch October 28, 2024 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance related enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants