Build Tooling: Optimize build by spawning worker pool #15230
Conversation
(Force-pushed from 40ec580 to 4338f9e.)
This is great. Some nice refactoring of the build scripts.
I think the new code needs some kind of error handling. The previous implementation seemed to handle this more gracefully.
 * @type {Object<string,Function>}
 */
const BUILD_TASK_BY_EXTENSION = {
	async '.scss'( file ) {
😮I'd never considered this as a way to define object properties. Very fancy!
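For readers unfamiliar with the syntax, here is a minimal, self-contained sketch of the idea (the names HANDLER_BY_EXTENSION and handle are hypothetical, not from this PR): async methods whose names are string literals, dispatched on a file's extension.

const path = require( 'path' );

// Hypothetical example: async methods whose names are string-literal extensions.
const HANDLER_BY_EXTENSION = {
	async '.scss'( file ) {
		return `compiled styles for ${ file }`;
	},
	async '.js'( file ) {
		return `transpiled script for ${ file }`;
	},
};

// Pick the handler by the file's extension and await it.
async function handle( file ) {
	const handler = HANDLER_BY_EXTENSION[ path.extname( file ) ];
	return handler ? handler( file ) : undefined;
}

handle( 'style.scss' ).then( console.log );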
bin/packages/build-worker.js
const extension = path.extname( file );
const task = BUILD_TASK_BY_EXTENSION[ extension ];
if ( task ) {
	await task( file );
I experimented by putting some erroneous code into a file, and the result is an unhandled promise error (I suppose originating from babel.transformFileAsync). The progress bar ticked up to 100%, but the command didn't exit.
I wonder if some error handling could be added to the async function that this file exports, and the callback function provided with error details so that the parent process can handle exiting gracefully.
I wonder if some error handling could be added to the async function that this file exports, and the callback function provided with error details so that the parent process can handle exiting gracefully.
Interesting. Nice catch! We should definitely handle this gracefully. I'll take a look.
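As a rough illustration of the kind of handling being discussed (a sketch only, not the code that eventually landed; buildFileForExtension is a hypothetical stand-in for the per-extension tasks), the worker module could catch failures and forward them through its node-style callback instead of letting the promise rejection go unhandled.

const path = require( 'path' );

// Hypothetical stand-in for the real per-extension build tasks.
async function buildFileForExtension( file ) {
	if ( ! [ '.js', '.scss' ].includes( path.extname( file ) ) ) {
		throw new Error( `No build task for ${ file }` );
	}
	// ... perform the actual build work here ...
}

module.exports = async ( file, callback ) => {
	try {
		await buildFileForExtension( file );
		callback( null );
	} catch ( error ) {
		// Report the failure to the parent process via the worker callback.
		callback( error );
	}
};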
This is handled better in c4f908b. It's not perfect, in that I think the whole process should halt immediately when an error is encountered, but the nature of worker-farm makes this a bit difficult to implement. At least it won't hang, nor will it show as an unhandled promise rejection.
I had a bit of a play with it, and see what you mean.
Ideally, the build.js script would return a non-zero exit code if an error occurs, as that would allow, for example, a CI job to end as early as possible.
One option might be to use process.exitCode = 1; in build.js if errors are encountered on a worker.
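A tiny sketch of what that suggestion means in practice (hypothetical, not from the PR): process.exitCode only determines the eventual exit status; it does not interrupt work that is still in flight.

// Hypothetical demonstration: setting process.exitCode flags failure without
// terminating the process early; pending async work still completes.
setTimeout( () => console.log( 'in-flight work still finishes' ), 100 );

const buildFailed = true; // stand-in for "errors were encountered on a worker"
if ( buildFailed ) {
	console.error( 'Build failed' );
	process.exitCode = 1; // the process exits with 1 once the event loop drains
}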
worker-farm also has a maxRetries config option, which defaults to Infinity. I wonder if that should be capped, potentially at 0, since I wouldn't expect a build error to suddenly cure itself (I imagine this is intended for things like HTTP requests).
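For reference, worker-farm accepts an options object as its first argument, so capping retries would look roughly like the following (a sketch; the concurrency value shown mirrors worker-farm's default rather than anything specific to this PR).

const workerFarm = require( 'worker-farm' );

const worker = workerFarm(
	{
		// Fail fast instead of re-queueing a task whose child process died.
		maxRetries: 0,
		// Mirrors the default: one concurrent worker per CPU.
		maxConcurrentWorkers: require( 'os' ).cpus().length,
	},
	require.resolve( './build-worker' )
);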
I had a bit of a play with it, and see what you mean.
For posterity's sake, and since I was unclear in my original comment, the issue is that other workers may still be running when the error is handled, and there is no easy way through worker-farm to end the process only once the current round of worker handlers has completed. If the process is terminated early, subsequent worker finishes will loudly complain about the lack of a parent process to whom they expect to report.
worker-farm also has a maxRetries config option, which defaults to Infinity. I wonder if that should be capped, potentially at 0, since I wouldn't expect a build error to suddenly cure itself (I imagine this is intended for things like HTTP requests).
It's a good idea, though in practice I don't expect this would occur. From its documentation, the retries are relevant to process termination; failed builds wouldn't fall under this classification.
Normally I'd say that 0 would be the more sensible default, but since it might risk the build appearing to succeed despite a worker having terminated, I'd rather it get stuck retrying indefinitely than produce a bad build. At least then it's obvious that something is wrong.
Preferable would be to cap the retries but also detect and fail the whole build if one of the processes terminates unexpectedly. It's not clear to me how this might be done, though.
For posterity's sake, and since I was unclear in my original comment, the issue is that other workers may still be running when the error is handled, and there is no easy way through worker-farm to end the process only once the current round of worker handlers has completed. If the process is terminated early, subsequent worker finishes will loudly complain about the lack of a parent process to whom they expect to report.
process.exitCode = 1 won't cause the process to be terminated early. It can be set in advance of the process exiting. I can push a commit demonstrating what I mean if that helps?
process.exitCode = 1 won't cause the process to be terminated early. It can be set in advance of the process exiting. I can push a commit demonstrating what I mean if that helps?
Ah, my quoted text was meant as a clarification and not a dismissal, though I think I focused too much on that point to absorb the rest of your comment 😓 I agree with your assessment, though; certainly the process should have a non-zero exit code if any build errors occur.
process.exitCode = 1 won't cause the process to be terminated early. It can be set in advance of the process exiting. I can push a commit demonstrating what I mean if that helps?
Pushed (with co-authorship) in cb51568. I confirmed that before this change, an error in the build would yield a status code of 0, and 1 after.
Before:
node ./bin/packages/build.js ; echo $?
# ...
0
After:
node ./bin/packages/build.js ; echo $?
# ...
1
(Force-pushed from a9afb77 to c4f908b.)
I did a little more tweaking in 6663433 to leverage the […]
module.exports = async ( file, callback ) => {
	const extension = path.extname( file );
	const task = BUILD_TASK_BY_EXTENSION[ extension ];
Noting for future: I've been planning in my head how I might implement something like #13124. I don't plan to make the changes here, since it's already a fairly significant refactor.
I expect this task could be separated out into a function buildFile, which is optionally enhanced to read from / write to a cache. I'm thinking in some cases we may want to opt out of the caching behavior, e.g. in npm run package-plugin, since it seems there could be a (very-low-probability-but-possible) chance that hash collisions could yield a bad result†. That said, we should always be testing the artifact (both manually and in an automated way), but as of today we're only doing the former.
let buildFile = ( file ) => { /* ... */ };
if ( hasCache ) {
	buildFile = firstMatch( [ fromCache, buildFile ] );
}
buildFile = flow( [ buildFile, toCache ] );
† This assumes that an implementation would be similar to babel-loader in checking a cache folder for a file named by the MD5 (or other fast hash) of the source file's content.
cc @noisysocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps store the cache in a directory within the repository that is .gitignored? That way, when npm run package-plugin runs git clean -fxd it will empty the cache.
Not sure I'd worry about collisions. I'd use SHA1 as it's fast, has a low collision rate, and is good enough for git's object storage model, which is what inspired my approach in #13124.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps store the cache in a directory within the repository that is .gitignored? That way, when npm run package-plugin runs git clean -fxd it will empty the cache.
That's true, I'd not considered that. My plan was to do something similar to what's done with babel-loader, i.e. a folder at node_modules/.cache. Seems it would work just the same as you propose (since node_modules is gitignored).
Not sure I'd worry about collisions. I'd use SHA1 as it's fast, has a low collision rate, and is good enough for git's object storage model, which is what inspired my approach in #13124.
I'm not overly worried; I'll just inevitably feel uncomfortable about not being able to say that it's "impossible". I don't feel too strongly about the specific hash algorithm, just that it's fast. Some of what I've read seemed to imply MD5 beats out SHA1 on this front 🤷‍♂️ [1][2][3]
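To make the idea concrete, here is a hedged sketch of the content-hash cache being discussed (cacheKey, fromCache, and toCache are hypothetical names; the cache location and hash choice are assumptions in line with the babel-loader comparison above, not decisions made in this thread).

const crypto = require( 'crypto' );
const fs = require( 'fs' );
const path = require( 'path' );

const CACHE_DIR = path.resolve( 'node_modules/.cache/build' ); // assumed location

// Key the cache on a fast hash of the source file's contents.
function cacheKey( file ) {
	return crypto
		.createHash( 'md5' ) // SHA1 would work just as well here
		.update( fs.readFileSync( file ) )
		.digest( 'hex' );
}

// Return the cached build output if present, otherwise null.
function fromCache( file ) {
	const cached = path.join( CACHE_DIR, cacheKey( file ) );
	return fs.existsSync( cached ) ? fs.readFileSync( cached, 'utf8' ) : null;
}

// Persist build output under the content hash and pass it through.
function toCache( file, output ) {
	fs.mkdirSync( CACHE_DIR, { recursive: true } );
	fs.writeFileSync( path.join( CACHE_DIR, cacheKey( file ) ), output );
	return output;
}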
This is awesome! 😍
I'm only seeing a modest (~20%) speedup when I compare to master, which I suspect is because my machine has fewer cores than yours. Still, a speedup is a speedup!
Thumbs up from me too, thanks for addressing all the feedback 😃
Thank you both! And thanks @talldan for identifying all of the very hard-to-spot oversights!
This reverts commit 4338f9e.
Co-Authored-By: Daniel Richards <[email protected]>
(Force-pushed from cb51568 to b9c1191.)
I was hoping to see some improvement on Travis build times with these changes, but ultimately it appears it won't have much of an impact (for better or worse). For reference, Travis containers provide two cores (docs). Control: https://travis-ci.com/WordPress/gutenberg/builds/113749307
There was a mistake here, unfortunately. Stylesheets should only be built from their entrypoints, not from the individual files. This worked fine for the full build, but for […] Fix at #15920
Caching, if I recall, is what made a huge difference in Travis! We'll need to do regular cleanup of the cache files though, so that they don't grow into several GBs and cause Travis builds to slow down because of time spent downloading cache files from S3.
Yeah, I can definitely see this making the bigger difference. Do you know anything about cleanup? I had a similar need for a side-project of mine where I ended up using a combination of […]
I opened an issue at #15967 to ensure this task is tracked. We can continue implementation discussion there.
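One possible cleanup strategy, purely as a hedged sketch (not necessarily what either project ended up doing; the cache directory and retention window are assumptions): prune cache entries whose modification time is older than a cutoff, so the cached artifacts never grow unbounded.

const fs = require( 'fs' );
const path = require( 'path' );

const CACHE_DIR = path.resolve( 'node_modules/.cache/build' ); // assumed location
const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000; // e.g. a 30-day retention window

if ( fs.existsSync( CACHE_DIR ) ) {
	const cutoff = Date.now() - MAX_AGE_MS;
	for ( const entry of fs.readdirSync( CACHE_DIR ) ) {
		const file = path.join( CACHE_DIR, entry );
		// Delete entries that have not been touched within the retention window.
		if ( fs.statSync( file ).mtimeMs < cutoff ) {
			fs.unlinkSync( file );
		}
	}
}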
Previously: #8093

This pull request seeks to optimize npm run build. The previous effort to convert synchronous scripts to asynchronous equivalents from #8093 was adapted here. The changes here also include a refactoring to front-load file aggregation (i.e. discover all files to build at once, rather than per individual package).

Ultimately, these changes had minimal impact; the larger impact comes from adopting the worker-farm package, which manages the creation of subprocesses and concurrency within those subprocesses.

The net result is a build time decrease of 56% (~24s to 10.5s, measured by time node ./bin/packages/build.js). The benefit will vary by machine, since worker-farm defaults to a number of concurrent workers equal to the number of CPUs.

Testing Instructions:

There should be no regressions in the build of packages from npm run build.

There should be no regressions in the rebuild of packages from npm run dev (though there may be some separate bugginess with rebuilds, as addressed in #15219).
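To summarize the shape of the change described above, here is a hedged sketch (not the PR's actual code; the fast-glob dependency, glob pattern, and file layout are assumptions) of front-loaded file aggregation feeding a worker-farm pool, with a non-zero exit code on any failure.

const glob = require( 'fast-glob' );
const workerFarm = require( 'worker-farm' );

async function main() {
	// Front-loaded aggregation: discover every buildable file across packages at once.
	const files = await glob( 'packages/*/src/**/*.{js,scss}' );

	const worker = workerFarm( require.resolve( './build-worker' ) );
	let completed = 0;

	files.forEach( ( file ) => {
		worker( file, ( error ) => {
			if ( error ) {
				console.error( error );
				// Flag failure, but let the remaining workers finish their files.
				process.exitCode = 1;
			}
			if ( ++completed === files.length ) {
				workerFarm.end( worker );
			}
		} );
	} );
}

main();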