Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix perf regression #6204

Merged
merged 7 commits into from
Aug 3, 2018
Merged

Fix perf regression #6204

merged 7 commits into from
Aug 3, 2018

Conversation

arcanis
Copy link
Member

@arcanis arcanis commented Aug 1, 2018

Summary

Fixes #6177

This diff fixes the perf regression introduced by #6010. New logic:

  • Instead of comparing the files content, we go back to checking the mtime
  • Since we cannot rely on npm to provide the timestamp, we set it when we fetch the files
  • Since the user caches have already been "corrupted" with wrong timestamps, we need to bump it
  • This has the unfortunate side effects that we lose the ability to avoid copying files that didn't change between two different releases (for example if x.y.1 changes a single file, we will still need to copy each file it contains). This is because we don't get the individual file timestamps from npm.

Notes:

  • Some hack is needed in order to workaround utimes doesn't seem to support subseconds times nodejs/node#22070 - specifically, we need to use futimes instead of utimes when unpacking the files (otherwise we lose the subsecond precision).

  • Instead of bumping the cache, we could special case the 23 Oct 1985. That seems fragile, though.

Test plan

Added a test.

@buildsize
Copy link

buildsize bot commented Aug 1, 2018

File name Previous Size New Size Change
yarn-[version].noarch.rpm 919.97 KB 920.15 KB 177 bytes (0%)
yarn-[version].js 4.02 MB 4.02 MB 737 bytes (0%)
yarn-legacy-[version].js 4.18 MB 4.18 MB 1018 bytes (0%)
yarn-v[version].tar.gz 925.55 KB 925.8 KB 264 bytes (0%)
yarn_[version]all.deb 675.07 KB 675.27 KB 202 bytes (0%)

Copy link
Member

@BYK BYK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea is sound but I'm concerned about the "patched fs" thing and failing tests

@@ -93,12 +93,39 @@ export default class TarballFetcher extends BaseFetcher {
} {
const integrityInfo = this._supportedIntegrity();

const now = new Date();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this closer to where it is used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's here because I don't want each file of a single package to have a different timestamp (like it would be the case if the Date constructor was called inside the closure where it is used).

const now = new Date();

const fs = require('fs');
const patchedFs = Object.assign({}, fs, {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, why are we patching fs where we already have our own fs functions module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's not the same. The tar-fs package is doing utimes call in order to set the files timestamps. This function is broken and doesn't support subsecond timestamps. The only way to solve this without forking tar-fs is to give it an api-compatible version of the fs module, patched in order to use futimes instead of utimes.

Ideally the fix should be done upstream, but this issue is problematic enough that it warrants an hotfix to land faster.

Copy link
Member

@Daniel15 Daniel15 Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to monkey patch tar-fs rather than fs? That seems like a better idea to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, utimes is used within a closure. I also prefer to patch a function with clearly defined (if a bit buggy) inputs/output rather than a random function from a module that can change without notice.

Also note that this code doesn't modify the global fs implementation. It simply creates a new, local, fs instance and pass it to tar-fs using its custom filesystem option, which seems perfect for this use case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a good solution in regards to tar-fs. It does kind of bloat the file a little. Do you think it'll be possible to move untar-stream outside somehow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm maybe it could be moved in src/utils/fs, yeah 👍

@arcanis
Copy link
Member Author

arcanis commented Aug 1, 2018

I'll check this tomorrow, but I have a high confidence the tests are failing only because of the bumped cache version (which changes the cache path from /v1/ to /v2/). Plus linting, obviously.

@acao
Copy link

acao commented Aug 2, 2018

@arcanis looks like it failed on an async timeout, maybe try to re-run? not familiar with this suite.

@arcanis
Copy link
Member Author

arcanis commented Aug 2, 2018

Yeah this is a known flaky test, I'll merge the PR as soon as a contributor reviews the PR (cc @BYK / @imsnif ?)

@arcanis
Copy link
Member Author

arcanis commented Aug 3, 2018

Given the lack of explicit ack, since the tests pass and the fix has been confirmed to work, I'll assume that no explicit "Request Change" review is a go. Any change will be in a followup.

@arcanis arcanis merged commit 15a438d into yarnpkg:master Aug 3, 2018
Copy link
Member

@imsnif imsnif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took me some time to get to this - I think I missed the merge by just a few minutes. :)
In any case, as @arcanis mentions - these can be addressed in subsequent PRs.

@@ -111,6 +112,20 @@ test('installing a package with a renamed file should not delete it', () =>
expect(await fs.exists(`${config.cwd}/node_modules/pkg/state.js`)).toEqual(true);
}));

test("installing a new package should correctly update it, even if the files mtime didn't change", () =>
runInstall({}, 'mtime-same', async (config, reporter): Promise<void> => {
await misc.sleep(2000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we can do this manually somehow rather than prolonging the test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid not - even with the subsecond fs patch, subsecond timings still won't work on filesystems that don't support it at all (I think it might be the case on CircleCI), so we need to wait at least >1s to be sure that the mtime will change.

Another option might be to mock the fs module here, but it seemed a bit hazardous so I went with the sleep (since the tests are executed in parallel, it doesn't actually take the whole two seconds in practice).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm not following (or maybe this doesn't work, I haven't tested it) but won't this do the trick?
https://nodejs.org/api/fs.html#fs_fs_futimes_fd_atime_mtime_callback

await reInstall.init();

expect(await fs.readJson(`${config.cwd}/node_modules/pkg/package.json`)).toMatchObject({version: '2.0.0'});
}));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test still passes if I revert the changes (speficially in util/fs). Is this a local/os issue for me?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a test that ensures that #5723 is still fixed. #6010 was missing tests, so I wanted to be sure that the initial bug wasn't coming back after replacing the commit that was meant to fix it.

There is no test for the perf regression since I'm not sure we can make one that wouldn't be super-flaky 😞 To give you an idea, the different of install time on my machine was 6s, which could potentially come from various other sources (slow network, ...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Legit. I was just looking for something that would fail if these changes accidentally reverted or changed somehow. Maybe as a unit test for untar-stream when it's moved and for fs?

const now = new Date();

const fs = require('fs');
const patchedFs = Object.assign({}, fs, {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a good solution in regards to tar-fs. It does kind of bloat the file a little. Do you think it'll be possible to move untar-stream outside somehow?

arcanis added a commit that referenced this pull request Aug 3, 2018
* Revert "fix(util/fs): use file content instead of mtime to determine equality (#6010)"

This reverts commit c53d039.

* Bumps the cache version

* Forces the mtime to be based on the time the archives got fetched

* Fixes subseconds time assignment (nodejs/node#22070)

* Adds regression tests for #5723

* Fixes tests

* Fixes test on architectures that don't support subsecond precision
arcanis pushed a commit that referenced this pull request Aug 3, 2018
This was referenced Aug 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

yarn 1.9.2 takes a long time on linking dependencies
5 participants