-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rely on git hashes for manifest update #12182
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Happy to land this as-is, but there are a couple of comments which might be worth looking at.
tools/manifest/vcs.py
Outdated
local_changes = self._local_changes() | ||
for result in self.git(*cmd).split("\0")[:-1]: | ||
rel_path = result.split("\t")[-1] | ||
hash = result.split()[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hash
is also a bult-in function so maybe change the variable name.
tools/manifest/sourcefile.py
Outdated
if not self._hash: | ||
with self.open() as f: | ||
content = f.read() | ||
data = "blob " + str(len(content)) + "\0" + content |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I wonder if you profiled "".join(["blob ", str(len(content)), "\0", content])
and "blob %d\0%s" % (len(content), content)
vs this approach where the manifest isn't being generated from git (e.g. if you pass --work
to the update). I don't know which is fastest, but it seems worth trying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Profiled them and seems like joins over tuples are the fastest of dealing with very long strings. On my system, at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, sounds like a good idea to switch to that then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And they'll definitely be quicker on PyPy, given the lack of refcounting means they can't do the ugly CPython hack.
We earlier hashed all in-tree files when updating the manifest. Relying on git hashes instead gives us a speedup of ~50% when running |wpt manifest| With reference to web-platform-tests#11388
Mainly for the .serviceworker. change in tools/manifest/item.py, from web-platform-tests/wpt#12381 port/driver.py is updated accordingly to match upstream wptrunner to use HTTPS for .serviceworker. tests. This fixes an infrastructure/ test. Multiple other .serviceworker. tests are also rebaselined. There have been two recent manifest changes: web-platform-tests/wpt#12182 web-platform-tests/wpt#12563 (version bump) In order to not require a full manifest rebuild once this lands, WPT_BASE_MANIFEST.json needs to be updated. However, updating it in place has proven very difficult due to the size of the resulting diff. Instead, a WPT_BASE_MANIFEST_5.json is created. The old one will be deleted separately. Related Gerrit bug: https://bugs.chromium.org/p/gerrit/issues/detail?id=9640 This also brings in HTTP/2.0 support from web-platform-tests/wpt#12193. For this, new third_party python code is needed, namely h2, hpack, hyperframe, and enum. In addition to adding those to WPTWhiteList, it is also sorted and lines which aren't existing files (i.e., directories and removed files) are dropped. Reviewed at https://chromium-review.googlesource.com/1183425 but size of CL causing trouble updating it on Gerrit. [email protected] Bug: 876717 Change-Id: Ia43bd2eca54883a5d72a48008b6677d0e7187056 Reviewed-on: https://chromium-review.googlesource.com/1196424 Reviewed-by: Robert Ma <[email protected]> Reviewed-by: Philip Jägenstedt <[email protected]> Commit-Queue: Philip Jägenstedt <[email protected]> Cr-Commit-Position: refs/heads/master@{#588057}
We earlier hashed all in-tree files when updating the manifest.
Relying on git hashes instead gives us a speedup of ~50%
when running |wpt manifest|
With reference to #11388