-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to Xxhash128 from sha256 #70723
Conversation
#pragma warning disable CS9191 // The 'ref' modifier for an argument corresponding to 'in' parameter is equivalent to 'in'. Consider using 'in' instead. | ||
Contract.ThrowIfFalse(MemoryMarshal.TryWrite(span, ref Unsafe.AsRef(in this))); | ||
#pragma warning restore CS9191 | ||
} | ||
|
||
public static unsafe HashData FromPointer(HashData* hash) | ||
=> new(hash->Data1, hash->Data2, hash->Data3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never used.
@@ -21,135 +19,41 @@ namespace Microsoft.CodeAnalysis | |||
internal partial record class Checksum | |||
{ | |||
// https://github.com/dotnet/runtime/blob/f2db6d6093c54e5eeb9db2d8dcbe15b2db92ad8c/src/libraries/System.Security.Cryptography.Algorithms/src/System/Security/Cryptography/SHA256.cs#L18-L19 | |||
private const int SHA256HashSizeBytes = 256 / 8; | |||
private const int XXHash128SizeBytes = 128 / 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why we need this constant and the HashSize constant. But was keeping this roughly the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you follow up with cleanup-refactoring?
Other than having this duplicated I'm not sure about the value of having the Create methods in a separate file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. i'm also intending to measure why we need Checksum and HashData when it seems like we could just have the latter. The former is technically good if we expect a few actual distinct checksums, and tons of pointers to them. The latter would be good if we expect less pointers and most of the values to just be needed in the place where we're pointing at them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might have been a 32-bit process the last time this was evaluated, but the concern last time we looked was the number of pointers could exceed the number of instances.
|
||
// Note: number of elements here should be computed based on what we need from our various collection-with-children objects. | ||
private static readonly ObjectPool<byte[]>[] s_checksumByteArrayPool = | ||
Enumerable.Range(0, 11).Select(i => new ObjectPool<byte[]>(() => new byte[HashSize * i])).ToArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no longer needed. the xxhash128 has all the functionality we need.
using var pooledHash = s_incrementalHashPool.GetPooledObject(); | ||
Span<byte> buffer = stackalloc byte[SharedPools.ByteBufferSize]; | ||
pooledHash.Object.Append(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xxhash128 already has support for appending a stream to it. woohoo.
@@ -19,7 +19,7 @@ public void ValidateChecksumFromSpanSameAsChecksumFromBytes1() | |||
var checksumA = Checksum.Create(checksum1, checksum2); | |||
|
|||
// Running this test on multiple target frameworks with the same expectation ensures the results match | |||
Assert.Equal(Checksum.FromBase64String("N30m5jwVeMZzWpy9cbQbtSYHoXU="), checksumA); | |||
Assert.Equal(Checksum.FromBase64String("RRbspG+E4ziBC5hOWyrfCQ=="), checksumA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected change. we are legitimately using a different hash.
the tests do check though that we get the same result on netfx vs netcore (that way we can have roslyn communicating across process while still agreeing on the results).
@DustinCampbell We should probably switch Razor at the same time if we are making this change |
@@ -119,29 +120,24 @@ public override int GetHashCode() | |||
[DataContract, StructLayout(LayoutKind.Explicit, Size = HashSize)] | |||
public readonly record struct HashData( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. this has to run on netfx and core.
eng/Versions.props
Outdated
@@ -245,6 +245,7 @@ | |||
<SystemDrawingCommonVersion>7.0.0</SystemDrawingCommonVersion> | |||
<SystemIOFileSystemVersion>4.3.0</SystemIOFileSystemVersion> | |||
<SystemIOFileSystemPrimitivesVersion>4.3.0</SystemIOFileSystemPrimitivesVersion> | |||
<SystemIOHashingVersion>8.0.0-rc.2.23479.6</SystemIOHashingVersion> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably won't want to ship a prerelease of this dependency with VS. Can you use the latest stable version instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot. the latest stable does not include xxhash128 (the actual hashing function we're trying to use).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. In that case, we'll just ask that you help maintain the version on the VS repo side to switch to the stable 8.0 version when it's available.
Going to hold off on this till next tuesday when 8.0 releases. |
@ToddGrun @jasonmalinowski ptal. Source-Build update is here: dotnet/source-build-reference-packages#822 |
Note: this will need changes to VS build to pull in System.IO.Hashing. @jasonmalinowski are you a good person to coordinate that with? |
src/Workspaces/Core/Portable/Workspace/Solution/Checksum_Factory.cs
Outdated
Show resolved
Hide resolved
@jaredpar @jasonmalinowski the source-build change went in with: dotnet/source-build-reference-packages#822 Do you know why this is still having issues building on our side? Thanks! :) |
You may need to pick up the latest version of SBRP - similar to #69901. |
Ideally long term a darc subscription would be setup to automatically flow new versions in. |
Fixes AB#1880936
xxhash128 is smaller, and much faster (around 40x so) while still retaining the collision resistance we need for non-cryptographic content hashing. The API is also much easier to use, resulting in a ton of complex netfx vs netcore code being deleted.