Vectorized `GetNonRandomizedHashCode` #98838

Ruihan-Yin · 2024-02-23T00:38:07Z

Description

This PR vectorizes the hashing algorithm used in GetNonRandomizedHashCode and GetNonRandomizedHashCodeIgnoreCase with the Vector128 APIs. The motivation is that this method showed up as a hot path in a first-party workload.

In the implementation, an incoming string is hashed on the vector path whenever its length is above the threshold. This gives two hashing paths: short strings (length under the threshold) and long strings (length above the threshold) are hashed in different ways. I'm not sure whether this discrepancy matters, and the current structure might make it hard to scale up to V256 and V512.

We are open to discussing and improving the implementation.

ghost · 2024-02-26T22:07:16Z

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

Still WIP; this is just to run through CI for testing, so there is no need to review yet.

Author:	Ruihan-Yin
Assignees:	-
Labels:	`area-System.Runtime`, `community-contribution`
Milestone:	-


jkotas · 2024-03-29T01:14:34Z

src/libraries/System.Private.CoreLib/src/System/String.Comparison.cs

+                    uint hashed3 = hashVector.GetElement(2);
+                    uint hashed4 = hashVector.GetElement(3);
+
+                    while (length > 4)


I am not sure why you changed this to a while loop. I think `if` was perfectly fine here, and more efficient too.

Ruihan-Yin · 2024-03-29

The original main loop processes the data stream like this:

while(length > 2) { length -= 4 }

There is an extra iteration when the length is 4n-1 (e.g. 7), which consumes the null terminator as well. It also guarantees that at most two characters remain after the loop, so the tail can be handled by the following if statement.

In the updated code, the main loop operates at a larger granularity, so the same trick might read memory beyond the null terminator. That is why I used while here.

jkotas · 2024-03-29

Can the condition in the main while loop be while (length >= 8), with the following check being if (length >= 4)?

I understand that the existing code does tricks with the null terminator to save a few instructions. You do not have to match those tricks.

Ruihan-Yin · 2024-03-29

Sure, thanks for the suggestion, I can try that.
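
To make the loop arithmetic discussed in this review thread concrete, here is a minimal sketch that mirrors only the loop control, not the hashing itself. `TailLoopSketch`, `RunOriginalLoop`, and `RunWideLoop` are hypothetical names introduced for illustration and are not part of the PR; the second method is one possible reading of the `while (length >= 8)` / `if (length >= 4)` suggestion.

```csharp
using System;

public static class TailLoopSketch
{
    // Loop control of the original scalar code: each iteration reads 4 chars.
    // Returns how many chars were read and the value left in `length`.
    public static (int CharsRead, int Residual) RunOriginalLoop(int length)
    {
        int charsRead = 0;
        while (length > 2)
        {
            length -= 4;
            charsRead += 4;
        }
        return (charsRead, length);
    }

    // Assumed shape of the suggested structure: 8 chars per main iteration,
    // then at most one 4-char step. It never reads more chars than `length`.
    public static (int CharsRead, int Residual) RunWideLoop(int length)
    {
        int charsRead = 0;
        while (length >= 8)
        {
            length -= 8;
            charsRead += 8;
        }
        if (length >= 4)
        {
            length -= 4;
            charsRead += 4;
        }
        return (charsRead, length);
    }
}
```

For a 7-character string, `RunOriginalLoop(7)` reports 8 chars read with a residual of -1, which is the extra read of the null terminator described above. `RunWideLoop(7)` reads 4 chars and leaves 3 for a separate tail step.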

Ruihan-Yin · 2024-03-29T18:31:59Z

The failures have been fixed.

I am posting the perf numbers I measured locally. I understand the implementation is still under discussion, but I expect the suggested implementation to be at least as fast as the current one, so the numbers should still be meaningful.

Bare perf numbers with the method:

Note:
The regression on the scalar path is expected: the vector path introduces branching overhead, and the scalar path pays a penalty for bad speculation. Other than that, the numbers on the vector path look good overall.

Micros with ConcurrentDictionary

String length: 1~50

String length: 20~200

Notes:
The default ConcurrentDictionary micro-benchmarks use a dictionary of size 512 and input strings of length 1 to 50. I also tested with strings of length 20 to 200.

Ruihan-Yin · 2024-03-29T18:41:57Z

Benchmark code used for the method:

using System.Diagnostics;
using System.Linq; // Select/ToArray are used in Setup
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Runtime;
using BenchmarkDotNet;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Environments;
using BenchmarkDotNet.Toolchains.CoreRun;
using BenchmarkDotNet.Jobs;

public unsafe class BingSNR
{
    [Params(
        8,          // scalar-path
        32,         // vector-path
        64,
        128,
        512,
        10000
    )]
    public int Size;

    private byte[] _bytes, _sameBytes, _bytesDifferentCase;
    private char[] _characters, _sameCharacters, _charactersDifferentCase;

    [GlobalSetup]
    public void Setup()
    {
        _bytes = new byte[Size];
        _bytesDifferentCase = new byte[Size];

        for (int i = 0; i < Size; i++)
        {
            // let ToLower and ToUpper perform the same amount of work
            _bytes[i] = i % 2 == 0 ? (byte)'a' : (byte)'A';
            _bytesDifferentCase[i] = i % 2 == 0 ? (byte)'A' : (byte)'a';
        }
        _sameBytes = _bytes.ToArray();
        _characters = _bytes.Select(b => (char)b).ToArray();
        _sameCharacters = _characters.ToArray();
        _charactersDifferentCase = _bytesDifferentCase.Select(b => (char)b).ToArray();
    }

    [Benchmark]
    public unsafe int Hash_Vector()
    {
        fixed(char* pStr = _characters)
        {
            int ret = GetNonRandomizedHashCode_Vector(pStr, Size);
            return ret;
        }
    }
    
    [Benchmark]
    public int Hash()
    {
        fixed(char* pStr = _characters)
        {
            int ret = GetNonRandomizedHashCode(pStr, Size);
            return ret;
        }
    }

    [Benchmark]
    public unsafe int HashIgnoreCase_Vector()
    {
        fixed(char* pStr = _characters)
        {
            int ret = GetNonRandomizedHashCodeIgnoreCase_Vector(pStr, Size);
            return ret;
        }
    }


    [Benchmark]
    public int HashIgnoreCase()
    {
        fixed(char* pStr = _characters)
        {
            int ret = GetNonRandomizedHashCodeIgnoreCase(pStr, Size);
            return ret;
        }
    }

    static unsafe int GetNonRandomizedHashCode_Vector(char* src, int length)
    {
        uint* ptr = (uint*)src;
        uint hash1 = (5381 << 16) + 5381;
        uint hash2 = hash1;
        uint hash3 = hash1;
        uint hash4 = hash1;

        if((Vector128.IsHardwareAccelerated && length >= 2 * Vector128<ushort>.Count))
        {
            Vector128<uint> hashVector = Vector128.Create(hash1);

            while (length > 8)
            {
                Vector128<uint> srcVec = Vector128.Load(ptr);
                length -= 8;
                hashVector = (hashVector + RotateLeft(hashVector, 5)) ^ srcVec;
                ptr += 4;
            }

            uint hashed1 = hashVector.GetElement(0);
            uint hashed2 = hashVector.GetElement(1);
            uint hashed3 = hashVector.GetElement(2);
            uint hashed4 = hashVector.GetElement(3);

            while (length > 4)
            {
                uint p0 = ptr[0];
                uint p1 = ptr[1];

                length -= 4;
                hashed3 = (BitOperations.RotateLeft(hashed3, 5) + hashed3) ^ (p0);
                hashed4 = (BitOperations.RotateLeft(hashed4, 5) + hashed4) ^ (p1);
                ptr += 2;
            }

            while (length > 0)
            {
                uint p0 = ptr[0];

                length -= 2;
                hashed4 = (BitOperations.RotateLeft(hashed4, 5) + hashed4) ^ (p0);
                ptr += 1;
            }

            uint res = (((BitOperations.RotateLeft(hashed1, 5) + hashed1)) ^ hashed3) + 1566083941 * (((BitOperations.RotateLeft(hashed2, 5) + hashed2)) ^ hashed4);
            return (int)res;
        }

        while (length > 8)
        {
            uint p0 = ptr[0];
            uint p1 = ptr[1];
            uint p2 = ptr[2];
            uint p3 = ptr[3];
            length -= 8;
            // hashVector = (hashVector + RotateLeft(hashVector, 5)) ^ srcVec;
            hash1 = (BitOperations.RotateLeft(hash1, 5) + hash1) ^ (p0);
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ (p1);
            hash3 = (BitOperations.RotateLeft(hash3, 5) + hash3) ^ (p2);
            hash4 = (BitOperations.RotateLeft(hash4, 5) + hash4) ^ (p3);
            ptr += 4;
        }

        while (length > 4)
        {
            uint p0 = ptr[0];
            uint p1 = ptr[1];
            length -= 4;
            // Consumes four chars; only the final loop can read the null terminator (when an odd length remains)
            hash3 = (BitOperations.RotateLeft(hash3, 5) + hash3) ^ (p0);
            hash4 = (BitOperations.RotateLeft(hash4, 5) + hash4) ^ (p1);
            ptr += 2;
        }

        while (length > 0)
        {
            uint p0 = ptr[0];

            length -= 2;
            hash4 = (BitOperations.RotateLeft(hash4, 5) + hash4) ^ (p0);
            ptr += 1;
        }

        uint resOnScalarPath = (((BitOperations.RotateLeft(hash1, 5) + hash1)) ^ hash3) + 1566083941 * (((BitOperations.RotateLeft(hash2, 5) + hash2)) ^ hash4);
        return (int)resOnScalarPath;
    }

    static unsafe int GetNonRandomizedHashCodeIgnoreCase_Vector(char* src, int length)
    {
        uint* ptr = (uint*)src;
        uint hash1 = (5381 << 16) + 5381;
        uint hash2 = hash1;
        uint hash3 = hash1;
        uint hash4 = hash1;
        const uint NormalizeToLowercase = 0x0020_0020u;

        if((Vector128.IsHardwareAccelerated && length >= 2 * Vector128<ushort>.Count))
        {
            Vector128<uint> hashVector = Vector128.Create(hash1);
            Vector128<uint> NormalizeToLowercaseVec = Vector128.Create(NormalizeToLowercase);
            while (length > 8)
            {
                Vector128<uint> srcVec = Vector128.Load(ptr);
                length -= 8;
                hashVector = (hashVector + RotateLeft(hashVector, 5)) ^ (srcVec | NormalizeToLowercaseVec);
                ptr += 4;
            }

            uint hashed1 = hashVector.GetElement(0);
            uint hashed2 = hashVector.GetElement(1);
            uint hashed3 = hashVector.GetElement(2);
            uint hashed4 = hashVector.GetElement(3);

            while (length > 4)
            {
                uint p0 = ptr[0];
                uint p1 = ptr[1];

                length -= 4;
                hashed3 = (BitOperations.RotateLeft(hashed3, 5) + hashed3) ^ (p0 | NormalizeToLowercase);
                hashed4 = (BitOperations.RotateLeft(hashed4, 5) + hashed4) ^ (p1 | NormalizeToLowercase);
                ptr += 2;
            }

            while (length > 0)
            {
                uint p0 = ptr[0];

                length -= 2;
                hashed4 = (BitOperations.RotateLeft(hashed4, 5) + hashed4) ^ (p0 | NormalizeToLowercase);
                ptr += 1;
            }

            uint res = (((BitOperations.RotateLeft(hashed1, 5) + hashed1)) ^ hashed3) + 1566083941 * (((BitOperations.RotateLeft(hashed2, 5) + hashed2)) ^ hashed4);
            return (int)res;
        }

        while (length > 8)
        {
            uint p0 = ptr[0];
            uint p1 = ptr[1];
            uint p2 = ptr[2];
            uint p3 = ptr[3];
            length -= 8;
            // hashVector = (hashVector + RotateLeft(hashVector, 5)) ^ srcVec;
            hash1 = (BitOperations.RotateLeft(hash1, 5) + hash1) ^ (p0 | NormalizeToLowercase);
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ (p1 | NormalizeToLowercase);
            hash3 = (BitOperations.RotateLeft(hash3, 5) + hash3) ^ (p2 | NormalizeToLowercase);
            hash4 = (BitOperations.RotateLeft(hash4, 5) + hash4) ^ (p3 | NormalizeToLowercase);
            ptr += 4;
        }

        while (length > 4)
        {
            uint p0 = ptr[0];
            uint p1 = ptr[1];
            length -= 4;
            // Consumes four chars; only the final loop can read the null terminator (when an odd length remains)
            hash3 = (BitOperations.RotateLeft(hash3, 5) + hash3) ^ (p0 | NormalizeToLowercase);
            hash4 = (BitOperations.RotateLeft(hash4, 5) + hash4) ^ (p1 | NormalizeToLowercase);
            ptr += 2;
        }

        while (length > 0)
        {
            uint p0 = ptr[0];

            length -= 2;
            hash4 = (BitOperations.RotateLeft(hash4, 5) + hash4) ^ (p0 | NormalizeToLowercase);
            ptr += 1;
        }

        uint resOnScalarPath = (((BitOperations.RotateLeft(hash1, 5) + hash1)) ^ hash3) + 1566083941 * (((BitOperations.RotateLeft(hash2, 5) + hash2)) ^ hash4);
        return (int)resOnScalarPath;
    }

    static Vector128<uint> RotateLeft(Vector128<uint> src, int control)
    {
        return Vector128.BitwiseOr(Vector128.ShiftLeft(src, control), Vector128.ShiftRightLogical(src, 32 - control));
    }
    static unsafe int GetNonRandomizedHashCode(char* src, int length)
    {
        uint hash1 = (5381 << 16) + 5381;
        uint hash2 = hash1;
        uint* ptr = (uint*)src;
        while (length > 2)
        {
            length -= 4;
            hash1 = (BitOperations.RotateLeft(hash1, 5) + hash1) ^ ptr[0];
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ ptr[1];
            ptr += 2;
        }
        if (length > 0)
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ ptr[0];
        return (int)(hash1 + (hash2 * 1566083941));
    }

    static unsafe int GetNonRandomizedHashCodeIgnoreCase(char* src, int length)
    {
        uint hash1 = (5381 << 16) + 5381;
        uint hash2 = hash1;
        const uint NormalizeToLowercase = 0x0020_0020u;

        uint* ptr = (uint*)src;
        while (length > 2)
        {
            length -= 4;
            hash1 = (BitOperations.RotateLeft(hash1, 5) + hash1) ^ (ptr[0] | NormalizeToLowercase);
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ (ptr[1] | NormalizeToLowercase);
            ptr += 2;
        }
        if (length > 0)
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ (ptr[0] | NormalizeToLowercase);
        return (int)(hash1 + (hash2 * 1566083941));
    }

    static void Main(string[] args)
    {
        var switcher = new BenchmarkSwitcher(new[] { typeof(BingSNR) });
        switcher.Run(args);
    }
}
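
The benchmark's `RotateLeft` helper emulates a per-lane rotate with two shifts and an OR. The following is a small sanity-check sketch, assuming a rotate count between 1 and 31, that compares each lane against `BitOperations.RotateLeft`. `VectorRotateSketch` and `MatchesScalar` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;

public static class VectorRotateSketch
{
    // Same shape as the helper in the benchmark: (v << count) | (v >> (32 - count)).
    // Assumption: count is in the range 1..31.
    public static Vector128<uint> RotateLeft(Vector128<uint> v, int count) =>
        Vector128.ShiftLeft(v, count) | Vector128.ShiftRightLogical(v, 32 - count);

    // Returns true when every lane of the vector rotate equals the scalar rotate.
    public static bool MatchesScalar(uint a, uint b, uint c, uint d, int count)
    {
        Vector128<uint> rotated = RotateLeft(Vector128.Create(a, b, c, d), count);
        return rotated.GetElement(0) == BitOperations.RotateLeft(a, count)
            && rotated.GetElement(1) == BitOperations.RotateLeft(b, count)
            && rotated.GetElement(2) == BitOperations.RotateLeft(c, count)
            && rotated.GetElement(3) == BitOperations.RotateLeft(d, count);
    }
}
```

A check like this only confirms that the vector update step matches the scalar one lane by lane. It says nothing about hash quality or about how the four lanes are folded at the end.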

jkotas · 2024-03-29T19:39:11Z

perf numbers

So the break-even point is around 32 characters, and it is a regression for anything smaller than that? What can be done to address the regression for smaller strings? Also, do we know what is the string length distribution in the Bing scenarios that you are trying to improve?

Ruihan-Yin · 2024-03-29T19:50:09Z

> So the break-even point is around 32 characters, and it is a regression for anything smaller than that? What can be done to address the regression for smaller strings?

The regression mostly comes from branch misprediction. In this method the string length is random, which makes it hard for the branch predictor to learn any pattern. I have not been able to come up with a good idea to fix this.

> Also, do we know what is the string length distribution in the Bing scenarios that you are trying to improve?

We did collect the distribution directly from the app, along with the benchmark numbers, but I am not sure it is appropriate to share them here, since they come from a closed-source application.

jkotas · 2024-03-29T20:06:34Z

> The regression mostly comes from the branch misprediction

Where are the branch mispredictions for the bare perf numbers that show the worst regressions? The benchmarks are set up to run on a string of the same length in a loop. There should not be any branch mispredictions.

Also, I have noticed that your bare perf numbers are for char[], but the actual use case is for string. char[] and string payloads have different typical alignment.

Ruihan-Yin · 2024-03-29T21:16:24Z

> Where are the branch mispredictions for the bare perf numbers that show the worst regressions? The benchmarks are set up to run on a string of the same length in a loop. There should not be any branch mispredictions.

I drew the wrong conclusion about the bare perf numbers. You are right: all the inputs have the same length, so misprediction is not the issue in this case. I was looking at the trace from the Dictionary micro-benchmarks.

Attached is the trace with the posted micros:

The highlighted line is the one taking the longest time. I suppose this is because the folding logic for 4 elements is more complex than the original folding logic, which only needs to handle 2 elements.

> Also, I have noticed that your bare perf numbers are for char[], but the actual use case is for string. char[] and string payloads have different typical alignment.

Thanks for pointing that out. Attached are the updated performance numbers:

This is the updated code:

    [Params(
        8,          // scalar-path
        32,         // vector-path
        64,
        128,
        512,
        10000
    )]
    public int Size;

    private string _string;

    [GlobalSetup]
    public void Setup()
    {
        _string = "";
        for (int i = 0; i < Size; i++)
        {
            _string += i % 2 == 0 ? 'a' : 'A';
        }        
    }

    [Benchmark]
    public unsafe int Hash_Vector()
    {
        fixed(char* pStr = _string)
        {
            int ret = GetNonRandomizedHashCode_Vector(pStr, Size);
            return ret;
        }
    }
    
    [Benchmark]
    public unsafe int Hash()
    {
        fixed(char* pStr = _string)
        {
            int ret = GetNonRandomizedHashCode(pStr, Size);
            return ret;
        }
    }

    [Benchmark]
    public unsafe int HashIgnoreCase_Vector()
    {
        fixed(char* pStr = _string)
        {
            int ret = GetNonRandomizedHashCodeIgnoreCase_Vector(pStr, Size);
            return ret;
        }
    }


    [Benchmark]
    public unsafe int HashIgnoreCase()
    {
        fixed(char* pStr = _string)
        {
            int ret = GetNonRandomizedHashCodeIgnoreCase(pStr, Size);
            return ret;
        }
    }

jkotas · 2024-04-01T03:58:14Z

It is not clear what the quality of the hashcodes produced by the new vectorized algorithm is. Have you done any analysis of the hashcode quality produced by the new algorithm? The existing GetNonRandomizedHashCode algorithm is not great to start with: "GetNonRandomizedHashCode produces too many collisions for small strings" (#92556) has some data about it.
For large strings, the new vectorized algorithm seems to be slower than some of the existing well-known non-cryptographic algorithms that we have implemented in the BCL. I have compared the algorithm implementation in this PR with XxHash3.HashToUInt64, and XxHash3 runs 30% faster for string length 512. Are you able to replicate this result? I think we should consider using the well-known non-cryptographic hash algorithms instead of inventing our own to address this and the previous point. We cannot take a dependency on System.IO.Hashing from CoreLib for XxHash3, but we should be able to include an ifdef'ed version of the XxHash3 implementation in CoreLib to reuse some of the code. "Is it time to replace Marvin?" (#85206) is related to this, and it has some suggestions for how to further optimize XxHash3 for this case.
For small strings, there seems to be a significant performance regression. This can be mitigated by using the existing algorithm for strings up to a certain size.

Ruihan-Yin · 2024-04-02T21:23:07Z

Thanks for the feedback!

> It is not clear what the quality of the hashcodes produced by the new vectorized algorithm is. Have you done any analysis of the hashcode quality produced by the new algorithm? The existing GetNonRandomizedHashCode algorithm is not great to start with: "GetNonRandomizedHashCode produces too many collisions for small strings" (#92556) has some data about it.

Using the code given in the issue, I got the same collision count as the number reported there. The generated hash codes differ from those produced by the original algorithm, but the collision count is unchanged with the given setup. This implementation may not resolve the known collision issue.
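
A rough sketch of what such a collision count can look like is given below. It does not reproduce the exact setup from issue #92556, which is not shown in this thread. `CollisionCountSketch`, `ReadPair`, `NonRandomizedHash`, and `CountCollisions` are hypothetical names; the hash is a safe-code transcription of the original scalar `GetNonRandomizedHashCode` from the benchmark above, assuming little-endian char pairs and treating any position at or past the end of the string as the null terminator.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class CollisionCountSketch
{
    // Reads two UTF-16 chars as one little-endian uint.
    // Positions at or past the end read as 0, standing in for the terminator.
    static uint ReadPair(string s, int i)
    {
        uint lo = i < s.Length ? s[i] : 0u;
        uint hi = i + 1 < s.Length ? s[i + 1] : 0u;
        return lo | (hi << 16);
    }

    // Safe-code transcription of the original scalar algorithm.
    public static int NonRandomizedHash(string s)
    {
        uint hash1 = (5381 << 16) + 5381;
        uint hash2 = hash1;
        int length = s.Length;
        int i = 0;
        while (length > 2)
        {
            length -= 4;
            hash1 = (BitOperations.RotateLeft(hash1, 5) + hash1) ^ ReadPair(s, i);
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ ReadPair(s, i + 2);
            i += 4;
        }
        if (length > 0)
            hash2 = (BitOperations.RotateLeft(hash2, 5) + hash2) ^ ReadPair(s, i);
        return (int)(hash1 + (hash2 * 1566083941));
    }

    // Counts inputs whose hash code was already produced by an earlier input.
    public static int CountCollisions(IEnumerable<string> inputs)
    {
        var seen = new HashSet<int>();
        int collisions = 0;
        foreach (string s in inputs)
        {
            if (!seen.Add(NonRandomizedHash(s)))
                collisions++;
        }
        return collisions;
    }
}
```

Feeding the same set of distinct strings through two candidate hash functions and comparing the two counts gives a first-order comparison only. As the rest of the thread points out, an equal collision count on one data set is not a full analysis of hash quality.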

> For large strings, the new vectorized algorithm seems to be slower than some of the existing well-known non-cryptographic algorithms that we have implemented in the BCL. I have compared the algorithm implementation in this PR with XxHash3.HashToUInt64, and XxHash3 runs 30% faster for string length 512. Are you able to replicate this result? I think we should consider using the well-known non-cryptographic hash algorithms instead of inventing our own to address this and the previous point. We cannot take a dependency on System.IO.Hashing from CoreLib for XxHash3, but we should be able to include an ifdef'ed version of the XxHash3 implementation in CoreLib to reuse some of the code. "Is it time to replace Marvin?" (#85206) is related to this, and it has some suggestions for how to further optimize XxHash3 for this case.

Locally, I can reproduce similar numbers by simply calling the System.IO.Hashing library function as suggested: XxHash3 is around 40% faster than the proposed implementation and 64% faster than the original implementation. (The benchmark code is attached below; the inputs differ slightly because of the library function's interface.)

> For small strings, there seems to be a significant performance regression. This can be mitigated by using the existing algorithm for strings up to a certain size.

I am not sure what "existing algorithm" refers to. Do you mean the original one? I believe that would lead to a mismatch between the hash codes generated on the scalar path and on the vector path.

    [Params(
        512
    )]
    public int Size;

    private string _string;

    [GlobalSetup]
    public void Setup()
    {
        _string = "";
        for (int i = 0; i < Size; i++)
        {
            _string += i % 2 == 0 ? 'a' : 'A';
        }
    }

    [Benchmark]
    public unsafe int Hash_Vector()
    {
        fixed(char* pStr = _string)
        {
            int ret = GetNonRandomizedHashCode_Vector(pStr, Size);
            return ret;
        }
    }
    
    [Benchmark]
    public unsafe int Hash()
    {
        fixed(char* pStr = _string)
        {
            int ret = GetNonRandomizedHashCode(pStr, Size);
            return ret;
        }
    }

    [Benchmark]
    public unsafe ulong Hash_XxHash()
    {
        ulong ret = XxHash3.HashToUInt64(MemoryMarshal.AsBytes(_string.AsSpan()));
        return ret;
    }

jkotas · 2024-04-02T22:55:03Z

> do you mean the original one? I believe that would lead to mismatch in the generated hash code on scalar path and vector path.

It would not lead to a mismatch as long as the existing algorithm is unconditionally used for small strings and the vectorized algorithm is unconditionally used for long strings. Something like:

int GetNonRandomizedHashCode()
{
    if (length > 32) // Note there is no `Vector128.IsHardwareAccelerated` check here
        return GetNonRandomizedHashCodeForLongString();

    ... existing code or some other implementation optimized for small sizes ...
}

int GetNonRandomizedHashCodeForLongString()
{
    ... the vectorized implementation ...
}
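
A runnable sketch of this dispatch pattern follows. It is not the PR's code: the short-string hash is a deliberately simple stand-in, the long-string hash is a simplified 4-lane variant, and `LengthDispatchSketch` and all of its member names are hypothetical. The point it illustrates is that the algorithm is selected by length alone, while hardware support only selects between two implementations that produce the same result.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

public static class LengthDispatchSketch
{
    const int Threshold = 32;                  // chars; value taken from the pseudocode above
    const uint Seed = (5381 << 16) + 5381;

    // The choice of algorithm depends only on length, never on hardware support.
    public static int Hash(ReadOnlySpan<char> s) =>
        s.Length > Threshold ? HashLong(s) : HashShort(s);

    // Stand-in for "existing code or some other implementation optimized for small sizes".
    public static int HashShort(ReadOnlySpan<char> s)
    {
        uint h = Seed;
        foreach (char c in s) h = Mix(h, c);
        return (int)h;
    }

    // Hardware support picks an implementation, not a different hash value.
    public static int HashLong(ReadOnlySpan<char> s) =>
        Vector128.IsHardwareAccelerated ? HashLongVector(s) : HashLongScalar(s);

    public static int HashLongScalar(ReadOnlySpan<char> s)
    {
        ReadOnlySpan<uint> words = MemoryMarshal.Cast<char, uint>(s);
        uint h0 = Seed, h1 = Seed, h2 = Seed, h3 = Seed;
        int i = 0;
        for (; i + 4 <= words.Length; i += 4)
        {
            h0 = Mix(h0, words[i]);
            h1 = Mix(h1, words[i + 1]);
            h2 = Mix(h2, words[i + 2]);
            h3 = Mix(h3, words[i + 3]);
        }
        return Fold(h0, h1, h2, h3, s, i * 2);
    }

    public static int HashLongVector(ReadOnlySpan<char> s)
    {
        ReadOnlySpan<uint> words = MemoryMarshal.Cast<char, uint>(s);
        Vector128<uint> h = Vector128.Create(Seed);
        int i = 0;
        for (; i + 4 <= words.Length; i += 4)
        {
            Vector128<uint> data = Vector128.Create<uint>(words.Slice(i, 4));
            Vector128<uint> rotated = Vector128.ShiftLeft(h, 5) | Vector128.ShiftRightLogical(h, 27);
            h = (rotated + h) ^ data;
        }
        return Fold(h.GetElement(0), h.GetElement(1), h.GetElement(2), h.GetElement(3), s, i * 2);
    }

    static uint Mix(uint h, uint data) => (BitOperations.RotateLeft(h, 5) + h) ^ data;

    // Hashes the chars not covered by the 4-word loop, then folds the four lanes.
    static int Fold(uint h0, uint h1, uint h2, uint h3, ReadOnlySpan<char> s, int consumedChars)
    {
        for (int k = consumedChars; k < s.Length; k++) h3 = Mix(h3, s[k]);
        return (int)(Mix(h0, h2) + 1566083941u * Mix(h1, h3));
    }
}
```

Because `HashLongScalar` and `HashLongVector` agree for every input, a given string hashes to the same value whether or not `Vector128` is accelerated, which is the property the length-only check is meant to preserve.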

Note that XxHash3 uses a similar pattern to avoid paying the fixed vectorization overhead for small sizes:

runtime/src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash3.cs

Lines 129 to 144 in 995989e

    
    if (length <= 16)
    {
        return HashLength0To16(sourcePtr, length, (ulong)seed);
    }

    if (length <= 128)
    {
        return HashLength17To128(sourcePtr, length, (ulong)seed);
    }

    if (length <= MidSizeMaxBytes)
    {
        return HashLength129To240(sourcePtr, length, (ulong)seed);
    }

    return HashLengthOver240(sourcePtr, length, (ulong)seed);

Ruihan-Yin · 2024-04-04T17:41:16Z

Thanks for the explanation,

Attached above are the bare perf numbers with the suggested changes. The threshold for the vector path is set to 32. The regression for small inputs is mitigated.

Ruihan-Yin · 2024-04-15T16:36:28Z

Hi @jkotas, a gentle follow-up on this PR: is there any further action required on my side to move this forward, e.g. integrating the suggested changes into the library?

jkotas · 2024-04-15T17:47:48Z

Do you plan to address the regressions for small strings and make the performance for large strings even better by switching to XxHash3?

> This implementation may not resolve the known collision issue.

I am worried about your current implementation making the collision issues worse. Analyzing the quality of hash functions is a non-trivial field.

Ruihan-Yin · 2024-04-16T16:37:58Z

The failures look unrelated.

> Do you plan to address the regressions for small strings and make the performance for large strings even better by switching to XxHash3?

The suggested changes to mitigate the small-string regression have been pushed to the PR. On the large-input side, we currently do not plan to optimize further with XxHash3.

> I am worried about your current implementation making the collision issues worse. Analyzing the quality of hash functions is a non-trivial field.

I do understand that hash code quality analysis is non-trivial, and given my limited experience in this domain, it would be challenging for me to perform the analysis and reach a conclusion.

If this is not the right time for this PR, I understand.

jkotas · 2024-04-16T16:40:00Z

> I do understand that hash code quality analysis is non-trivial, and given my limited experience in this domain, it would be challenging for me to perform the analysis and reach a conclusion.

Right, that is why it is best to use one of the existing algorithms that have already been analyzed.

stephentoub · 2024-07-22T13:27:12Z

Given the discussion and the analysis required and the lack of movement on it for several months, I'm going to close this for now. But thanks for trying to improve things here!

dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Feb 23, 2024

This was referenced Feb 24, 2024

System.Text.Json failing some large file tests #59678

Closed

[browser][MT] Assert failed: Cannot find Promise for JSHandle -2 #98406

Closed

Ruihan-Yin closed this Feb 26, 2024

Ruihan-Yin reopened this Feb 26, 2024

build-analysis bot mentioned this pull request Feb 26, 2024

[browser][MT] HttpClientCancelTest.SendAsync_Cancel_Success #98201

Closed

teo-tsirpanis added area-System.Runtime community-contribution Indicates that the PR has been added by a community member and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Feb 26, 2024

Ruihan-Yin closed this Feb 28, 2024

Ruihan-Yin reopened this Feb 28, 2024

Ruihan-Yin force-pushed the BingSNRUp branch 3 times, most recently from 88f2f40 to 39cfc20 Compare March 6, 2024 18:09

Ruihan-Yin closed this Mar 6, 2024

Ruihan-Yin reopened this Mar 6, 2024

This was referenced Mar 7, 2024

Tracking issue for CI build timeouts #76454

Closed

slow macOS - "##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes." dotnet/dnceng#1883

Open

Ruihan-Yin added 7 commits March 18, 2024 15:46

Improve the performance for GetNonRandomizedHashCode

a108bef

Improve the performance for GetNonRandomizedHashCode

8e38b51

Ensure all the strings are hashed on the same path when SSE is availa…

fffaa7a

…ble.

bug fix attempt

e8c723c

bug fix attempt

535a890

bug fix attempt

ded022e

bug fix attempt

1d20ede

Ruihan-Yin force-pushed the BingSNRUp branch from 8958838 to 1d20ede Compare March 18, 2024 22:46

This was referenced Mar 19, 2024

[wasm] System.Text.Json.Tests running out of memory #98578

Closed

GC\Regressions\v2.0-beta2\452950\452950\452950.cmd failing on Mono minijit Windows x64 #99729

Open

Build linux-x64 Debug Mono_MiniJIT_LibrariesTests failure #99942

Closed

Ruihan-Yin added 4 commits March 28, 2024 13:43

make the expected collision point to be properly represented by int

e2f7f37

update test code, ensure to get the correct collision point

caccc70

bug fix

3faabaa

bug fix

110a29e

jkotas reviewed Mar 29, 2024

update the assert check.

545bdba

Resolve comments.

c9a9b62

build-analysis bot mentioned this pull request Mar 29, 2024

System.Security.Cryptography.X509Certificates.Tests.ChainTests.BuildInvalidSignatureTwice failure #82837

Open

Update with suggested changes from the review.

e20d484

build-analysis bot mentioned this pull request Apr 16, 2024

Test failure: PlatformNotSupportedException: Blocking wait is not supported on the JS interop threads #101100

Closed

stephentoub added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jul 9, 2024

stephentoub closed this Jul 22, 2024

github-actions bot locked and limited conversation to collaborators Aug 22, 2024
