Make string.FastAllocateString public #36989
Comments
Making this public would encourage mutating an immutable type. It's internal for a good reason. I would not want to make this public. string.Create was exposed as an alternative.
@stephentoub Maybe provide that ability via string.Create without a delegate?
@vanbukin How would that work?
There could be an unsafe version which takes a function pointer, which would allow the delegate-like scenario without forcing a delegate allocation.
@vanbukin Adding overloads to the |
@svick It could be like this: Here and now it’s only possible inside System.Private.CoreLib, because FastAllocateString is internal. The preferred way to do that now is to call string.Create, provide a delegate and a fake “state”, and get the span pointer inside the delegate. But I don’t want the overhead of delegate creation and invocation; that’s why I’m asking to make FastAllocateString public.
@tannergooding sounds like a good idea |
I seem to remember @jkotas having an opinion about that, but I don't remember what it was.
Can you share more details about in what situation this matters? A delegate can be cached (and often is by the C# compiler), so the concern is about the cost of delegate invocation? A string is being allocated and populated; the delegate invocation shows up as being negatively impactful? |
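For readers following along, here is a minimal sketch of the string.Create pattern under discussion. The `HexFormatter` class and `ToHex8` method are hypothetical names made up for illustration; only `string.Create` itself comes from the framework. Because the lambda is `static` and captures nothing, the compiler is free to cache a single delegate instance, so the remaining per-call cost is the delegate invocation rather than an allocation.

```csharp
using System;

public static class HexFormatter
{
    // Hypothetical helper: formats a 32-bit value as 8 uppercase hex digits.
    public static string ToHex8(uint value)
    {
        // The value travels through the TState parameter, so the lambda
        // captures nothing and its delegate can be cached by the compiler.
        return string.Create(8, value, static (span, v) =>
        {
            // Fill from the last character backwards, one nibble at a time.
            for (int i = span.Length - 1; i >= 0; i--)
            {
                uint nibble = v & 0xF;
                span[i] = (char)(nibble < 10 ? '0' + nibble : 'A' + (nibble - 10));
                v >>= 4;
            }
        });
    }
}
```

Calling `HexFormatter.ToHex8(0x1A)` yields `"0000001A"`. The string is only writable inside the callback; once `string.Create` returns, it is treated as immutable like any other string.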
Since
Oh, I assumed you meant something else, not just renaming
The delegate allocation overhead is tiny, since it's just one allocation for the application runtime, if you use |
I think yes.
Not renaming. Make FastAllocateString public, or provide an additional string.Create overload that takes no state or delegate, only the string size, to avoid any allocations.
Yes, but that might be helpful in performance-sensitive places, like string formatting of custom structures or string materialization in database providers. And this is not only about the possible delegate allocation, but also about the delegate call. It's not free.
@svick Here are my results. Looks like 30%.
@svick I've updated the results and added benchmarks for .NET 5.0 preview 4. Same difference.
Per earlier comments, string instances are contractually immutable. The team is largely against adding APIs which encourage developers to mutate these instances. Such code is not guaranteed to work correctly across all versions of the runtime. I don't think benchmarks are going to do much to sway the majority position here. There might be opportunity here to improve the performance of the existing But to repeat, any API which promotes the idea of returning a string instance to the caller and saying "sure, go ahead and pin it and mutate its contents" is anathema to our team. |
@GrabYourPitchforks OK. Tanner's suggestion looks good. Let's continue the discussion toward adding a string.Create overload that takes a pointer instead of exposing FastAllocateString. Can we continue here, or do I need to create a new issue?
Feel free to explore that idea here. |
The difference between a delegate call and a function pointer calli call is a few instructions. Adding an overload that takes a function pointer is unlikely to make a significant difference. I believe that most of the overhead of passing in a delegate comes from the extra call(s): setting up the frame, passing the arguments around, saving callee-saved registers, and so on. This overhead needs to be eliminated to get on par with calling FastAllocateString directly. Teaching the JIT to inline delegates would be the most natural way to do it. I know @AndyAyersMS looked into it, but it is not easy.
I investigated the possibility of adding a new API that would utilize the old "generic struct" trick. The conclusion is that there is room for improvement, but not a whole lot.
The setup
The goal was to see if the following API could achieve the performance of the fully "unsafe" method that uses
static string GenericStringCreate<T>(int length, T initializer) where T : IStringInitializer
interface IStringInitializer
{
    void Initialize(Span<char> span);
}
The testing consisted of measuring the time it would take to pass parameters to a dummy function with this signature:
[MethodImpl(MethodImplOptions.NoInlining)]
static unsafe void Payload(char* dest, Uuid* uuidPtr) { }
unsafe struct Uuid { fixed byte Bytes[16]; }
Tested "means of delivery" (tuned for best performance):
public const int Length = 1; // To exacerbate the differences
// Baseline: allocate the string directly and write through a pinned pointer.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static unsafe string StringifyWithUnsafe(Uuid uuid)
{
    var result = CoreLib.FastAllocateString(Length);
    fixed (char* ptr = &result.GetPinnableReference())
    {
        Payload(ptr, &uuid);
    }
    return result;
}
// Candidate: generic struct initializer, no delegate involved.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static unsafe string StringifyWithGenericStringCreate(Uuid uuid)
{
    return StringHelpers.GenericStringCreate(Length, new UuidStringInitializer(&uuid));
}
// Existing API: string.Create with a delegate.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static unsafe string StringifyWithStringCreate(Uuid uuid)
{
    var uuidString = string.Create(Length, uuid, (s, u) =>
    {
        fixed (char* ptr = &MemoryMarshal.GetReference(s))
        {
            Payload(ptr, &u);
        }
    });
    return uuidString;
}
readonly unsafe struct UuidStringInitializer : IStringInitializer
{
    private readonly Uuid* _uuidPtr;
    public UuidStringInitializer(Uuid* uuidPtr) => _uuidPtr = uuidPtr;
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Initialize(Span<char> span)
    {
        fixed (char* ptr = &MemoryMarshal.GetReference(span))
        {
            Payload(ptr, _uuidPtr);
        }
    }
}
Where
static unsafe string GenericStringCreate<T>(int length, T initializer) where T : IStringInitializer
{
    if (length <= 0)
    {
        if (length == 0)
        {
            return string.Empty;
        }
        throw new ArgumentOutOfRangeException(nameof(length));
    }
    string result = CoreLib.FastAllocateString(length);
    initializer.Initialize(MemoryMarshal.CreateSpan(ref CoreLib.GetRawStringData(result), length));
    return result;
}
The benchmarks
public class Tests
{
    public static readonly Uuid DefaultUuid = default;
    [Benchmark(Baseline = true)]
    [MethodImpl(MethodImplOptions.NoInlining)]
    public unsafe string UnsafeOverhead()
    {
        return UuidManipulation.StringifyWithUnsafe(DefaultUuid);
    }
    [Benchmark]
    [MethodImpl(MethodImplOptions.NoInlining)]
    public unsafe string StringGenericCreateOverhead()
    {
        return UuidManipulation.StringifyWithGenericStringCreate(DefaultUuid);
    }
    [Benchmark]
    [MethodImpl(MethodImplOptions.NoInlining)]
    public unsafe string StringCreateOverhead()
    {
        return UuidManipulation.StringifyWithStringCreate(DefaultUuid);
    }
}
Typical results
I reran them multiple times with random statics added to see if layout could influence the timings, but the results do seem to be consistent (or at least not deviating by more than 1 ns between runs). On average, the unsafe version was ahead by ~2 ns, and vanilla
Try
Also, will the GC move allocated strings if they're not pinned? e.g.
public static string ToHexString(this ReadOnlySpan<byte> bytes) {
    var newStr = new string('\0', bytes.Length * 2);
    ref var firstCh = ref Unsafe.AsRef(newStr.GetPinnableReference());
    for (var i = 0; i < bytes.Length; ++i) {
        var b = bytes[i];
        var nib1 = b >> 4;
        var isDig1 = (nib1 - 10) >> 31;
        var ch1 = 55 + nib1 + (isDig1 & -7);
        var nib2 = b & 0xF;
        var isDig2 = (nib2 - 10) >> 31;
        var ch2 = 55 + nib2 + (isDig2 & -7);
        Unsafe.As<char, int>(ref Unsafe.Add(ref firstCh, i * 2))
            = (ch2 << 16) | ch1;
    }
    return newStr;
}
Is there a chance for the string to move during this on a large run? (Given string.Create doesn't appear to pin.) Given this string ctor is basically just the result of FastAllocateString handed out almost directly, and GetPinnableReference is essentially a readonly GetRawStringData:
private string Ctor(char c, int count)
{
    if (count <= 0)
    {
        if (count == 0)
            return Empty;
        throw new ArgumentOutOfRangeException(nameof(count), SR.ArgumentOutOfRange_NegativeCount);
    }
    string result = FastAllocateString(count);
    if (c != '\0') // Fast path null char string
    {
        // ... snipped ...
    }
    return result;
}
public ref readonly char GetPinnableReference() => ref _firstChar;
internal ref char GetRawStringData() => ref _firstChar;
Using the constructor seems to be slower:
[Benchmark(Baseline = true)]
public string CreateWithFastAllocate() => CoreLib.FastAllocateString(32);
[Benchmark]
public string CreateWithNullCharacters() => new string('\0', 32);
I believe no, since The reason that I went with the pointer-based |
I'm closing this issue because the API questions have been answered and because the discussion has veered into encouraging dangerous coding practices. But once more, just to make this painfully clear: It is never safe or supported to mutate the contents of a returned Doesn't matter whether the API is If you mutate a string instance within your own library or application, you are entering unsupported territory. A future framework update could break you. Or - more likely - you'll encounter memory corruption that will be very painful for you or your customers to diagnose. |
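For comparison, the hex conversion asked about earlier in the thread can be written so that the string is only filled inside the string.Create callback and never touched afterwards. This is a minimal sketch, not code from the issue: the `HexStrings` class name is hypothetical, and it takes a `byte[]` instead of a `ReadOnlySpan<byte>` on the assumption that the span cannot be passed as the `TState` generic argument.

```csharp
using System;

public static class HexStrings
{
    // Hypothetical helper: converts bytes to an uppercase hex string
    // without mutating a string after it has been handed out.
    public static string ToHexString(byte[] bytes)
    {
        if (bytes is null) throw new ArgumentNullException(nameof(bytes));

        // The byte array is passed as state, so the lambda captures nothing.
        return string.Create(bytes.Length * 2, bytes, static (chars, src) =>
        {
            const string Digits = "0123456789ABCDEF";
            for (int i = 0; i < src.Length; i++)
            {
                chars[2 * i] = Digits[src[i] >> 4];       // high nibble first
                chars[2 * i + 1] = Digits[src[i] & 0xF];  // then low nibble
            }
        });
    }
}
```

This trades the branchless nibble arithmetic of the earlier snippet for a lookup table, which keeps the sketch short; the character order (high nibble, then low nibble) is the same.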
Background and Motivation
Provide the ability for developers to write libraries that can format something to a string using a pre-allocated, pinned char* pointer without any overhead for a delegate call.
It's also possible now to "export" that function using .ilproj like this, but it works only with the undocumented IgnoresAccessChecksToAttribute
Proposed API
Usage Examples
Preallocate a known-size string, get a pinned char* pointer, and overwrite the string contents like here or here
Alternative Designs
Provide a separate external library (not shipped with System.Private.CoreLib) that exposes that method with the public modifier, or add a string.Create overload without a delegate that only allocates and returns the string.
Risks
That API would need to be supported in future versions of .NET