-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve XmlSerializationWriter.WriteTypedPrimitive #76436
Improve XmlSerializationWriter.WriteTypedPrimitive #76436
Conversation
Tagging subscribers to this area: @dotnet/area-system-xml |
I've created following benchmark to validate benefits: [BenchmarkCategory(Categories.Libraries)]
public class Perf_XmlSerializationWriter
{
private const int Iterations = 1000;
private readonly MyXmlSerializationWriter _writer = new MyXmlSerializationWriter();
private static readonly DateTime Now = new DateTime(2022, 9, 30, 9, 4, 15, DateTimeKind.Utc);
private static readonly DateTimeOffset DtoNow = Now.AddDays(1);
private static readonly TimeSpan Ts = new TimeSpan(1, 2, 3, 4, 5);
private static readonly byte[] BArray = new byte[] { 33, 44, 55 };
[IterationSetup]
public void CleanWriter() => _writer.Clean();
[Benchmark(OperationsPerInvoke = Iterations)]
public int AddPrimitives()
{
for (int i = 0; i < Iterations; i++)
{
_writer.Write('a');
_writer.Write(123);
_writer.Write(123.45m);
_writer.Write(Now);
_writer.Write(Ts);
_writer.Write(DtoNow);
_writer.Write((short)55);
_writer.Write(2345324L);
_writer.Write((sbyte)11);
_writer.Write((ushort)34);
_writer.Write((uint)4564);
_writer.Write((ulong)456734767);
_writer.Write((byte) 67);
_writer.Write(BArray);
_writer.Write(Guid.NewGuid());
}
return _writer.GetXmlLength();
}
}
internal class MyXmlSerializationWriter : XmlSerializationWriter
{
private readonly StringBuilder _builder = new StringBuilder();
public MyXmlSerializationWriter()
{
Writer = new XmlTextWriter(new StringWriter(_builder));
}
protected override void InitCallbacks()
{
}
public void Write<T>(T value) => WriteTypedPrimitive(null, null, value, false);
public void Clean() => _builder.Clear();
public string GetXml() => _builder.ToString();
public int GetXmlLength() => _builder.Length;
} And results from it: Runtime = .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX2; GC = Concurrent Workstation
|
Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|---|
AddPrimitives | 2.032 us | 0.0286 us | 0.0223 us | 2.040 us | 1.985 us | 2.061 us | 0.1000 | 984 B |
After
Method | Mean | Error | StdDev | Median | Min | Max | Allocated |
---|---|---|---|---|---|---|---|
AddPrimitives | 1.960 us | 0.0249 us | 0.0194 us | 1.954 us | 1.935 us | 2.007 us | 360 B |
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriter.cs
Outdated
Show resolved
Hide resolved
@eiriktsarpalis , @krwq Any thoughts here ? |
src/libraries/System.Private.Xml/src/System/Xml/Schema/XsdDuration.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.Xml/src/System/Xml/Schema/XsdDuration.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.Xml/tests/XmlSerializer/XmlSerializerTests.RuntimeOnly.cs
Outdated
Show resolved
Hide resolved
@TrayanZapryanov really sorry for delay, I've been quite busy in the last weeks |
@krwq Can you check again? |
src/libraries/System.Private.Xml/tests/XmlSerializer/XmlSerializerTests.RuntimeOnly.cs
Show resolved
Hide resolved
cc: @HongGit |
…ttps://github.com/TrayanZapryanov/runtime into improve_xmlserializatiowriter_writetypedprimitive
|
||
char[] buffer = _primitivesBuffer ??= new char[128]; | ||
switch (Type.GetTypeCode(t)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would ArrayPool<char>.Shared.Rent(128);
and returning at the end be better? I think this class can be used by multiple threads simultaneously if you have more than 1 thread serializaing concurrently using the same serializer instance. This shared buffer will become corrupted in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mconnew You are right for synchronization problems and I followed your suggestion and used ArrayPool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mconnew Unfortunately this renting decreased performance even worse than before. Here are samples using benchmark from here : dotnet/performance#2623
Before:
Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
---|---|---|---|---|---|---|---|---|
AddPrimitives | 1.878 us | 0.0175 us | 0.0163 us | 1.876 us | 1.857 us | 1.908 us | 0.0900 | 792 B |
After:
Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
---|---|---|---|---|---|---|---|---|
AddPrimitives | 2.192 μs | 0.0433 μs | 0.0384 μs | 2.177 μs | 2.155 μs | 2.275 μs | 0.0400 | 360 B |
There is definitely win in memory, but regression in execution.
I've looked at where XmlSerializationWriter is instantiated and found only here :
runtime/src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializer.cs
Line 364 in 9b76c28
public void Serialize(XmlWriter xmlWriter, object? o, XmlSerializerNamespaces? namespaces, string? encodingStyle, string? id) |
Looks like it always creates new instance and there is method "Init" which also implies that this class have state and cannot be used concurrently.
Do you remember where this class is used concurrently ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same benchmark, but with previous implementation
Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
---|---|---|---|---|---|---|---|---|
AddPrimitives | 1.906 us | 0.0209 us | 0.0185 us | 1.908 us | 1.855 us | 1.936 us | 0.0400 | 360 B |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using a pre-generated serializer, it provides an instance derived from XmlSerializationWriter which I think is cached and reused with multiple calls to Serialize. There's also the reflection based serialization used when it's using SOAP mapping which might not use a new instance each time. The allocation is small, what's the cost of allocating it fresh each time? You could also implement it in such a way that it's optimal for single usage. Eg have a byte[]
field and use Interlocked.Exchange replacing it with null to "rent" the buffer. If another thread is using it, you will get a null back from the exchange so you create a new array locally. When you are done with it, use Interlocked.Exchange to put it back. This avoids the overhead of renting, will have zero allocation in the common case, and degrades to something no worse than it was originally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mconnew I like your idea and pushed new commit with it. Could you check again ?
Also results now are better then before:
Method | Mean | Error | StdDev | Median | Min | Max | Gen0 | Allocated |
---|---|---|---|---|---|---|---|---|
AddPrimitives | 1.969 us | 0.0351 us | 0.0293 us | 1.962 us | 1.942 us | 2.047 us | 0.0400 | 360 B |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mconnew One more question. I create char[128] which might be too big. Any suggestion here? When I execute tests from Performance repo I see increase of bytes allocated(maximum 8), but I really hoped to see some reduction :(.
Should I increase code complexity by starting with lower buffer like char[64] and then to inspect TryFormat methods and retry with bigger buffer, or we can leave this regression for small number of primitives, and have bigger wins when xml produces is much bigger ?
Here are results from XML performance comparison tests:
Statistics
Total: 68
Same: 75.00 %
Slower: 5.88 %
Faster: 10.29 %
Noise: 8.82 %
Unknown: 0.00 %
Statistics per Architecture
Architecture | Same | Slower | Faster | Noise | Unknown |
---|---|---|---|---|---|
X64 | 75.00 % | 5.88 % | 10.29 % | 8.82 % | 0.00 % |
Statistics per Operating System
Operating System | Same | Slower | Faster | Noise | Unknown |
---|---|---|---|---|---|
Windows 10 | 75.00 % | 5.88 % | 10.29 % | 8.82 % | 0.00 % |
Statistics per Namespace
Namespace | Same | Slower | Faster | Noise | Unknown |
---|---|---|---|---|---|
MicroBenchmarks.Serializers | 78.12 % | 9.38 % | 12.50 % | 0.00 % | 0.00 % |
Microsoft.Extensions.Configuration.Xml | 75.00 % | 0.00 % | 25.00 % | 0.00 % | 0.00 % |
System.Xml.Linq | 73.68 % | 0.00 % | 0.00 % | 26.32 % | 0.00 % |
System.Xml.Tests | 66.67 % | 16.67 % | 16.67 % | 0.00 % | 0.00 % |
XmlDocumentTests.XmlDocumentTests | 66.67 % | 0.00 % | 33.33 % | 0.00 % | 0.00 % |
XmlDocumentTests.XmlNodeListTests | 100.00 % | 0.00 % | 0.00 % | 0.00 % | 0.00 % |
XmlDocumentTests.XmlNodeTests | 50.00 % | 0.00 % | 0.00 % | 50.00 % | 0.00 % |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have thought of a way to eliminate the allocation. You could use stackalloc
to allocate the memory on the stack. The only problem with this is that XmlWriter needs a char[]
passed which means you can't assign the stack allocated memory to a Span<char>
initially, we need it as an array. marking the method as unsafe. WriteTypedPrimitive
is a protected method on an existing public type. I don't know if adding the unsafe keyword is a breaking change. This is easily solved by moving the implementation to an inner private method which is marked unsafe and calling that from this method. So something like this:
protected void WriteTypedPrimitive(string? name, string? ns, object o, bool xsiType)
{
UnsafeWriteTypedPrimitive(name, ns, o, xsiType);
}
private void UnsafeWriteTypedPrimitive(string? name, string? ns, object o, bool xsiType)
{
char* buffer = stackalloc char[128];
Span<char> bufferSpan = buffer;
// rest of implementation
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mconnew Unfortunately I am not so good with pointer math.
How can I convert char* to char[]?
I cannot see any method of XmlWriter which accepts char*.
In XmlUtf8RawTextWriter I can see that there is iteration using char*, but that's all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this some more and realized multiple methods can't be called on different threads concurrently. The class wraps the XmlWriter (stored in the _w field) which means the thread usage must match that of XmlWriter. And as you can only have one thread at a time trying to write to an XmlWriter, only one thread at a time will be calling methods on this class. It's safe to have a single instance of the buffer which exists for the lifetime of this class.
I tried to work out a way to cast a char* to a char[] and there's no good clean way to do it.
src/libraries/System.Private.Xml/tests/XmlSerializer/XmlSerializerTests.RuntimeOnly.cs
Outdated
Show resolved
Hide resolved
@mconnew, @krwq Next changes :
|
@@ -38,6 +38,9 @@ public abstract class XmlSerializationWriter : XmlSerializationGeneratedCode | |||
private bool _soap12; | |||
private bool _escapeName = true; | |||
|
|||
//char buffer for serializing primitive values | |||
private char[]? _primitivesBuffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @mconnew mentioned, it should be safe to keep one buffer alongside the reference to a single instance of XmlWriter (aka, _w on line 27). Calling into here simultaneously would already be asking for trouble racing to the _w.Write*()
calls at the end of the method.
private char[]? _primitivesBuffer; | |
private char[] _primitivesBuffer = new char[128]; |
And then use this buffer through the rest of WriteTypedPrimitive
without the need for any interlocked exchange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@StephenMolloy, @mconnew
I will remove it, but have two questions:
- char[128] might be too big - Do you see any primitive type that cannot fit in char[64] for example?
- If you check next changes, we will need same buffer in other classes that inherits it. Should we make it protected with this PR or later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay - I set it to char[64]
Test failure appears to be unrelated. #64227 |
Fixes 76434