Allow changing the newline sequence for indenting using Utf8JsonWriter #54715

SteveKCheng · 2021-06-25T02:47:23Z

Background and Motivation

Currently the newline sequence used by Utf8JsonWriter depends on the running platform, obtained from Environment.NewLine.

It would be useful to be able to set it explicitly so that JSON output can be reproduced exactly no matter what platform the .NET program is running under. It is essential when the JSON output bytes are to be compared directly or through a hash, e.g. as a digital signature or for regression-testing expected outputs.

Proposed API

namespace System.Text.Json
{
    public struct JsonWriterOptions
    {
+        public string? NewLine { get; set; }
    }
}

The default value is null (backwards compatible), which means to take the value from Environment.NewLine.
Otherwise the specified string is used as the newline sequence, as in the TextWriter.NewLine property.

Usage Examples

using var writer = new Utf8JsonWriter(outputStream, new JsonWriterOptions
{
    Indented = true,
    NewLine = "\n"
};

Alternatives

Currently as Utf8JsonWriter does not offer this functionality, I have to use Newtonsoft's JSON writer instead --- which relies on TextWriter that has a customizable NewLine property. It seems fairly difficult, and obviously inefficient, to write a "filter" on a Stream to convert the platform-specific newline byte sequence.

Risks

I think it is low risk for the most common cases. There is no change to behavior of existing code that does not set the JsonWriterOptions.NewLine property. Utf8JsonWriter's code is already parameterized on Environment.NewLine and this proposal is just asking we stop hard-coding the use of a global property.

What if the arbitrary newline sequence contains non-ASCII characters? Or it results in invalid JSON?

My first thought is that such characters should still be supported for completeness (otherwise we'd have to define what the behavior would be) even they might cause slightly slower execution. And it would be consistent with TextWriter's API with no friction for the user.

But perhaps an enumeration (or boolean?) allowing only for the standard choices of newline sequences would work too.
Currently, Utf8JsonWriter's code assumes Environment.NewLine is either "\r\n" or "\n". Supporting the choice between these two would be good enough to satisfy the motivation above.

The text was updated successfully, but these errors were encountered:

ghost · 2021-06-25T02:47:34Z

Tagging subscribers to this area: @eiriktsarpalis, @layomia
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and Motivation

Currently the newline sequence used by Utf8JsonWriter depends on the running platform, obtained from Environment.NewLine.

It would be useful to be able to set it explicitly so that JSON output can be reproduced exactly no matter what platform the .NET program is running under. It is essential when the JSON output bytes are to be compared directly or through a hash, e.g. as a digital signature or for regression-testing expected outputs.

Proposed API

namespace System.Text.Json
{
    public struct JsonWriterOptions
    {
+        public string? NewLine { get; set; }
    }
}

The default value is null (backwards compatible), which means to take the value from Environment.NewLine.
Otherwise the specified string is used as the newline sequence, as in the TextWriter.NewLine property.

Usage Examples

using var writer = new Utf8JsonWriter(outputStream, new JsonWriterOptions
{
    Indented = true,
    NewLine = "\n"
};

Alternatives

Currently as Utf8JsonWriter does not offer this functionality, I have to use Newtonsoft's JSON writer instead --- which relies on TextWriter which has a customizable NewLine property. It seems fairly difficult, and obviously inefficient, to try to write a "filter" on a Stream to convert the platform-specific newline byte sequence.

Risks

I think it is low risk for the most common cases. There is no change to behavior of existing code that does not set the JsonWriterOptions.NewLine property. Utf8JsonWriter's code is already parameterized on Environment.NewLine and this proposal is just asking we stop hard-coding the use of a global property.

What if the newline sequence contains non-ASCII characters? I think such characters should still be supported for completeness (otherwise we'd have to define what the behavior would be) even they might cause slightly slower execution. An enumeration could be used instead, allowing only for the standard choices of newline sequences, but an enumeration is rather noisy (I think in this case) and it would not be consistent with the existing API on TextWriter.

Author:	SteveKCheng
Assignees:	-
Labels:	`api-suggestion`, `area-System.Text.Json`, `untriaged`
Milestone:	-

GrabYourPitchforks · 2021-06-25T03:06:30Z

I don't have any feedback on the customizing newline scenario - will leave that for the JSON team. But I wanted to focus on this comment:

It is essential when the JSON output bytes are to be compared directly or through a hash, e.g. as a digital signature or for regression-testing expected outputs.

Please be advised that different major versions of System.Text.Json are not guaranteed to have byte-for-byte identical output with one another. This could be because the .NET runtime itself changes how some values are encoded (like how .NET Framework and .NET 5 have different string representations for double), it could be because characters like '€' may or may not be escaped by default, or it could be some other factor.

If you rely on the contents having a very specific byte-for-byte representation, please always test each new version of System.Text.Json (or its dependencies) or each new version of the .NET runtime in an integration environment before deploying to production. This will allow you to catch such discrepancies early.

Hope this helps!

SteveKCheng · 2021-06-25T03:09:25Z

@GrabYourPitchforks
Just want to comment: I am fine with the caveat that byte-for-byte comparison only works with the same version of the run-time & libraries. The use case I have right now is regression-testing by comparing expected outputs where some developers might run the program on Windows while the automated periodic process is running on Linux.

jeffhandley · 2022-04-06T01:39:05Z

@SteveKCheng We know that we aren't going to get to this feature during .NET 7, and compared to many of the other issues/features in our System.Text.Json backlog, I don't expect we'll get to this for the foreseeable future. To reflect this, I'm going to close the issue.

For the regression testing scenario you described, I recommend normalizing the newlines before comparison. As was previously called out, other direct comparisons or consumption for a hash, a digital signature, or other purpose would not be reliable without normalization either.

jeffhandley · 2022-09-04T20:54:27Z

Reopening based on additional feedback from @damieng that there are scenarios where additional control of the written content is valuable.

@damieng if you could, please add a comment citing your scenarios so that we can consider them and if we proceed on this use them as acceptance criteria.

SteveKCheng added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jun 25, 2021

dotnet-issue-labeler bot added area-System.Text.Json untriaged New issue has not been triaged by the area owner labels Jun 25, 2021

eiriktsarpalis added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Jul 16, 2021

eiriktsarpalis added this to the 7.0.0 milestone Jul 16, 2021

eiriktsarpalis added api-needs-work API needs work before it is approved, it is NOT ready for implementation and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Oct 15, 2021

eiriktsarpalis removed the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Nov 16, 2021

eiriktsarpalis mentioned this issue Jan 13, 2022

System.Text.Json work planned for .NET 7 #63762

Closed

16 tasks

eiriktsarpalis mentioned this issue Feb 3, 2022

[API Proposal]: Allow specifying indent size and whitespace character when writing JSON with Utf8JsonWriter #63882

Closed

jeffhandley closed this as completed Apr 6, 2022

ghost locked as resolved and limited conversation to collaborators May 6, 2022

jeffhandley reopened this Sep 4, 2022

jeffhandley modified the milestones: 7.0.0, Future Sep 4, 2022

dotnet unlocked this conversation Sep 4, 2022

scbedd mentioned this issue Feb 21, 2023

[Prioritized] Normalize line endings Azure/azure-sdk-tools#2453

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow changing the newline sequence for indenting using Utf8JsonWriter #54715

Allow changing the newline sequence for indenting using Utf8JsonWriter #54715

SteveKCheng commented Jun 25, 2021 •

edited

Loading

ghost commented Jun 25, 2021

Background and Motivation

Proposed API

Usage Examples

Alternatives

Risks

GrabYourPitchforks commented Jun 25, 2021

SteveKCheng commented Jun 25, 2021

jeffhandley commented Apr 6, 2022

jeffhandley commented Sep 4, 2022

Allow changing the newline sequence for indenting using Utf8JsonWriter #54715

Allow changing the newline sequence for indenting using Utf8JsonWriter #54715

Comments

SteveKCheng commented Jun 25, 2021 • edited Loading

Background and Motivation

Proposed API

Usage Examples

Alternatives

Risks

ghost commented Jun 25, 2021

Background and Motivation

Proposed API

Usage Examples

Alternatives

Risks

GrabYourPitchforks commented Jun 25, 2021

SteveKCheng commented Jun 25, 2021

jeffhandley commented Apr 6, 2022

jeffhandley commented Sep 4, 2022

SteveKCheng commented Jun 25, 2021 •

edited

Loading