Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Add more string syntax type constants to StringSyntaxAttribute #65634

Closed
geeknoid opened this issue Feb 20, 2022 · 28 comments · Fixed by #67621
Closed

[API Proposal]: Add more string syntax type constants to StringSyntaxAttribute #65634

geeknoid opened this issue Feb 20, 2022 · 28 comments · Fixed by #67621
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Diagnostics
Milestone

Comments

@geeknoid
Copy link
Member

geeknoid commented Feb 20, 2022

EDITED 3/14/2022 by @stephentoub:

namespace System.Diagnostics.CodeAnalysis
{
    public sealed class StringSyntaxAttribute : Attribute
    {
        public const string DateTimeFormat = nameof(DateTimeFormat);
        public const string Json = nameof(Json);
        public const string Regex = nameof(Regex);

+        public const string CompositeFormat = nameof(CompositeFormat);
+        public const string DateFormat = nameof(DateFormat);
+        public const string EnumFormat = nameof(EnumFormat);
+        public const string GuidFormat = nameof(GuidFormat);
+        public const string NumericFormat = nameof(NumericFormat);
+        public const string TimeFormat = nameof(TimeFormat);
+        public const string TimeSpanFormat = nameof(TimeSpanFormat);
+        public const string Uri = nameof(Uri);
+        public const string Xml = nameof(Xml);
    }
}

Background and motivation

StringSyntaxAttribute defines a few well-known string types. Seems like this could be made considerably more comprehensive in order to standardize on string syntax naming, making it possible to build tooling (like analyzers) around more standard string syntaxes.

API Proposal

A few ideas:

  • CompositeFormat - as in String.Format, etc
  • Base64 - happens in config and other places
  • Uri
  • XML

API Usage

[StringSyntax(StringSyntaxAttribute.Base64)]
public string Key { get; set; }

[StringSyntax(StringSyntaxAttribute.Uri)]
public string SourceUri { get; set; }

public string Format([StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format, params object[] args);

Alternative Designs

Without these, there will be a proliferation of names used to represent the same text, which will make it difficult to do tooling that leverages these annotations.

Risks

No response

@geeknoid geeknoid added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Feb 20, 2022
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 20, 2022
@ghost
Copy link

ghost commented Feb 20, 2022

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

StringSyntaxAttribute defines a few well-known string types. Seems like this could be made considerably more comprehensive in order to standardize on string syntax naming, making it possible to build tooling (like analyzers) around more standard string syntaxes.

API Proposal

A few ideas:

  • CompositeFormat - as in String.Format, etc
  • Base64 - happens in config and other places
  • Uri
  • XML

API Usage

[StringSyntax(StringSyntaxAttribute.Base64)]
public string Key { get; set; }

[StringSyntax(StringSyntaxAttribute.Uri)]
public string SourceUri { get; set; }

public string Format([StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format, params object[] args);

Alternative Designs

Without these, there will be a proliferation of names used to represent the same text, which will make it difficult to do tooling that leverages these annotations.

Risks

No response

Author: geeknoid
Assignees: -
Labels:

api-suggestion, area-System.Diagnostics, untriaged

Milestone: -

@stephentoub
Copy link
Member

cc: @terrajobst, @CyrusNajmabadi

I'm fine with us being more comprehensive in our attribution of the core libraries. Even if Roslyn doesn't special-case certain values, we should at a minimum be able to add analyzers that validate syntax of attributed parameters/members.

@terrajobst
Copy link
Member

Same, I posted this earlier. I just want to make sure we're getting feedback from at least one consumer.

@CyrusNajmabadi
Copy link
Member

I have no issue with this. Though it may cause confusion from people trying this annotation and getting no help whatsoever.

@teo-tsirpanis
Copy link
Contributor

teo-tsirpanis commented Feb 20, 2022

It would be nice if StringSyntaxAttribute helped with formatting more types like integers and floats. I often get confused over what some specifiers mean and have to look it up.

@stephentoub
Copy link
Member

@CyrusNajmabadi, has a further set of languages been considered / prioritized by Roslyn? Presumably it'd be a fairly simple extension on DateTimeFormat to support other format specifiers (numeric, date, time, TimeSpan, enums)? Would it be hard to support C#/VB colorization, given Roslyn already knows how to do that generally? What about composite format string syntax, given that Roslyn already knows how to colorize interpolated strings?

@CyrusNajmabadi
Copy link
Member

What about composite format string syntax,

What is a composite format string?

@stephentoub
Copy link
Member

stephentoub commented Feb 22, 2022

What is a composite format string?

e.g. the string passed to string.Format, e.g. "Something {0} something something {1:X}"

@CyrusNajmabadi
Copy link
Member

has a further set of languages been considered / prioritized by Roslyn? Presumably it'd be a fairly simple extension on DateTimeFormat to support other format specifiers (numeric, date, time, TimeSpan, enums)?

I think numeric/primitive formats would be totally fine to add. Not prioritized currently, but if we felt there was high value there we would.

Would it be hard to support C#/VB colorization,

Probably not. Though I'm curious how often someone is working with a complete c#/vb literal within code.

We tend to have that in Roslyn due to our tests. But if be more surprised to find it in customer code. SGs make it more likely, but I think you'd see more partial-fragments of code rather than complete snippets.

@tommcdon tommcdon removed the untriaged New issue has not been triaged by the area owner label Feb 23, 2022
@tommcdon tommcdon added this to the 7.0.0 milestone Feb 23, 2022
@Joe4evr
Copy link
Contributor

Joe4evr commented Feb 23, 2022

Though I'm curious how often someone is working with a complete c#/vb literal within code.

It's more or less required for any Code Analyzer tests.

Also, I'd like to suggest C#/VB literal coloring uses different shades than normal so that it can be visually distinct.

@CyrusNajmabadi
Copy link
Member

It's more or less required for any Code Analyzer tests.

This is itself a very niche group. I'm generally curious about the programming community at large. Note that analyzer tests also don't use C#/VB itself, but a markup dialect of it.

@stephentoub stephentoub self-assigned this Mar 3, 2022
@stephentoub
Copy link
Member

stephentoub commented Mar 3, 2022

Here is approximately what the core libraries would look like if we extended StringSyntaxAttribute to have:

public const string CompositeFormat = nameof(CompositeFormat);
public const string DateFormat = nameof(DateFormat);
public const string EnumFormat = nameof(EnumFormat);
public const string GuidFormat = nameof(GuidFormat);
public const string NumericFormat = nameof(NumericFormat);
public const string TimeFormat = nameof(TimeFormat);
public const string TimeSpanFormat = nameof(TimeSpanFormat);
public const string Uri = nameof(Uri);
public const string Xml = nameof(Xml);

beyond the ones already there:

public const string DateTimeFormat = nameof(DateTimeFormat);
public const string Json = nameof(Json);
public const string Regex = nameof(Regex);

Commit:
stephentoub/runtime@64dc97a...37573bc

Seems like DateFormat, EnumFormat, GuidFormat, NumericFormat, TimeFormat, and TimeSpanFormat would all be relatively straightforward extensions of the existing Roslyn support for DateTimeFormat, just with different lists of values. I expect CompositeFormat could at least benefit from similar syntax highlighting as is used for interpolated strings (e.g. red for the literal portions, black for the holes), as well as warnings about malformed formats (e.g. a missing close bracket). Uri and Xml would presumably need a more involved effort, but would similarly seem to benefit from colorization and warnings about malformed content.

I'm skeptical of adding a Base64. The benefit of this attribute is primarily in cases where you'd have a literal at the call site, and literal base64 strings are quite rare in my experience; they're usually fed in from somewhere. On top of that, it’s not clear what benefit tooling would provide in the case where you did have literals… show you the decoded bytes in a tooltip?

For SQL, I'd defer to @roji and @ajcvickers about if we should add something for that and what it would be.

And from @CyrusNajmabadi's comments, seems a CSharp/VisualBasic wouldn't be particularly valuable in practice.

There are an unbounded number of possible "languages" folks might use. I'd prefer to see us only add values for ones where we have concrete examples where they'd be valuable, which means both real APIs that would be annotated due to having a string parameter likely to be used with literals and at least solid ideas for what tooling would improve the experience of those APIs should they be annotated.

@roji
Copy link
Member

roji commented Mar 3, 2022

Just to make sure people are aware, Jetbrains have had a subset of these for a long while: StringFormatAttribute (and/or StructuredMessageTemplate) are similar to the proposed [CompositeFormat], there's UriString, RegexPattern, etc.

Rider/R# recognize these and provide help and syntax highlighting. Note that Jetbrains IDEs also infer contents in other way in advanced ways - see this blog post). In a nutshell, you can either declare a "language" (JSON/regex/SQL) on a variable via a comment, or the IDE auto-detects it in some cases.

Re SQL, @terrajobst's comment about SQL dialects being different is important. Jetbrains allows you to set up a default/project-global dialect setting, according to which all SQL literals are interpreted (it also allows stuff like executing SQL literals against a default configured data source directly in the IDE). A Roslyn analyzer could be similarly configured (via .editorconfig?), telling it what dialect exactly to expect in literals which are known to be SQL (VS could also be aware of that, for syntax highlighting).

In theory we could allow defining the dialect on the attribute (e.g. an enum), but that would embed a closed list of known dialects (not a good idea), plus in most cases an API doesn't actually know or care about the dialect (e.g. DbCommand.CommandText accepts any SQL for any database).

So I guess I'd prefer to see concrete plans for what an analyzer (or VS) would do with a SQL attribute, and maybe plan based on that...

@CyrusNajmabadi
Copy link
Member

In a nutshell, you can either declare a "language" (JSON/regex/SQL) on a variable via a comment, or the IDE auto-detects it in some cases.

We do this as well.

@CyrusNajmabadi
Copy link
Member

The major decision pivot points here are who actually owns this code. Roslyn is not going to write and own languages that are out of our purview.

I'm working on exposing easy classification now. At that point it's up to interested parties to then add support accordingly

@stephentoub
Copy link
Member

Roslyn is not going to write and own languages that are out of our purview.

Which from my list are in and out of Roslyn's purview?

@CyrusNajmabadi
Copy link
Member

C#/vb and anything deeply embedded into .net. So Json/regex/xml (and format strings).

We're also super wary of any domain that is not backed by a single clear spec. Sql is an example that we would avoid as there are so many dialects.

@stephentoub
Copy link
Member

stephentoub commented Mar 4, 2022

Ok, so you'd be ok with pretty much everything I added in stephentoub/runtime@64dc97a...37573bc

@CyrusNajmabadi
Copy link
Member

yup yup

@Wraith2
Copy link
Contributor

Wraith2 commented Mar 8, 2022

I agree with @roji on the SQL syntax.

It feels like a larger VS feature. Perhaps allow some form of InitializationData string on the string format attribute and then have visual studio use that to identify and query a language server to support it.

@teo-tsirpanis
Copy link
Contributor

teo-tsirpanis commented Mar 8, 2022

Another idea: database connection strings. Doesn't have to be too complicated and recognize specific fields, just basic highlighting and syntax check.

😳

@Wraith2
Copy link
Contributor

Wraith2 commented Mar 8, 2022

Connection strings are a similar minefield to the syntax itself. You'd need to know the provider to access the builder type, even then there is no notion of syntax checking it only allows you to access properties. There's obsolete syntax etc. Again it is something you'd want a specific new language service for.

@roji
Copy link
Member

roji commented Mar 8, 2022

WRT to connection strings, these generally shouldn't be in source code, but rather in external config.

@stephentoub stephentoub added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Mar 14, 2022
@KieranDevvs
Copy link

We're also super wary of any domain that is not backed by a single clear spec. Sql is an example that we would avoid as there are so many dialects.

Do we have to have every single dialect? Even the common SQL language highlighting would be beneficial. I'm thinking how Azure Data Studio, Notepad++, VS Code, etc. have managed to implement this.

@stephentoub stephentoub added the blocking Marks issues that we want to fast track in order to unblock other important work label Apr 5, 2022
@bartonjs
Copy link
Member

bartonjs commented Apr 5, 2022

Video

We changed "DateFormat" to "DateOnlyFormat" and "TimeFormat" to "TimeOnlyFormat" to make it clear that each of them was excluding the aspects of the other.

namespace System.Diagnostics.CodeAnalysis
{
    public partial sealed class StringSyntaxAttribute : Attribute
    {
        //public const string DateTimeFormat = nameof(DateTimeFormat);
        //public const string Json = nameof(Json);
        //public const string Regex = nameof(Regex);

        public const string CompositeFormat = nameof(CompositeFormat);
        public const string DateOnlyFormat = nameof(DateOnlyFormat);
        public const string EnumFormat = nameof(EnumFormat);
        public const string GuidFormat = nameof(GuidFormat);
        public const string NumericFormat = nameof(NumericFormat);
        public const string TimeOnlyFormat = nameof(TimeOnlyFormat);
        public const string TimeSpanFormat = nameof(TimeSpanFormat);
        public const string Uri = nameof(Uri);
        public const string Xml = nameof(Xml);
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 5, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 5, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 7, 2022
@deeprobin

This comment was marked as outdated.

@deeprobin

This comment was marked as outdated.

@ghost ghost locked as resolved and limited conversation to collaborators May 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Diagnostics
Projects
None yet
Development

Successfully merging a pull request may close this issue.