[API Proposal]: Add more string syntax type constants to StringSyntaxAttribute #65634
Comments
Tagging subscribers to this area: @tommcdon
Issue Details
Background and motivation
StringSyntaxAttribute defines a few well-known string types. Seems like this could be made considerably more comprehensive in order to standardize on string syntax naming, making it possible to build tooling (like analyzers) around more standard string syntaxes.
API Proposal
A few ideas:
API Usage
[StringSyntax(StringSyntaxAttribute.Base64)]
public string Key { get; set; }
[StringSyntax(StringSyntaxAttribute.Uri)]
public string SourceUri { get; set; }
public string Format([StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format, params object[] args);
Alternative Designs
Without these, there will be a proliferation of names used to represent the same text, which will make it difficult to do tooling that leverages these annotations.
Risks
No response
|
cc: @terrajobst, @CyrusNajmabadi I'm fine with us being more comprehensive in our attribution of the core libraries. Even if Roslyn doesn't special-case certain values, we should at a minimum be able to add analyzers that validate syntax of attributed parameters/members. |
Same, I posted this earlier. I just want to make sure we're getting feedback from at least one consumer. |
I have no issue with this. Though it may cause confusion from people trying this annotation and getting no help whatsoever. |
It would be nice if |
@CyrusNajmabadi, has a further set of languages been considered / prioritized by Roslyn? Presumably it'd be a fairly simple extension on DateTimeFormat to support other format specifiers (numeric, date, time, TimeSpan, enums)? Would it be hard to support C#/VB colorization, given Roslyn already knows how to do that generally? What about composite format string syntax, given that Roslyn already knows how to colorize interpolated strings? |
What is a composite format string? |
e.g. the string passed to string.Format, e.g. "Something {0} something something {1:X}" |
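To make the term concrete, here is a minimal sketch of that composite format string in use. The arguments 42 and 255 are made up for illustration; `{0}` is a plain hole and `{1:X}` applies the hexadecimal format specifier to the second argument.

```csharp
using System;

// Composite format string: literal text plus indexed holes.
string format = "Something {0} something something {1:X}";

// Illustrative arguments (not from the thread).
string result = string.Format(format, 42, 255);

Console.WriteLine(result); // Something 42 something something FF
```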
I think numeric/primitive formats would be totally fine to add. Not prioritized currently, but if we felt there was high value there we would.
Probably not. Though I'm curious how often someone is working with a complete C#/VB literal within code. We tend to have that in Roslyn due to our tests. But I'd be more surprised to find it in customer code. SGs make it more likely, but I think you'd see more partial fragments of code rather than complete snippets. |
It's more or less required for any Code Analyzer tests. Also, I'd like to suggest C#/VB literal coloring uses different shades than normal so that it can be visually distinct. |
This is itself a very niche group. I'm generally curious about the programming community at large. Note that analyzer tests also don't use C#/VB itself, but a markup dialect of it. |
Here is approximately what the core libraries would look like if we extended StringSyntaxAttribute to have:
public const string CompositeFormat = nameof(CompositeFormat);
public const string DateFormat = nameof(DateFormat);
public const string EnumFormat = nameof(EnumFormat);
public const string GuidFormat = nameof(GuidFormat);
public const string NumericFormat = nameof(NumericFormat);
public const string TimeFormat = nameof(TimeFormat);
public const string TimeSpanFormat = nameof(TimeSpanFormat);
public const string Uri = nameof(Uri);
public const string Xml = nameof(Xml);
beyond the ones already there:
public const string DateTimeFormat = nameof(DateTimeFormat);
public const string Json = nameof(Json);
public const string Regex = nameof(Regex);
Commit:
Seems like DateFormat, EnumFormat, GuidFormat, NumericFormat, TimeFormat, and TimeSpanFormat would all be relatively straightforward extensions of the existing Roslyn support for DateTimeFormat, just with different lists of values. I expect CompositeFormat could at least benefit from similar syntax highlighting as is used for interpolated strings (e.g. red for the literal portions, black for the holes), as well as warnings about malformed formats (e.g. a missing close bracket). Uri and Xml would presumably need a more involved effort, but would similarly seem to benefit from colorization and warnings about malformed content.
I'm skeptical of adding a Base64. The benefit of this attribute is primarily in cases where you'd have a literal at the call site, and literal base64 strings are quite rare in my experience; they're usually fed in from somewhere. On top of that, it’s not clear what benefit tooling would provide in the case where you did have literals… show you the decoded bytes in a tooltip?
For SQL, I'd defer to @roji and @ajcvickers about if we should add something for that and what it would be. And from @CyrusNajmabadi's comments, seems a CSharp/VisualBasic wouldn't be particularly valuable in practice.
There are an unbounded number of possible "languages" folks might use. I'd prefer to see us only add values for ones where we have concrete examples where they'd be valuable, which means both real APIs that would be annotated due to having a string parameter likely to be used with literals and at least solid ideas for what tooling would improve the experience of those APIs should they be annotated. |
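As a rough illustration of what tooling could key off, the sketch below reads the declared syntax name back from an annotated parameter via reflection. `Logger.Write` is a hypothetical API invented for this example, and the syntax name is passed as a plain string rather than one of the proposed constants. A real analyzer would inspect the attribute through Roslyn symbols at compile time; reflection is used here only to keep the example short.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Reflection;

// Look up the 'format' parameter of the hypothetical Logger.Write method.
ParameterInfo parameter = typeof(Logger).GetMethod(nameof(Logger.Write))!.GetParameters()[0];

// Read the syntax name the API author declared on that parameter.
var attribute = parameter.GetCustomAttribute<StringSyntaxAttribute>();
Console.WriteLine(attribute?.Syntax); // CompositeFormat

// Hypothetical API used only for illustration.
static class Logger
{
    public static void Write([StringSyntax("CompositeFormat")] string format, params object[] args)
        => Console.WriteLine(format, args);
}
```

Because the attribute only carries a name, agreeing on the names is what lets independent tools recognize the same syntax.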
Just to make sure people are aware, JetBrains have had a subset of these for a long while: StringFormatAttribute (and/or StructuredMessageTemplate) are similar to the proposed [CompositeFormat], there's UriString, RegexPattern, etc. Rider/R# recognize these and provide help and syntax highlighting. Note that JetBrains IDEs also infer contents in other, more advanced ways (see this blog post). In a nutshell, you can either declare a "language" (JSON/regex/SQL) on a variable via a comment, or the IDE auto-detects it in some cases.
Re SQL, @terrajobst's comment about SQL dialects being different is important. JetBrains allows you to set up a default/project-global dialect setting, according to which all SQL literals are interpreted (it also allows stuff like executing SQL literals against a default configured data source directly in the IDE). A Roslyn analyzer could be similarly configured (via .editorconfig?), telling it what dialect exactly to expect in literals which are known to be SQL (VS could also be aware of that, for syntax highlighting). In theory we could allow defining the dialect on the attribute (e.g. an enum), but that would embed a closed list of known dialects (not a good idea), plus in most cases an API doesn't actually know or care about the dialect (e.g. DbCommand.CommandText accepts any SQL for any database).
So I guess I'd prefer to see concrete plans for what an analyzer (or VS) would do with a SQL attribute, and maybe plan based on that... |
We do this as well. |
The major decision point here is who actually owns this code. Roslyn is not going to write and own languages that are out of our purview. I'm working on exposing easy classification now. At that point it's up to interested parties to then add support accordingly. |
Which from my list are in and out of Roslyn's purview? |
C#/vb and anything deeply embedded into .net. So Json/regex/xml (and format strings). We're also super wary of any domain that is not backed by a single clear spec. Sql is an example that we would avoid as there are so many dialects. |
Ok, so you'd be ok with pretty much everything I added in stephentoub/runtime@64dc97a...37573bc |
yup yup |
I agree with @roji on the SQL syntax. It feels like a larger VS feature. Perhaps allow some form of InitializationData string on the string format attribute and then have visual studio use that to identify and query a language server to support it. |
😳 |
Connection strings are a similar minefield to the syntax itself. You'd need to know the provider to access the builder type, and even then there is no notion of syntax checking; it only allows you to access properties. There's obsolete syntax etc. Again it is something you'd want a specific new language service for. |
WRT connection strings, these generally shouldn't be in source code, but rather in external config. |
Do we have to have every single dialect? Even the common SQL language highlighting would be beneficial. I'm thinking how Azure Data Studio, Notepad++, VS Code, etc. have managed to implement this. |
We changed "DateFormat" to "DateOnlyFormat" and "TimeFormat" to "TimeOnlyFormat" to make it clear that each of them was excluding the aspects of the other.
namespace System.Diagnostics.CodeAnalysis
{
public sealed partial class StringSyntaxAttribute : Attribute
{
//public const string DateTimeFormat = nameof(DateTimeFormat);
//public const string Json = nameof(Json);
//public const string Regex = nameof(Regex);
public const string CompositeFormat = nameof(CompositeFormat);
public const string DateOnlyFormat = nameof(DateOnlyFormat);
public const string EnumFormat = nameof(EnumFormat);
public const string GuidFormat = nameof(GuidFormat);
public const string NumericFormat = nameof(NumericFormat);
public const string TimeOnlyFormat = nameof(TimeOnlyFormat);
public const string TimeSpanFormat = nameof(TimeSpanFormat);
public const string Uri = nameof(Uri);
public const string Xml = nameof(Xml);
}
} |
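A minimal sketch of how an API might use one of the approved names, assuming a runtime where these constants have shipped (.NET 7 or later). `Formatter.FormatDate` is a hypothetical helper, not part of the proposal, and the date and format string are illustrative.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Globalization;

Console.WriteLine(Formatter.FormatDate(new DateOnly(2022, 3, 14), "yyyy-MM-dd")); // 2022-03-14

// Hypothetical helper used only for illustration.
static class Formatter
{
    // The annotation tells tooling that 'format' is a DateOnly format string.
    public static string FormatDate(
        DateOnly date,
        [StringSyntax(StringSyntaxAttribute.DateOnlyFormat)] string format)
        => date.ToString(format, CultureInfo.InvariantCulture);
}
```

The attribute has no effect at run time; whether an editor offers highlighting or validation for the literal at the call site depends on the tooling in use.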
EDITED 3/14/2022 by @stephentoub:
Background and motivation
StringSyntaxAttribute defines a few well-known string types. Seems like this could be made considerably more comprehensive in order to standardize on string syntax naming, making it possible to build tooling (like analyzers) around more standard string syntaxes.
API Proposal
A few ideas:
API Usage
Alternative Designs
Without these, there will be a proliferation of names used to represent the same text, which will make it difficult to do tooling that leverages these annotations.
Risks
No response