Proposal: Digit separators #216
Comments
Does this apply to real literals as well? For example, would I have no idea if this would be useful, just curious. |
👍 |
Don't shoot me, but would it be too hard to parse "space" as a separator? Or does that make the grammar ambiguous? int two = 0b 10;
short max = 0x ffff;
long oneMillion = 1 000 000; Just thinking out loud. |
Digit separators were included in VB.net (vNext CTP). Would it be beneficial to also describe what was allowed in VB? I think |
I agree, I like space more than underscore. It's generally easier to type and makes it easier when working with something like hex numbers e.g. |
I don't see how you could use a space as the separator because numeric literals would then potentially consist of not one but several tokens. This would make them very difficult to parse. The underscore seems the best choice of separator to me, particularly as it's already used by several other languages. I'm not so keen on allowing multiple consecutive underscores but I suppose it does no harm. |
This grammar wouldn't allow consecutive separators.
I think spaces could also be possible
|
I think it would be very hard to use spaces. I haven't looked at the parser, but it's probably doing something like breaking the text at whitespace, parentheses, braces, whatever and analyzing the tokens from there. Assuming that the rest of a numeric literal might follow the first part is doable, but I don't think it is worth the cost. And what next? This?
Or this?
Although it might be an itsy bitsy harder to write in most keyboard configurations, the semantic break of the numeric literal is the same with the |
Wonder if the parser supports significant whitespace? |
The VB implementation of digit group separators prototyped last year actually supported three different separators originally: underscore, back tick, and space. So you could write &B1111 0010 or 1_000_000 or 3`600. We quickly decided that back tick didn't make enough sense to anyone and cut it. The VB preview still supported both underscores and spaces. The biggest motivation for spaces was binary literals, another feature prototyped at the same time, because binary numbers are conventionally separated with spaces. As to implementation, it's not hard at all really - at least in VB, particularly when you don't allow multiple consecutive separators. Normally the scanner encounters a digit and starts scanning an integral literal one character at a time until it encounters a character that's not a digit for the base being used (decimal, hex, octal); then it stops. We changed it so that if the non-digit character were an underscore or space it would peek one more character ahead, and if that character were a digit it would keep scanning it as a single token. There are some corner cases you have to put extra recovery around but it's not very complicated, particularly because in VB it's not valid to have two integer literals follow one another, so it's non-breaking to interpret 1 1 as 11. I think C# is the same here, though in C# we were pretty settled that underscore would be the sole separator. I think the biggest concern about that is that tools would be confused thinking the space was a word boundary (not VS, the editor is smart enough in VS to handle space) and we just couldn't foresee what havoc spaces would be unleashing on the world (if any). Another more minor concern was complexity - would users benefit more from having a single consistent separator used everywhere? If we decided to pick one it would likely be the underscore, so space was only a possibility if we were ok with having two separators, which was an open question. -ADG |
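The scanning rule described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the VB or Roslyn scanner: `DigitScanner` and `ScanDecimal` are hypothetical names, only decimal digits are handled, and none of the error recovery mentioned in the comment is attempted.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not compiler code.
public static class DigitScanner
{
    private static bool IsDecimalDigit(char c) => c >= '0' && c <= '9';

    // Scans a decimal integer literal starting at text[start].
    // Returns the digits with separators dropped, plus the index just past the token.
    public static (string Digits, int End) ScanDecimal(string text, int start)
    {
        var digits = new StringBuilder();
        int i = start;
        while (i < text.Length)
        {
            char c = text[i];
            if (IsDecimalDigit(c))
            {
                digits.Append(c);
                i++;
            }
            else if ((c == '_' || c == ' ')
                     && digits.Length > 0
                     && i + 1 < text.Length
                     && IsDecimalDigit(text[i + 1]))
            {
                // A separator followed by a digit: peek succeeded, so skip the
                // separator and keep scanning the same token.
                i++;
            }
            else
            {
                // Any other character (including a second consecutive separator
                // or a trailing separator) ends the token.
                break;
            }
        }
        return (digits.ToString(), i);
    }
}
```

With this sketch, `DigitScanner.ScanDecimal("1_000_000;", 0)` yields `("1000000", 9)` and `DigitScanner.ScanDecimal("1 1", 0)` yields `("11", 3)`, matching the "interpret 1 1 as 11" behaviour described. An input such as `"1__0"` stops after the first digit, because the one-character peek sees a second separator and not a digit.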
Using space as a separator would probably be a bad idea, because it would cause hard-to-spot mistakes. For instance, |
@thomaslevesque that is a very good point, before I suggested it I quickly tried to think of places where two numbers would follow each other, but I had totally missed this obvious one. I think that is probably a deal breaker. Seems generally people are not for using space, and I think I have come to agree with this point. Still don't like how "1_000" looks, but it might be the best and easiest option. |
Isn't this proposal about digit separators for literals that have a prefix? |
@AdamSpeight2008, no, it's for all numeric literals. |
@AdamSpeight2008, we did consider restricting space in particular to its most obvious use case - binary literals. It would be unusual, but I think it's worth considering if it gives us more confidence in the feature. @thomaslevesque, @chrisaut, I find that developers tend to bias negatively on what would confuse other developers and how often. Just about every feature ever proposed or introduced has someone saying "this will cause hard to spot mistakes for everyone ever". There are also features which at first seem harmless - then later turn out to be pits of failure. Fortunately, with "Roslyn" and a managed code base it's much easier to quickly prototype language features - even the scary ones and experiment and make decisions after making observations. I think that will give us the most room to explore the full potential of the language without being committed to doing or not doing a feature a particular way too early. It's still very very early in the design of VB15 (this idea has 0% chance of making it into C#) and given how often space has been proposed or preferred by different VB users we've spoken to I'd hate to cut the idea down prematurely if it could actually produce a better experience for those users. Regards, -ADG |
I'd say ' or ` are better choices than _:
|
Sadly this holds true only for the US keyboard layout. At least in the German layout all three require two key strokes. Only space is one keystroke here, too. |
Agreed ` or ' are undesirable for the reasons already mentioned. I actually don't mind using _ as a separator at all, and, frankly, anything here is better than nothing :) Using space seems like a recipe for conflicts all over the place, and I don't see it adding that much value. I dislike the idea of allowing multiple, alternative separators: while anyone reusing Roslyn wouldn't care, other tools doing their own lexing of C# code would have to do much more work. |
|
In VB.net |
@ViIvanov ` and ' make it look like numbers are indicating degrees. or feet and inches. |
@AdamSpeight2008, in VB the explicit line continuation is actually to ensure that the underscore is never a trailing character of an identifier or other token so it wouldn't be a problem. I agree that ` and ' look more like units of measurement. _ has a precedent in identifiers as a chunk separator. Space is used for binary numbers in particular and has been recommended by various bodies as a standard separator alternative to either comma or period (http://en.wikipedia.org/wiki/Decimal_mark#Digit_grouping). I haven't seen a good scenario for multiple consecutive separators yet and am likely to advocate disallowing them. |
@gafter So the final decision is disallowing separators immediately after prefixes? |
@CnSimonChan I think it is implemented in the |
@zippec: Completely agree. @jaredpar should we break out the feature request for |
@jskeet yes let's use a separate issue since this feature is implemented as spec'd here. We can use the new issue to track changing to allow that syntax. |
Would be nice. Although space feels less C#ish, I still vote for spaces, I mean can it go wrong as long as we're expecting a |
@weitzhandler, I think that changing C# 7 and Visual Studio for tomorrow is, most probably, out of the question. 😄 |
so var a = 1 0; is actually just ten? |
@alrz, that's no worse than
The greater issue here is that, in this particular case and only in this particular case, space is a special case for white spaces. And that's bad. Very bad. |
No, the space is worse because it's invisible. In your example it's impossible to overlook the zero because the literal goes on and on and on. |
Limit to single space (surely no line breaks 😡).
|
@weitzhandler No it doesn't. C# doesn't mind how many spaces you are using between tokens at all. |
We should keep the discussion here. |
@weitzhandler I think you mean discussion has moved here. |
@alrz, Visual Studio can make white space visible. But I still think that would be the least of the problems. |
Discussion for this feature has been moved here. |
Being able to group digits in large numeric literals would have great readability impact and no significant downside.
Adding binary literals (#215) would increase the likelihood of numeric literals being long, so the two features enhance each other.
We would follow Java and others, and use an underscore `_` as a digit separator. It would be able to occur everywhere in a numeric literal (except as the first and last character), since different groupings may make sense in different scenarios and especially for different numeric bases:
Any sequence of digits may be separated by underscores, possibly more than one underscore between two consecutive digits. They are allowed in decimals as well as exponents, but following the previous rule, they may not appear next to the decimal (`10_.0`), next to the exponent character (`1.1e_1`), or next to the type specifier (`10_f`). When used in binary and hexadecimal literals, they may not appear immediately following the `0x` or `0b`.
The syntax is straightforward, and the separators have no semantic impact - they are simply ignored.
This has broad value and is easy to implement.
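As a rough illustration of the rules stated in the proposal, the literals below should compile with a compiler that implements digit separators and binary literals (assumed here to be C# 7.0 or later). The class and constant names are made up for this example, and the rejected forms are shown only as comments, taken from the proposal text itself.

```csharp
using System;

// Hypothetical container for illustration; the names are not from the proposal.
public static class SeparatorExamples
{
    public const int OneMillion = 1_000_000;      // decimal, grouped in thousands
    public const int Mask = 0b1010_1010;          // binary, grouped in nibbles (170)
    public const int MaxUShort = 0xFF_FF;         // hexadecimal, grouped in bytes (65535)
    public const int Doubled = 1__000;            // more than one underscore between digits
    public const double Fraction = 1_000.000_1;   // separators on both sides of the decimal point

    // Rejected according to the proposal text:
    //   10_.0    separator next to the decimal point
    //   1.1e_1   separator next to the exponent character
    //   10_f     separator next to the type specifier
    //   0x_FF    separator immediately following the 0x prefix
}
```

Because the separators are simply ignored, `SeparatorExamples.OneMillion` is the same value as `1000000`, and `SeparatorExamples.Doubled` is the same value as `1000`.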