Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add u8::to_ascii_digit() #95969

Closed
hniksic opened this issue Apr 12, 2022 · 11 comments
Closed

Add u8::to_ascii_digit() #95969

hniksic opened this issue Apr 12, 2022 · 11 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@hniksic
Copy link
Contributor

hniksic commented Apr 12, 2022

When writing parsers for textual or semi-textual data received from an external source, one often gets octets that need to be interpreted as decimal numbers. For example:

let mut slice: &[u8] = ...;
// get the next octet in `slice` as hex digit, ignoring hyphens
let digit = loop {
    let c = *slice.get(0)? as char;
    slice = &slice[1..];
    if c == '-' {
        continue;
    }
    break Some(c.to_digit(16)? as u8);
}

While the compiler might be able to optimize this code sufficiently, a function like u8::to_ascii_digit() would allow us to operate on u8 directly, making the meaning clearer. It would be analogous to the already existing is_ascii_digit() and to_ascii_{lower,upper}case(). In the case of the above code, it would look like this:

let mut slice: &[u8] = ...;
let digit = loop {
    let d = *slice.get(0)?;
    slice = &slice[1..];
    if d == b'-' {
        continue;
    }
    break Some(d.to_ascii_digit(16)?);
}

The advantage is that it's clear we're working on octets the whole time. It avoids the cast to char which made the code look like it handled non-ASCII characters, when it in reality didn't, nor was it supposed to.

@gimbling-away
Copy link
Contributor

@rustbot claim

@gimbling-away
Copy link
Contributor

#96110 should fix this.

@Stargateur
Copy link
Contributor

Stargateur commented Apr 16, 2022

I'm not sure it's a good idea, this could make people confuse what is what.

A u8 is not supposed to be a character, std force the user to convert a u8 to char is a nice way to prevent mistake. IMO the first sniped is more explicit than the second

@gimbling-away
Copy link
Contributor

std force the user to convert a u8 to char is a nice way to prevent mistake. IMO the first sniped is more explicit than the second

Can you elaborate on this part please? I couldn't really get what you mean @Stargateur

@CraftSpider
Copy link
Contributor

I think there's good intuitive symmetry with the other ascii* functions already on u8. With the provided snippets I find both equally explicit, but I think having this method goes well with the other ascii methods already available, and is equally explicit in general.

@camsteffen
Copy link
Contributor

The existing methods u8::is_ascii_digit and u8::Is_ascii_hexdigit don't have a radix argument.

@m-ou-se m-ou-se added C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 28, 2022
@m-ou-se
Copy link
Member

m-ou-se commented Apr 28, 2022

I commented on #96110:

From the name it's not clear to me what direction this conversion goes.

Does b'5'.to_ascii_digit() give me 5? Or does 5u8.to_ascii_digit() give me b'5'?

The 'to ascii' in the name makes me think this returns an ascii character (b'5'), but it seems to do the opposite.

@m-ou-se
Copy link
Member

m-ou-se commented Apr 28, 2022

As indicated on the issue templates, it's usually best to discuss feature requests on https://internals.rust-lang.org/ rather than opening an issue for them.

@gimbling-away
Copy link
Contributor

Proposal to close this since the PR has been closed? [See PR thread for discussion]

@thomcc
Copy link
Member

thomcc commented Jul 25, 2022

In #96110 (comment) there are several reasons this should probably be closed. Further work on this (perhaps in the form of a more fleshed out {core,std}::ascii as mentioned) should be done as an API Change Proposal, rather than in this issue -- see https://std-dev-guide.rust-lang.org/feature-lifecycle/api-change-proposals.html for more info on that if desired.

@thomcc thomcc closed this as completed Jul 25, 2022
@m-ou-se m-ou-se closed this as not planned Won't fix, can't repro, duplicate, stale Jul 25, 2022
@thomcc
Copy link
Member

thomcc commented Jul 25, 2022

(Ah, my bad, thanks)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants