What's the correct wrapping value for `ilog{|2|10}`? #100422

scottmcm · 2022-08-11T19:43:01Z

Filing since it's in pFCP over at #70887 (comment), and I don't think we can reasonably change it after stabilization (especially since it's documented).

Currently, 0.ilog2() in release mode evaluates to 0. (And same for the other bases.)

It's not obvious to me that that's necessarily the best option. For example, the (BITS-1) - LeadingZeros implementation is slightly nicer on CPUs from the past decade or so https://rust.godbolt.org/z/3q6WME84Y (and I think would be just as nice on older ones if LLVM was a bit smarter), but gives -1_{as u32} in the wrapping case.

It also might be nice to have the wrapping case give something clearly different from any of the successful cases, rather than having 1.ilog2() and 0.ilog2() give the same thing. For example u32::MAX.next_power_of_two() returns 0 in release mode, which isn't a power of two (according to is_power_of_two), so it's distinguishable from all the non-wrapping cases.

This only affects ilog/ilog2/ilog10 on uNN. (The checked versions don't have an issue, nor do the ones on NonZeroUNN.)

The text was updated successfully, but these errors were encountered:

scottmcm · 2022-08-11T19:52:56Z

I couldn't find any previous discussion of what the exact wrapping value ought to be, just that there ought to be one, as opposed to a panic!.

There was this note: #80918 (comment)

In this case of logs, checking has to be done anyway

For which I think the ilog2 implementation I mention above is a counter-example, albeit only for the base-2 case.

Also, currently ilog2 in release mode is essentially .checked_ilog2().unwrap_or(0), so we can pick any value we want to return for those cases.

FWIW, the implementation in https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogLookup gives -1 as the result for ilog2(0).

leonardo-m · 2022-08-11T21:21:11Z

In Rust we try to do the right thing. Not even Python gives -1. Let's avoid in-band error values in Rust.

scottmcm · 2022-08-11T21:31:44Z

@leonardo-m What's your proposal for what the "right" thing is, then? That 0.ilog2() panic in release too?

Arguably the most-correct value is −∞ (like (+0.0_f32).log2()), but that doesn't fit in any of our integer types.

So 0.saturating_ilog2() => 0 would make sense, but release mode for the other un-prefixed things is wrapping, not saturating.

leonardo-m · 2022-08-11T21:48:51Z

Python has some functions that return in-band values:

>>> "xy".find("y")
1
>>> "xy".find("z")
-1

From Rust I expect something better. In Rust if a function isn't naturally total and you can't restrict its input, then I expect it to return Option (or Result), or when that's isn't ergonomic enough, a panic (in release too). Returning -1 is dangerous as that Python find function, returning 0 is not math.

cmpute · 2022-08-11T23:46:54Z

I think returning -1 (wrapped if necessary) is a clear distinguish between normal and error result. Returning an Option is not ideal currently because it will makes the function practically unusable in const context (before const unwrap is stabilized)

If getting closest to the "correct" math answer, how about choosing i32::MIN as u32? However it seems weird and -1 still seems better to me

leonardo-m · 2022-08-12T07:03:52Z

I think returning -1 (wrapped if necessary) is a clear distinguish between normal and error result.

Perhaps in other languages. Rust is known for being better than that. I have chosen to use Rust because it's a bit better than Python (see above).

Returning an Option is not ideal currently because it will makes the function practically unusable in const context (before const unwrap is stabilized)

Let's return Option and and wait for const unwrap to be stabilized before making ilog const, then.

scottmcm · 2022-08-12T08:02:42Z

Let's return Option and [...]

That's what 0.checked_ilog2() does already. Is your proposal, then, that the -> u32 version of ilog2 just shouldn't exist on uNN? How do you square that with precedent of other things -- like uNN::next_power_of_two or iNN::abs -- that have the checked_ version return Option, but the un-prefixed one wrap in release mode?

programmerjake · 2022-08-12T08:13:42Z

what about treating it like division by zero and just panicking?

lnicola · 2022-08-12T08:36:27Z

I'd be fine with -1. The point of these methods is to get fast code (otherwise many won't use them and will prefer to do UNN::BITS - x.leading_zeros() - 1 to keep the code branchless and avoid the ugly unwrap()) and a defined value outside of the usual input range. u32::MAX.next_power_of_two() (from above) is a good example.

If you want stricter guarantees than that, you can use NonZeroUNN.

jdahlstrom · 2022-08-12T13:32:24Z

A minor point: u32::MAX.next_power_of_two() being 0 is actually a reasonable choice - even though 0 is not a power of two, it is congruent to the correct power of two modulo 2^32 (which is 2^32 itself), so exactly what is expected in wrapping arithmetic. But unfortunately there's no "−∞ mod 2^32" that could be used for ilog2.

cmpute · 2022-08-12T16:29:42Z

Modular arithmetic is confusing in the context of logarithm, if you want to return the discrete logarithm of 0 in the release mode, then there are a lot of candidates, because (in case of u8) $$2^8 \equiv 2^{16} \equiv 2^{24} \equiv \ldots \equiv 2^{8k} \equiv 0 \;\mathrm{mod} \;256$$ therefore in the sense of modular arithmetic, $\log_2(0)$ could be any multiple of uNN:BITS. If you really want to achieve something close to $-\infty$, then I guess the best value is i32::MIN as u32 as I mentioned above. (Interestingly though, i32::MIN as u32 = $2^{31}$, which is a multiple of 8, 16, 32, 64, 128)

scottmcm · 2022-08-12T19:08:38Z

what about treating it like division by zero and just panicking?

That's certainly a possibility. @m-ou-se suggested not doing that back in the original PR, though: #80918 (comment)

tspiteri · 2022-08-19T16:23:15Z

There was this note: #80918 (comment)

In this case of logs, checking has to be done anyway

For which I think the ilog2 implementation I mention above is a counter-example, albeit only for the base-2 case.

Doesn't leading_zeros implicitly include a check for zero in the code generated? [godbolt]

tspiteri · 2022-08-19T16:38:33Z

Also, since the ilog methods are implemented for signed integers as well, another issue would be the log of negative values. For example for floating-point, 0f32.log2() would return −∞, but (-1f32).log2() would return NaN. So if 0i32.ilog2 would return some value for −∞, would (-1i32).ilog2() panic anyway, or return some value?

jdahlstrom · 2022-08-19T17:23:58Z

So if 0i32.ilog2 would return some value for −∞, would (-1i32).ilog2() panic anyway, or return some value?

I don't think it makes sense to do anything other than panic in the case of negative values, which in turn is probably an argument for panicking on zero as well. I guess one could argue that "wrapping" could be interpreted as wrapping the preimage to fit in the function's domain (rather than just wrapping the image to fit in the range) but IMO that would be entirely unintuitive and unhelpful as well.

After thinking about it a bit, it seems to me that ilog simply cannot have a meaningful wrapping variant, because wrapping semantics (as reasonably interpreted) cannot answer the question "what should the result be for values for which the original function is undefined?" There are no elements in ilog's domain for which the function underflows or overflows.

scottmcm · 2022-08-19T18:12:13Z

@tspiteri

Doesn't leading_zeros implicitly include a check for zero in the code generated? [godbolt]

That's only the case on old chips. As I linked in the OP (#100422 (comment)), that happens in the instruction generation very-last-phase of LLVM on old chips, and on newer chips it knows to emit a different instruction instead of the branching code: https://rust.godbolt.org/z/1KM5zqTMG.

The good news is that it looks like newer LLVM has gotten smarter here. It used to be that, because of how the intrinsic gets translated, it would emit obviously-redundant assembly on old chips in some code patterns, as you could see in Rust 1.16: https://rust.godbolt.org/z/rxPzGdxTY.

joboet · 2022-08-20T08:37:49Z

$\mathrm{ilog2}(x)$ is supposed to find the largest $k \in \mathbb{N}_0$ so that $2^k \leq x$. Since computers use wrapping arithmetic, the equation is better put as $2^k \mod{2^{bits}} \leq x$. The largest $k < 2^{32}$ satisfying this equation in the case $x = 0$ is $2^{32} - 1$ (aka u32::MAX).

So if we consider a wrapping ilog2 to be the inverse of 2.wrapping_pow, u32::MAX makes perfect sense (and is very efficient on modern CPUs, as others have pointed out).

I would also advocate for wrapping_ilog2 to be added as a method which always has this behaviour. ~~ilog2 could then be treated like the other integer operations: panicking on overflow in debug mode, allowing people to catch these errors, and wrapping in release mode, making the operation faster.~~ [Edit: it already behaves that way]

scottmcm · 2022-08-20T09:36:46Z

ilog2 could then be treated like the other integer operations: panicking on overflow in debug mode, allowing people to catch these errors, and wrapping in release mode, making the operation faster.

Note that's already how ilog2 works today. It's just that it wraps to 0 in release right now -- this issue doesn't propose removing the panic in debug behaviour, just to figure out what the best value for release would be.

scottmcm · 2022-08-23T18:19:59Z

Since, from the notes, it seems like the libs-api meeting today said "how about we panic instead of wrap", I'll try to answer my own question from above:

How do you square that with precedent of other things [...]

I would propose the following rule:

Conceptually, the result is first calculated in true ℤ (aka to infinite precision). Then for the non-prefixed versions

If that result is undefined (ilog(0), 1 / 0, etc), then it panics (regardless of compilation mode)
If that result is defined and fits in the result type, it's returned.
If that result is defined but not in-range for the result type, it panics if overflow checking is enabled, and wraps (mod 2^N for uN) if overflow checking is disabled.

I think that's consistent with all the current precedent.

(#100422 (comment) also is good reading about the distinction I'm drawing here.)

programmerjake · 2022-08-23T19:41:08Z

If that result is undefined (ilog(0), 1 / 0, etc), then it panics (regardless of compilation mode)

I would redefine that to include cases where the result is defined but you can't wrap that to the result type, e.g. sqrt(-1) or infinity. that would cover log(0) = -infinity and log(-1) = pi * i

scottmcm · 2022-08-23T22:53:00Z

I agree, but was considering that implied by "in ℤ", as that set includes neither −∞ nor ⅈ.

(Like how f32::log(-0.5) is NAN, not a complex number.)

m-ou-se · 2022-08-30T11:47:56Z

what about treating it like division by zero and just panicking?

That's certainly a possibility. @m-ou-se suggested not doing that back in the original PR, though: #80918 (comment)

I'm not necessarily against panicking in that case. I just pointed out how it's a bit inconsistent with the other operations we have on integers. (Whether it makes sense for the other operations to have this difference in behavior between debug and release mode is an interesting discussion too, but I'm afraid that ship has sailed.)

The reason we don't always panic on overflow for + and - etc. is performance. It's simply too costly to do. (We always panic on division by zero because that's both mathematically undefined (even modulo 2^BITS) and because on some platforms, the native division instruction will signal/abort/trap the program when the right hand side is zero.)

We applied the same reasoning to things like abs and next_power_of_two, although one might reasonably argue that for those it might be fine to always panic, since these are not as common as + and -. (You could then still use checked_*().unwrap_unchecked() if you really want to make sure there's no check, if you're sure it's unnecessary.)

I would propose the following rule: …

That seems reasonable.

It'd be nice if it'd still be easy to get the efficient 32 - some_u32.leading_zeros() calculation through some usage of ilog2 or checked_ilog2 though. It looks like llvm isn't smart enough today to turn some_u32.checked_ilog2().map_or(0, |n| n + 1) into simply a count-leading-zeros and a subtract instruction. :(

m-ou-se · 2022-09-13T19:29:29Z

We discussed this again in the libs-api meeting, and we concluded pretty much the same as @scottmcm's rule, meaning that these functions should panic on zero in all compilation modes, since there's not a modulo-2^N-correct result.

Panic for invalid arguments of `{integer primitive}::ilog{,2,10}` in all modes Decision made in rust-lang/rust#100422 (comment) resolves rust-lang/rust#100422 tracking issue: rust-lang/rust#70887 r? `@m-ou-se`

scottmcm mentioned this issue Aug 11, 2022

Tracking Issue for Integer::{ilog,ilog2,ilog10} #70887

Closed

8 tasks

m-ou-se removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Aug 30, 2022

lukas-code mentioned this issue Oct 2, 2022

Panic for invalid arguments of {integer primitive}::ilog{,2,10} in all modes #102578

Merged

bors closed this as completed in d8091f8 Oct 12, 2022

scottmcm mentioned this issue Aug 13, 2024

Rangeless Shifts rust-lang/libs-team#428

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What's the correct wrapping value for `ilog{|2|10}`? #100422

What's the correct wrapping value for `ilog{|2|10}`? #100422

scottmcm commented Aug 11, 2022

scottmcm commented Aug 11, 2022 •

edited

Loading

leonardo-m commented Aug 11, 2022

scottmcm commented Aug 11, 2022

leonardo-m commented Aug 11, 2022

cmpute commented Aug 11, 2022

leonardo-m commented Aug 12, 2022

scottmcm commented Aug 12, 2022

programmerjake commented Aug 12, 2022

lnicola commented Aug 12, 2022 •

edited

Loading

jdahlstrom commented Aug 12, 2022

cmpute commented Aug 12, 2022 •

edited

Loading

scottmcm commented Aug 12, 2022

tspiteri commented Aug 19, 2022

tspiteri commented Aug 19, 2022

jdahlstrom commented Aug 19, 2022 •

edited

Loading

scottmcm commented Aug 19, 2022 •

edited

Loading

joboet commented Aug 20, 2022 •

edited

Loading

scottmcm commented Aug 20, 2022

scottmcm commented Aug 23, 2022

programmerjake commented Aug 23, 2022 •

edited

Loading

scottmcm commented Aug 23, 2022

m-ou-se commented Aug 30, 2022

m-ou-se commented Sep 13, 2022 •

edited

Loading

What's the correct *wrapping* value for ilog{|2|10}? #100422

What's the correct *wrapping* value for ilog{|2|10}? #100422

Comments

scottmcm commented Aug 11, 2022

scottmcm commented Aug 11, 2022 • edited Loading

leonardo-m commented Aug 11, 2022

scottmcm commented Aug 11, 2022

leonardo-m commented Aug 11, 2022

cmpute commented Aug 11, 2022

leonardo-m commented Aug 12, 2022

scottmcm commented Aug 12, 2022

programmerjake commented Aug 12, 2022

lnicola commented Aug 12, 2022 • edited Loading

jdahlstrom commented Aug 12, 2022

cmpute commented Aug 12, 2022 • edited Loading

scottmcm commented Aug 12, 2022

tspiteri commented Aug 19, 2022

tspiteri commented Aug 19, 2022

jdahlstrom commented Aug 19, 2022 • edited Loading

scottmcm commented Aug 19, 2022 • edited Loading

joboet commented Aug 20, 2022 • edited Loading

scottmcm commented Aug 20, 2022

scottmcm commented Aug 23, 2022

programmerjake commented Aug 23, 2022 • edited Loading

scottmcm commented Aug 23, 2022

m-ou-se commented Aug 30, 2022

m-ou-se commented Sep 13, 2022 • edited Loading

What's the correct wrapping value for `ilog{|2|10}`? #100422

What's the correct wrapping value for `ilog{|2|10}`? #100422

scottmcm commented Aug 11, 2022 •

edited

Loading

lnicola commented Aug 12, 2022 •

edited

Loading

cmpute commented Aug 12, 2022 •

edited

Loading

jdahlstrom commented Aug 19, 2022 •

edited

Loading

scottmcm commented Aug 19, 2022 •

edited

Loading

joboet commented Aug 20, 2022 •

edited

Loading

programmerjake commented Aug 23, 2022 •

edited

Loading

m-ou-se commented Sep 13, 2022 •

edited

Loading