Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add get_unchecked to Option and Result #378

Closed
alion02 opened this issue May 5, 2024 · 7 comments
Closed

Add get_unchecked to Option and Result #378

alion02 opened this issue May 5, 2024 · 7 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@alion02
Copy link

alion02 commented May 5, 2024

Proposal

Problem statement

Prompted by this comment:

Similarly, you can already write:

unsafe { a.checked_mul(b).unwrap_unchecked() }

so it's strange to me that additional functions like unchecked_mul are being added to u32 for this specific operation.

I began searching for concrete examples of situations where using unwrap_unchecked leads to bad codegen. I didn't have to search long.

The problem with unwrap_unchecked is that it's the programming equivalent of littering - with every unwrap we're sprinkling in a condition for LLVM to keep around, even though frequently what we really mean is "I know this is Some/Ok/Err, please give me the contents without checking the variant." In a perfect world, these conditions would just get ignored when irrelevant and cleanly DCE'd, but...

Motivating examples or use cases

type NZ = core::num::NonZeroU32;
type U = u32;

#[no_mangle]
unsafe fn sum_unwrap(a: &[Option<NZ>; 16]) -> U {
    a.iter().map(|&v| v.unwrap_unchecked().get()).sum()
}

godbolt

This innocuous snippet leads to staggeringly bad assembly - 80+ lines of LLVM IR and 30+ ARM instructions... to add together 16 numbers. For reference, a non-vectorized implementation would be 15 adds, 8 pair-loads, and a return. Autovectorized it's just 8 instructions.

Solution sketch

Maybe we should just stop littering.

We can restore good codegen if we express the unwrap in a different way, without invoking unreachable_unchecked; for example, like this:

trait OptionExt {
    type T;
    unsafe fn get_unchecked(self) -> Self::T;
}

impl<T> OptionExt for Option<T> {
    type T = T;

    #[allow(invalid_value)]
    unsafe fn get_unchecked(self) -> T {
        match self {
            Some(v) => v,
            None => core::mem::MaybeUninit::uninit().assume_init(),
        }
    }
}

godbolt

Add the above method (and analogous methods on Result) to core.

Alternatives

Change implementation of unwrap_unchecked

Idly browsing through uses of unwrap_unchecked, I notice that a significant portion (perhaps even majority!) of them probably don't care to keep their conditions around. Worth investigating with benchmarks.

Not convinced it's relevant, but clang does not generate an assume for an std::optional dereference. godbolt

Additionally, the "unchecked" wording sort of implies a lack of checks, which is... well, ostensibly true...

Change current implementation and add new methods

Assuming get_unchecked is on average better than unwrap_unchecked, we might want to replace the functionality for current code and also keep providing the previous functionality for the cases where it is useful. Call it unwrap_assume or something.

Do nothing

This can be implemented in user code just fine, as an extension method. The problem is discoverability - if you're reaching for unwrap_unchecked, you probably care about performance, and with unwrap_unchecked being the only unchecked method on Option/Result you might not think to search further, or consider what the method does under the hood (and whether that's something you want to happen).

Improve LLVM

Presumably a long-term effort. I don't have the necessary knowledge to properly consider this alternative.

@Noratrieb
Copy link
Member

this is just a duplicate of unwrap_unchecked, there's no value in having both

bors added a commit to rust-lang-ci/rust that referenced this issue May 5, 2024
[Experiment] Replace `unreachable_unchecked()` with `uninit().assume_init()` in `unwrap_unchecked()`

[Relevant ACP.](rust-lang/libs-team#378) [Relevant godbolt.](https://godbolt.org/z/WqfcT7Ys9)

Attempt to clean up the LLVM IR we generate based on the assumption that we usually don't want to `assume` the value of the discriminant (because it sticks around in the IR and randomly inhibits optimizations).
@workingjubilee
Copy link
Member

Additionally, the "unchecked" wording sort of implies a lack of checks, which is... well, ostensibly true...

The documentation for Option::unwrap_unchecked says:

Returns the contained Some value, consuming the self value, without checking that the value is not None.

So we do not promise the implementation of Option::unwrap_unchecked is specifically None => unreachable_unchecked().

@kennytm
Copy link
Member

kennytm commented May 5, 2024

You'll get the same optimized ASM using safe functions alone

#[no_mangle]
fn sum_map_or(a: &[Option<NZ>; 16]) -> U {
    a.iter().map(|&v| v.map_or(0, NZ::get)).sum()
}

This works because Option<NonZeroU32> and u32 are layout-wise equivalent and v.map_or(0, NZ::get) must be an identity function in the low-level representation. The special layout of NonZeroU32 is likely also why uninit().assume_init() works while unreachable_unchecked() does not, until LLVM realized such optimization opportunity.

bors added a commit to rust-lang-ci/rust that referenced this issue May 5, 2024
[Experiment] Replace `unreachable_unchecked()` with `uninit().assume_init()` in `unwrap_unchecked()`

[Relevant ACP.](rust-lang/libs-team#378) [Relevant godbolt.](https://godbolt.org/z/WqfcT7Ys9)

Attempt to clean up the LLVM IR we generate based on the assumption that we usually don't want to `assume` the value of the discriminant (because it sticks around in the IR and randomly inhibits optimizations).
bors added a commit to rust-lang-ci/rust that referenced this issue May 6, 2024
[Experiment] Replace `unreachable_unchecked()` with `uninit().assume_init()` in `unwrap_unchecked()`

[Relevant ACP.](rust-lang/libs-team#378) [Relevant godbolt.](https://godbolt.org/z/WqfcT7Ys9)

Attempt to clean up the LLVM IR we generate based on the assumption that we usually don't want to `assume` the value of the discriminant (because it sticks around in the IR and randomly inhibits optimizations).
@scottmcm
Copy link
Member

scottmcm commented May 7, 2024

Have you filed an LLVM bug? Seems worth seeing what they say, even if it's just "yes, this is why assume operand bundles are better than assume calls with values".

This innocuous snippet leads to staggeringly bad assembly - 80+ lines of LLVM IR

Most of which is the assumes, which are not emitted into the machine code. Counting those is disingenuous.

As Nils said, the proposed method is just unwrap_unchecked, and thus isn't worth adding. (Notably, returning undef in LLVM from a function marked noundef on the return is just another way of spelling unreachable now that llvm/llvm-project#60717 got fixed.) Yes, the icmp-assumes aren't great, but range metadata on parameters in LLVM is the way forward for that, to stop needing the assumes. (If you remove them entirely you'll quickly find things like .get() > 0 no longer optimizing, see rust-lang/rust#49572 for some history there.)

That said, I do think there's a place for something here. NonZero::new is the safe code way to do the guaranteed-legal-transmute from u32 to Option<NonZero<u32>>. It would make sense to me to have a safe method for Option<NonZero<u32>>u32 as well.

Then you wouldn't need unsafe code for this at all, since

pub fn sum_unwrap(a: &[Option<NZ>; 16]) -> U {
    a.iter().map(|&v| v.unwrap_or_zero()).sum()
}

would give exactly (https://godbolt.org/z/e5axT9bK8) what you wanted.

@pitaj
Copy link

pitaj commented May 7, 2024

v.unwrap_or_zero()

That's just unwrap_or_default right?

@scottmcm
Copy link
Member

scottmcm commented May 7, 2024

That's just unwrap_or_default right?

No, as in @kennytm 's snippit above it's v.map_or(0, NZ::get). Just unwrap would give the NonZero, which isn't Default.

So yes, there's a safe way, but it's a less-direct way. I like it when we have a clear "this is just the transmute you're about to write in unsafe code" method for things where the transmute would be safe given stable guarantees, because it's easy to link from next to the documentation of the layout guarantee and avoids all the "oh, I didn't think of that" or "but that's so much slower in debug mode" or whatever objections.

@dtolnay
Copy link
Member

dtolnay commented May 19, 2024

Thank you for the ACP, discussion, godbolt links, and draft PRs for benchmarking!

We discussed this ACP in this week's standard library API team meeting. Those present were not convinced that having 2 unsafe Option<T>->T conversions, one using intrinsics::unreachable() and the other MaybeUninit::uninit().assume_init() or something else, was worth having at this time.

But instead we are open to whatever improvements can be made to LLVM, rustc, or the standard library to make the existing unwrap_unchecked lead to better code on average.

We didn't get a chance to evaluate Option<NonZero<_>>::unwrap_or_zero (#378 (comment), #378 (comment)) as a team, but that would be a separate ACP. My personal opinion is that option.map_or(0, NonZero::get) is good enough for this, but it wouldn't surprise me if others on the team feel inversely.

@dtolnay dtolnay closed this as completed May 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

7 participants