
Added possibility to pass uninit arrays to random generator #271

Closed
wants to merge 1 commit into from

Conversation

AngelicosPhosphoros

This can be useful to skip extra work on the caller side, and it is more logical because we don't really need the values to be initialized before we overwrite them with new ones.

@AngelicosPhosphoros
Author

I don't really know how platforms other than Linux and Windows operate, so there may be things I implemented improperly. For example, must the memory we hand to JS, so it can fill it with random bytes, be initialized beforehand?


const BCRYPT_USE_SYSTEM_PREFERRED_RNG: u32 = 0x00000002;

#[link(name = "bcrypt")]
extern "system" {
    fn BCryptGenRandom(
        hAlgorithm: *mut c_void,
-       pBuffer: *mut u8,
+       pBuffer: *mut MaybeUninit<u8>,
@AngelicosPhosphoros
Author
T and MaybeUninit<T> are equivalent for FFI purposes: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#layout

src/lib.rs Outdated
Comment on lines 294 to 299
// Help miri catch mistakes: if the user calls `assume_init()`
// on elements after a failure, miri will report an error.
#[cfg(miri)]
if res.is_err() {
    dest.fill(MaybeUninit::uninit());
}
@AngelicosPhosphoros
Author
I am thinking that maybe this is better placed in the rand crate, letting rand use getrandom_to_uninit even for initialized slices.

@AngelicosPhosphoros
Author

Also related PR: rust-random/rand#1241

@notgull

notgull commented Jul 8, 2022

I like this idea, not only because of the potential utility, but because I think having methods like this to deal with uninit slices should be a pattern throughout the ecosystem.

My only question is whether using ReadBuf would be a better fit.

@josephlr
Member

josephlr commented Jul 8, 2022

Personally, I would want to wait until there's a standard interface in the standard library (ReadBuf), instead of implementing our own non-standard interface to deal with uninitialized buffers.

@josephlr
Member

josephlr commented Jul 8, 2022

Also, as this is explicitly a performance improvement, could you add some benchmarks so that we can see the perf impact of this change (vs. just zeroing the uninit buffer and calling getrandom normally)?

I would guess the best case would be the rdrand implementation, as that's the fastest RNG we have in here IIRC

@AngelicosPhosphoros
Author

> Personally, I would want to wait until there's a standard interface in the standard library (ReadBuf), instead of implementing our own non-standard interface to deal with uninitialized buffers.

It looks good, but it is defined in the std library, not in core, and getrandom can be used in environments without std.

@AngelicosPhosphoros
Author

@josephlr @notgull What do you think about this updated API?

@notgull

notgull commented Jul 8, 2022

I like this!

Is there any particular reason why ReadBuf isn't available in core?

@@ -7,10 +7,12 @@
 // except according to those terms.

 //! Implementation for Nintendo 3DS
+use core::mem::MaybeUninit;
@josephlr
Member

josephlr commented Jul 9, 2022

Note: This increases our MSRV to 1.36 (current is 1.34) so is a breaking change. However, if we gate everything behind the read_buf feature, this shouldn't be an issue, as it will require nightly anyway.

@josephlr
Member

josephlr commented Jul 9, 2022

> Personally, I would want to wait until there's a standard interface in the standard library (ReadBuf), instead of implementing our own non-standard interface to deal with uninitialized buffers.

> It looks good, but it is defined in the std library, not in core, and getrandom can be used in environments without std.

Then I think for now we should add an off-by-default "read_buf" feature that:

  • Depends on the "std" feature
  • Enables the read_buf nightly feature
  • Documents that the feature requires libstd and that it requires a recent-ish nightly

Then we could have code like this in our Cargo.toml:

# Nightly-only implementation with `std::io::ReadBuf` struct.
read_buf = ["std"]

and in lib.rs

#![cfg_attr(feature = "read_buf", feature(read_buf))]

#[cfg(feature = "std")]
extern crate std;
#[cfg(feature = "read_buf")]
use std::io::ReadBuf;

Then everywhere else we can just refer to ReadBuf or crate::ReadBuf. That way if ReadBuf moves to core, we only have to change 1-2 lines.

@josephlr
Member

josephlr commented Jul 9, 2022

@AngelicosPhosphoros before we get too far into this, could you add some benchmarks to show that this is actually faster when dealing with uninitialized buffers? I don't want you to waste your time if it turns out there isn't a notable difference.

Specifically, I would want to compare:

  • Directly reading random data into an uninitialized buffer
  • Initializing a buffer (say with zero), then calling the normal getrandom method

I would recommend focusing on the rdrand implementation or the linux implementation for benchmarking, as those are the two fastest IIRC, so it's where the cost of writing the buffer twice would be the most noticeable. Feel free to add the benchmarks to this PR using #[bench].

If it turns out there's a statistically significant improvement, I think there is a path to incorporating such functionality into getrandom. However, it might be the case that this optimization doesn't matter for getrandom (as it calls the underlying OS RNG, which can be slow compared to memset/bzero). In that case, this optimization might only matter for a userspace CSPRNG, so such uninitialized-buffer code would be best placed in the rand (and similar) crates, rather than this crate.

@josephlr
Member

josephlr commented Jul 9, 2022

See https://godbolt.org/z/8WKn6oGrT for some example implementations using the existing API and the proposed new API.

The generated code confirms that, regardless of how we use the existing API, it basically becomes a call to memset + getrandom_inner, while the new API lets us emit only the call to getrandom_inner.

Now it's just a question of whether getrandom_inner is fast enough for the initial memset to matter.

@josephlr
Member

@AngelicosPhosphoros actually I just realized that we don't need this PR to benchmark the cost of initialization + calling getrandom. #272 adds benchmarks that should be able to measure the cost.

@newpavlov
Member

On Linux and "common" hardware, memset is approximately two orders of magnitude faster than calling getrandom (~10 GB/s vs. ~80 MB/s), so I highly doubt we will be able to see a performance difference on common platforms. The situation may change on constrained targets (e.g. embedded devices).

Either way, I don't think it's worth using ReadBuf and MaybeUninit in our case. A better solution could be something like this:

// It's safe to assume that the buffer gets fully filled if the function returns `Ok(())`
pub unsafe fn getrandom_raw(buf_ptr: *mut u8, buf_len: usize) -> Result<(), Error> { .. }

#[inline(always)]
pub fn getrandom(buf: &mut [u8]) -> Result<(), Error> {
    unsafe { getrandom_raw(buf.as_mut_ptr(), buf.len()) }
}

@josephlr
Member

I think @newpavlov's suggestion in #271 (comment) is the way we will want to go with this. Closing in favor of #279

@josephlr josephlr closed this Aug 29, 2022