CHANGE: Make <SmallRng as SeedableRng>::Seed the same type on 32 and 64 bit platforms
#1285
Comments
Thanks for the detailed issue. I agree with the motivation. We are accepting breaking changes now (though 0.9 could still take a while to arrive). I suggest option (1) and simply discarding excess bytes. Wasteful, but simple and straightforward (one should also not assume that somehow mashing extra bytes together improves the seeding quality in any way). The one problem with this approach would be if someone supplied a seed by converting a small integer into the seed, but I think the documentation is clear that
There is no guarantee that it does anyway, nor probably any benefit. For saving/reloading an RNG, serde is probably the only option. |
Arguably, Note that option 1. and 2. still don't give you reproducibility, which makes me wonder what your use case is? You want determinism (so Another option would be to accept |
@vks has a very good point here: we directly support four types of seeding: So why isn't this good enough? |
I don't need reproducibility for my case. I could have used My use case was getting the However, I've also used
The algorithm changing does not mean they're portability hazards. Plenty of applications do not need exact reproducibility, and it's very sensible to ask users to choose specific RNG algorithm if they care about exact reproducibility. The way I'd think of this is like This is unlike
This seems acceptable, I had considered accepting a For cryptographic RNGs, if you do this though it would have to be done with a cryptographic hash algorithm to preserve the entropy, and I suspect even then they like the guarantee that they get the exactly correct amount of entropy). |
I think my answer is that it's extremely non-obvious that the It is very easy for Rust code to use this, not realize it will cause their code to fail to compile on certain architectures, and cause problems later, perhaps for downstream users (ones who may not even need to use the method being invoked). I will note that while the lack of documentation exacerbates this, no amount of documentation will stop it. It's not clear to me why to even have the SmallRng type if it's not going to abstract away from the underlying RNG anyway — preventing stuff like this seems like it would be the entire point of that type. |
If the algorithm changes on a platform the seed type may change which could break some builds |
Yes, which is what this issue is about... That's not a necessary property of changing the algorithm. |
Interesting point. @thomcc for your tests I recommend using Regarding the APIs then, I feel like perhaps we should consider

```rust
pub trait FromSeed: Sized {
    type Seed: Sized + Default + AsMut<[u8]>;
    fn from_seed(seed: Self::Seed) -> Self;
}

pub trait SeedableRng: Sized {
    fn seed_from_u64(state: u64) -> Self;
    fn from_rng<R: RngCore>(rng: R) -> Result<Self, Error>;
    fn from_entropy() -> Self;
}

impl<R: FromSeed> SeedableRng for R {
    // rand_core provides this ...
}

// direct impl for SmallRng (which is a wrapper type, not a type-def, and does not impl FromSeed)
// (same for StdRng)
impl SeedableRng for SmallRng {
    // ...
}
```
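To make the proposal above more concrete, here is a minimal, std-only sketch of how a blanket impl plus a direct impl on a wrapper type could keep the seed type out of the wrapper's public API. Only the trait names come from the proposal. `Toy16`, `Toy32`, `Inner`, `SmallWrapper` and the LCG-based byte filling are hypothetical stand-ins chosen for illustration, not rand's actual code, and `SeedableRng` is reduced to a single method.

```rust
/// RNGs constructed from a fixed-size byte seed (trait name from the proposal).
pub trait FromSeed: Sized {
    type Seed: Sized + Default + AsMut<[u8]>;
    fn from_seed(seed: Self::Seed) -> Self;
}

/// Reduced stand-in for the proposed `SeedableRng`: no `Seed` type is exposed.
pub trait SeedableRng: Sized {
    fn seed_from_u64(state: u64) -> Self;
}

/// Blanket impl: every `FromSeed` type gets `seed_from_u64` for free.
impl<R: FromSeed> SeedableRng for R {
    fn seed_from_u64(mut state: u64) -> Self {
        let mut seed = <R::Seed as Default>::default();
        // Placeholder expansion (an assumption for this sketch): step a 64-bit
        // LCG and keep the high 32 bits of each step.
        for chunk in seed.as_mut().chunks_mut(4) {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let bytes = ((state >> 32) as u32).to_le_bytes();
            chunk.copy_from_slice(&bytes[..chunk.len()]);
        }
        R::from_seed(seed)
    }
}

/// Hypothetical generator with a 16-byte seed.
pub struct Toy16(pub [u8; 16]);
impl FromSeed for Toy16 {
    type Seed = [u8; 16];
    fn from_seed(seed: Self::Seed) -> Self {
        Toy16(seed)
    }
}

/// Hypothetical generator with a 32-byte seed.
pub struct Toy32(pub [u8; 32]);
impl FromSeed for Toy32 {
    type Seed = [u8; 32];
    fn from_seed(seed: Self::Seed) -> Self {
        Toy32(seed)
    }
}

// The wrapped algorithm differs per target, as with SmallRng.
#[cfg(target_pointer_width = "64")]
type Inner = Toy32;
#[cfg(not(target_pointer_width = "64"))]
type Inner = Toy16;

/// Wrapper that does not implement `FromSeed`, so its API has no `Seed` type.
pub struct SmallWrapper(Inner);

impl SeedableRng for SmallWrapper {
    fn seed_from_u64(state: u64) -> Self {
        SmallWrapper(Inner::seed_from_u64(state))
    }
}
```

With this shape, code written against `SmallWrapper::seed_from_u64` compiles identically on every target, because the target-dependent array length never appears in a signature the caller sees.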
Maybe we should simply use Xoshiro128++ on all platforms? Yes, it will be less optimal on 64-bit platforms, but it will resolve all potential pitfalls regarding reproducibility and portability hazards. Maybe there are lightweight PRNGs which can be implemented without loss of efficiency on both 32 and 64 bit platforms? |
@dhardy I like your proposal! It breaks use cases where you need determinism and unpredictability, but those are arguably better served by using I think it's nice to discourage using |
@newpavlov the main point is to let us replace the algorithm behind
Assuming you mean reproducibility, it doesn't, because for that you need to specify the RNG directly anyway. I'd almost like to drop the |
@dhardy But I guess many users look at So I think the question is: should we modify goals of |
@newpavlov that's another topic, but it's right there in the docs: "The algorithm is deterministic but should not be considered reproducible [..]". Motivating your argument with "But I guess [..]" isn't exactly persuasive. So, no, I don't think we should modify the goals. Also, I think we shouldn't break reproducibility of anything deterministic in a patch release without strong motivation (e.g. a security issue). And none of this has anything to do with this issue unless we commit Rand to never change the algorithms behind those RNGs which I don't want to do (we've changed both RNGs in the past and might find another performance bump or other reason to switch algorithms in the future). |
If the refine RFC were part of the base language, this might not be a problem: essentially, associated types get |
Regarding my idea above, it has a couple of problems:
My feeling is that we should simply not fix this (except with documentation). Acceptable? |
Yes. I think we should explicitly state that users should not rely on stability of |
Summary
Change SmallRng's implementation of SeedableRng so that the Seed is the same type and size on all targets (currently, <SmallRng as SeedableRng>::Seed is a [u8; 32] if cfg(target_pointer_width = "64") and a [u8; 16] if cfg(target_pointer_width = "32")).
This would be a breaking change, which will need to wait for the next semver-major release (currently that would be 0.9).
Details
There are a few ways this may be implemented that would satisfy me.
1. [u8; 32]. This is double the size the RNG actually needs, so for use it would need to be combined somehow (or have a portion discarded). There are many possible ways to do this, and the details do not seem particularly important.
2. [u8; 16]. This is half the size the RNG actually needs, so it would need to be stretched into a [u8; 32] before use. As before, there are ways to do this,
Of these I think 1 or 3 is the best, but do not feel particularly strongly.
The first two have the drawback that the seed would no longer correspond directly to the state (or to the seed of the wrapped RNG). This seems to be allowed by the documentation of SeedableRng::from_seed already, and SmallRng being an abstraction probably gives you leeway to do things like it, but I suppose it is a downside.
The last one has a downside that it may be difficult to find algorithms that run well on 32 and 64 bit platforms (although I don't know that I believe this).
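The first two options can be sketched in a few lines of std-only Rust. This is only an illustration under stated assumptions: `truncate_seed` and `stretch_seed` are hypothetical helper names, and the SplitMix64 mixer used for stretching is one arbitrary choice among the "many possible ways" mentioned above, not something the issue or rand prescribes.

```rust
/// Option 1 (sketch): accept 32 bytes everywhere and keep only the first 16
/// where the underlying RNG needs 16. The excess bytes are discarded.
pub fn truncate_seed(seed: [u8; 32]) -> [u8; 16] {
    let mut out = [0u8; 16];
    out.copy_from_slice(&seed[..16]);
    out
}

/// One SplitMix64 step, used here only to derive extra seed bytes.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Option 2 (sketch): accept 16 bytes everywhere and stretch them to 32.
/// The first half is the caller's seed; the second half is derived from it.
pub fn stretch_seed(seed: [u8; 16]) -> [u8; 32] {
    let mut lo = [0u8; 8];
    let mut hi = [0u8; 8];
    lo.copy_from_slice(&seed[..8]);
    hi.copy_from_slice(&seed[8..]);
    // Fold the 16 input bytes into one 64-bit mixer state.
    let mut state = u64::from_le_bytes(lo) ^ u64::from_le_bytes(hi).rotate_left(32);

    let mut out = [0u8; 32];
    out[..16].copy_from_slice(&seed);
    out[16..24].copy_from_slice(&splitmix64(&mut state).to_le_bytes());
    out[24..].copy_from_slice(&splitmix64(&mut state).to_le_bytes());
    out
}
```

Neither helper changes how much entropy the caller supplied: truncation throws half of it away, and stretching only rearranges the 16 input bytes into a larger state. Both break the one-to-one correspondence between seed and wrapped RNG state, which is the drawback noted above.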
Motivation
rand::rngs::SmallRng's public API currently contains a portability hazard. Specifically, if you use SmallRng::from_seed, you need to pass a [u8; 32] on 64 bit targets and a [u8; 16] on 32 bit ones. Users who are writing portable code must notice this difference and either avoid the function or change the argument passed into from_seed based on a #[cfg].
The fact that a different algorithm is used on different platforms is mentioned in the documentation, but it is not made clear that this impacts the API in a way other than the sequence of values output by the RNG. To me (and I could have misread the intent), the documentation seemed to indicate that the underlying RNG should be considered an implementation detail (which is good: if I wanted a specific RNG implementation, I'd use that directly), implying I shouldn't have to worry about it (beyond not expecting identical results across platforms).
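The hazard is easy to reproduce without rand at all. The sketch below uses a hypothetical `Seed` alias as a stand-in for `<SmallRng as SeedableRng>::Seed`, with the same target-dependent length, and shows one way to build a seed without ever naming that length. `portable_seed` is an illustrative helper, not part of any library.

```rust
// Hypothetical stand-in for `<SmallRng as SeedableRng>::Seed`: the array
// length depends on the target's pointer width.
#[cfg(target_pointer_width = "64")]
pub type Seed = [u8; 32];
#[cfg(not(target_pointer_width = "64"))]
pub type Seed = [u8; 16];

// Not portable: a literal such as `[0u8; 32]` names the length, so a line like
//     let seed: Seed = [0u8; 32];
// compiles on 64-bit targets and fails on 32-bit ones.

/// Portable: starts from `Default` and fills whatever length the target uses.
pub fn portable_seed(tag: u8) -> Seed {
    let mut seed: Seed = Default::default();
    for (i, byte) in seed.iter_mut().enumerate() {
        *byte = tag.wrapping_add(i as u8);
    }
    seed
}
```

This compiles on every target, but the caller has to know to write it this way, and nothing in a 64-bit-only build warns when they do not. That is the sense in which the length difference leaks through the abstraction.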
I hit this as a compilation failure on 32 bit platforms in rust-lang/rust#104658 when updating the Rust standard library's version of rand used in benchmarks and tests, and found it quite surprising. I ended up switching to rand_xorshift to provide the RNG implementation, because it was less of a portability headache (even though I didn't care about the underlying algorithm).
I'd have preferred to continue using SmallRng (and had I noticed seed_from_u64 perhaps I'd have used that), but it still strikes me as pretty odd to have something like this in the public API on what's already intended to be a wrapper that abstracts away from the specific underlying algorithm. In other words, it feels like an abstraction leak.
Given how popular rand is, and the fact that most developer machines are 64 bit these days, I suspect there's some number of rand users who have code which does not compile on 32 bit platforms as a result, which seems unfortunate.
Alternatives
Several alternatives for how to fix this are given under "Details", above. Those all seem pretty good to me, and relatively simple.
Aside from those, another option would be to overhaul the SeedableRng API to help avoid this sort of issue. That may help, but in general this kind of issue can crop up anywhere cfg is used unless care is taken to avoid it... so I don't know that it's an issue with SeedableRng in particular.
Finally, some alternatives which I don't find particularly compelling:
cfg.
P.S. Sorry if this has been discussed, I did a search and it did not come up.