Safety #765
This is the first step for addressing a soundness issue where parallel joins create aliasing mutable references to the storage and where, in regular joins for some storages, previously returned references will be invalidated by calling `get_mut` (at least under stacked borrows afaict). The internals of each storage are adjusted to store components within a `SyncUnsafeCell` to allow handing out mutable references. `SyncUnsafeCell` is a wrapper of `UnsafeCell` that provides `Sync` by default.
Other various details:
* Edge cases with `as` casts on 16-bit and 32-bit platforms addressed to avoid UB in `UnprotectedStorage` impls.
* Safety documentation added to unsafe code usage within `UnprotectedStorage` impls.
* Safety documentation added to unsafe impls of `DistinctStorage` (only in storages.rs).
* Started introduction of `#[deny(unsafe_op_in_unsafe_fn)]` lint in various modules.
* Safety requirements on `UnprotectedStorage::get/get_mut` updated.
* `NullStorage` internals updated to handle ZSTs better (including properly dropping them when `clean` is called and not dropping them in `insert`) and not require `T: Default`.
* In `Storage::insert` add the `id` to the mask after calling `inner.insert()` to protect against unwinding from the `insert` call.
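As a rough illustration of the `SyncUnsafeCell` idea described in this commit, here is a minimal sketch. The PR's actual definition is not shown on this page, so the `T: Sync` bound, the method names, and the threaded usage below are assumptions made for the example.

```rust
use std::cell::UnsafeCell;
use std::thread;

/// Hypothetical stand-in for the `SyncUnsafeCell` wrapper described above.
#[repr(transparent)]
pub struct SyncUnsafeCell<T: ?Sized>(UnsafeCell<T>);

// SAFETY: the wrapper adds no synchronization of its own. Every user of
// `get` must ensure that accesses through the cell neither race nor alias.
// The `T: Sync` bound is an assumption of this sketch.
unsafe impl<T: ?Sized + Sync> Sync for SyncUnsafeCell<T> {}

impl<T> SyncUnsafeCell<T> {
    pub fn new(value: T) -> Self {
        Self(UnsafeCell::new(value))
    }

    pub fn into_inner(self) -> T {
        self.0.into_inner()
    }
}

impl<T: ?Sized> SyncUnsafeCell<T> {
    /// Raw pointer to the contents; dereferencing it is the caller's burden.
    pub fn get(&self) -> *mut T {
        self.0.get()
    }

    /// Safe access when the cell itself is exclusively borrowed.
    pub fn get_mut(&mut self) -> &mut T {
        self.0.get_mut()
    }
}

fn main() {
    let cells = [SyncUnsafeCell::new(1u32), SyncUnsafeCell::new(2u32)];
    thread::scope(|s| {
        let (a, b) = (&cells[0], &cells[1]);
        // SAFETY: each thread touches a different cell, so no access aliases.
        s.spawn(move || unsafe { *a.get() += 10 });
        s.spawn(move || unsafe { *b.get() += 20 });
    });
    let [a, b] = cells;
    assert_eq!((a.into_inner(), b.into_inner()), (11, 22));
}
```

The point of the wrapper is only to let a storage be shared across threads; the obligation to keep accesses disjoint stays with the code that calls `get`.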
…hange how `UnprotectedStorage::clean` works.
* `clean` now always clears all components even if dropping the storage would have dropped them automatically (this helps address an edge case with `DenseVecStorage::insert` overflowing a `u32`).
* Add safety requirement to `clean` that indicates the caller should ensure the mask has been cleared even if unwinding occurs from the `clean` call.
* Ensured uses of `clean` met this requirement.
* Also continue expanding application of `unsafe_op_in_unsafe_fn` lint.
* Fixed typo from previous commit where `FlaggedStorage::get` was using `get_mut` internally.
…oned since we most likely want to transition this to a streaming only storage (which is thus given &mut access).
Implementations and uses of these traits are not yet changed. However, hopefully this is the final form needed to have storages that only support lending/streaming joins while also allowing some storages to support regular `Iterator` like joins and parallel joins as well as allowing storages where `UnprotectedStorage<T>::AccessMut` doesn't implement `DerefMut<Target =T>` (like a planned variant of the flagged storage).
…ed/used: (NOTE: nothing compiling yet since implementations of these traits have not been updated)
* Introduced `LendJoin` trait that is like the lending iterator version of `Join`. This is useful for types that need to return aliasing mutable references from `get` calls with distinct `id`s (e.g. `Entries`, `DerefFlaggedStorage`, `RestrictedStorage`). `LendJoin` uses the `nougat` crate to provide a GAT based API on stable Rust.
* Removed unsound `JoinIter::get`/`JoinIter::get_unchecked` but these methods are present on `JoinLendIter` where they can be soundly implemented.
* Since there is a single `MaybeJoin` type used for all joins, the convenient `.maybe()` method was moved to `LendJoin`, which should be the common denominator of implemented join traits (if we put this method on multiple traits, Rust might start wondering which one you want to call, which isn't convenient...).
* `ParJoin` trait is no longer an empty trait that relies on the implementation in `Join`. The new `ParJoin::get` takes a shared reference so the `ParallelIterator` implementation no longer creates aliasing exclusive references to call `Join::get`.
* `Join` is now an `unsafe` trait to require that the mask/values returned from `Join::open` are properly associated.
* Extended application of `deny(unsafe_op_in_unsafe_fn)` to the `join` module and added safety documentation to calls to unsafe functions there.
* Removed `Clone` implementation for `JoinIter<J> where J::Mask: Clone, J::Value: Clone`. Nothing in the `Join::get` safety requirements implies that this is safe; in the cases where it is safe, the user can just call `.join()` twice for a similar effect.
Other misc changes:
* `BitAnd` helper trait and `MaybeJoin` struct moved to their own files to declutter `join/mod.rs`.
…hese traits and add LendJoin implementations. Compiles again!!!
* Several `Join` implementors were commented out (marked with `D-TODO`) so that I can update them in a separate batch. Want to make sure the changes were working first.
* Remove `where Self: 'next` bound from `LendJoin::Type<'next>` since this was causing issues and is an unnecessary bound.
* Fix several other errors related to usage of `LendJoin`'s GAT.
* Fix other misc errors from the last few commits.
* `deny(unsafe_op_in_unsafe_fn)` now covers the whole crate.
* Add safety comments to unsafe code used in `Generation` methods.
Still need to:
* Implement `SharedGetAccessMutStorage` for relevant storages.
* Update commented out types that implement `Join`.
* Update some safety comments.
…e storage safety comments.
`SharedGetAccessMutStorage` -> `SharedGetMutStorage`, rename `shared_get_access_mut` -> `shared_get_mut`.
* Start work on implementing `LendJoin` and safely re-implementing `Join` for `&ChangeSet`, `&mut ChangeSet`, and `ChangeSet`.
* Add `AccessMut` trait as a replacement for a few cases that were using `DerefMut` (since we don't want to require that `UnprotectedStorage::AccessMut<'a>` has to implement `DerefMut`). IIRC the cases were originally missed because they are behind feature flags.
* Modify `SharedGetMutOnly` to also be generic over the storage type so that we don't have to require `T: Component` (since we were getting the storage type from the associated `Component::Storage`). IIRC this is to support use in `ChangeSet<T>` which doesn't require `T: Component`.
…ngeSet<T> where iterating it removes items.
* Added additional requirement to `Join::get`/`LendJoin::get` that it can not be called multiple times with the same ID.
* Added unsafe `RepeatableLendGet` trait to allow opt-out of this requirement so that a safe `JoinLendIter::get` method can remain exposed.
* Updated relevant safety comments for uses/impls of `LendJoin::get`.
* TODO for next commit: update all uses/impls of `Join::get` to ensure they correspond with the requirement changes.
additional details)
…n issue in `shred`
…lized and bump the MSRV to 1.65.0
* Replace `Join` impl with `LendJoin` (to avoid creating aliasing mutable references to the storage).
* Create new `Storage::not_present_insert` method that requires that the `id` not be present in the mask. This is used by both `Storage::insert` and `VacantEntry::insert` so we can centralize documenting the safety of calling `UnprotectedStorage::insert` and the handling of potential unwinding from `BitSet::add`.
…lated changes:
* `SharedGetMutOnly::get_mut` changed from method to associated function to make its use more apparent (e.g. compared to calling `UnprotectedStorage::get_mut`).
* New requirement added to `ParJoin` trait implementation to facilitate callers of `ParJoin::get` that need to ensure they don't repeat indices.
* `SharedGetMutStorage::shared_get_mut` requirements tweaked to allow calling this in conjunction with `UnprotectedStorage::get` when the `id`s used don't overlap. This facilitates `Join`/`ParJoin` impls for `RestrictedStorage` which can allow getting a component for one entity mutably while immutably getting the component for another entity at the same time.
* Marker types used for restricted storage implementation replaced with producing distinct types for different types of joins: `PairedStorageRead` (for any read only join), `PairedStorageWriteExclusive` (for mutable `LendJoin`), and `PairedStorageWriteShared` (for mutable `Join`/`ParJoin`).
* Renamed `PairedStorage` (which was replaced with the 3 types above) methods `get_unchecked`/`get_unchecked_mut` to `get`/`get_mut` since `unchecked` often is used to indicate some safety requirement hasn't been checked, which isn't the case here. Renamed existing `get`/`get_mut` methods to `get_other`/`get_mut_other`.
* Other misc changes that were missed in previous commits.
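The ordering that `not_present_insert` centralizes (write the component first, mark the `id` in the mask afterwards) can be sketched with a toy storage. This is only an illustration under stated assumptions: the `Storage` type below, its `BTreeSet` mask, and its `Vec<Option<T>>` backing store are hypothetical and are not the `specs` types.

```rust
use std::collections::BTreeSet;

/// Toy storage: `mask` records which ids are present, `inner` holds values.
/// Both fields are hypothetical stand-ins, not the real `specs` internals.
pub struct Storage<T> {
    mask: BTreeSet<usize>,
    inner: Vec<Option<T>>,
}

impl<T> Storage<T> {
    pub fn new() -> Self {
        Self { mask: BTreeSet::new(), inner: Vec::new() }
    }

    /// Inserts `value`, returning the previous component for `id` if any.
    pub fn insert(&mut self, id: usize, value: T) -> Option<T> {
        if self.mask.contains(&id) {
            // Slot already filled: swap in the new value.
            self.inner[id].replace(value)
        } else {
            self.not_present_insert(id, value);
            None
        }
    }

    /// Requires that `id` is not in the mask.
    fn not_present_insert(&mut self, id: usize, value: T) {
        debug_assert!(!self.mask.contains(&id));
        // Step 1: write the component. Growing the vector may panic.
        if self.inner.len() <= id {
            self.inner.resize_with(id + 1, || None);
        }
        self.inner[id] = Some(value);
        // Step 2: only now mark the id as present. If step 1 unwinds, the
        // mask never claims a component that was not actually written.
        self.mask.insert(id);
    }

    pub fn get(&self, id: usize) -> Option<&T> {
        if self.mask.contains(&id) {
            self.inner[id].as_ref()
        } else {
            None
        }
    }
}

fn main() {
    let mut storage = Storage::new();
    assert_eq!(storage.insert(3, "first"), None);
    assert_eq!(storage.insert(3, "second"), Some("first"));
    assert_eq!(storage.get(3), Some(&"second"));
}
```

In the real crate the inner insert is an unsafe call whose contract depends on the mask, which is why keeping the two steps in one documented place is attractive.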
…ect changes in safety requirements.
…oin` for `Drain`.
…hey finish in a reasonable time
… actually guaranteed to abort on failure
…ble implementation.
…sertion into removing the inserted component (mainly to make should_panic test work)
…re Self: 'next" from LendJoin::get
Exciting news! I finally got a chance to update veloren to use this and profile it and I don't see any particular regressions. Might try to post some tracy pictures later. I think this should be ready to merge soon.
Here are some profiling results in veloren, "this trace" is before changes here and "external trace" is after. I focused on two systems. "character_behavior" which has a join over a lot of component types and "phys" which has multiple joins over fewer component types of which some joins are parallel ones. It seems like there are no regressions and potentially a slight improvement (the profiling conditions have room to be more strictly controlled so I would not trust that this improvement is as significant as it appears here).
Here are some notes that will hopefully add some helpful context for reviewers
miri:
  name: "Miri"
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: Install Miri
      run: |
        rustup toolchain install nightly --component miri
        rustup override set nightly
        cargo miri setup
    - name: Install latest nextest release
      uses: taiki-e/install-action@nextest
    - name: Test with Miri
      run: ./miri.sh
CI now includes running miri.
if comp.get().condition < 5 {
    let mut comp = comp.get_mut();
Renamed `get_unchecked` -> `get` on the paired storage types since the old name could imply that an important check is missing, but it is just that this gets the component value for the current entity in a join, so it doesn't need to check that it is present in the storage. Paired storages also allow getting the component for a different entity than the current one; that method was renamed to `get_other`.
The `_unchecked` suffix always reminds me of unsafe access (e.g. https://doc.rust-lang.org/std/primitive.slice.html#method.get_unchecked), so this rename is good.
Update: I later realized that `get` indeed remains unsafe, and the restriction that the item must exist sounds very similar to the unchecked variant of slice. 🤔
> Update: I later realized that get indeed remains unsafe, and the restriction that the item must exist sounds very similar to the unchecked variant of slice.
The method discussed here has always been safe. There are other unsafe methods with the same name that this PR touches, so maybe you are thinking of one of those.
unsafe impl<T> LendJoin for ChangeSet<T> {
    type Mask = BitSet;
    type Type<'next> = T;
    type Value = DenseVecStorage<T>;

    unsafe fn open(self) -> (Self::Mask, Self::Value) {
        (self.mask, self.inner)
    }

    unsafe fn get<'next>(value: &'next mut Self::Value, id: Index) -> Self::Type<'next> {
        // NOTE: This impl is the main reason that `RepeatableLendGet` exists
        // since it moves the value out of the backing storage and thus can't
        // be called multiple times with the same ID!
        //
        // SAFETY: Since we require that the mask was checked, an element for
        // `id` must have been inserted without being removed. Note, this
        // removes the element without affecting the mask. However, the caller
        // is also required to not call this multiple times with the same `id`
        // value and mask instance. Because `open` takes ownership we don't have
        // to update the mask for future uses since the `ChangeSet` is
        // consumed.
        unsafe { value.remove(id) }
The reason we have the unsafe `RepeatableLendGet` marker trait is for this case where we can't get the same spot twice because this removes the value.
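To make the role of the marker trait concrete, here is a simplified sketch using real GATs. The trait `LendGet`, the types `Borrowing` and `Draining`, and the helper `get_checked` are hypothetical names invented for this example; the PR's `LendJoin` has a different shape (associated `Mask`/`Value` types and the `nougat` based GAT).

```rust
/// Hypothetical, simplified stand-in for a lending `get`.
///
/// # Safety
/// Implementors must report presence accurately in `contains`.
pub unsafe trait LendGet {
    type Item<'next> where Self: 'next;

    fn contains(&self, id: usize) -> bool;

    /// # Safety
    /// `contains(id)` must be true and, unless `Self: RepeatableLendGet`,
    /// the same `id` must not be passed more than once.
    unsafe fn get<'next>(&'next mut self, id: usize) -> Self::Item<'next>;
}

/// Marker: calling `get` repeatedly with the same `id` is sound.
///
/// # Safety
/// `get` must not invalidate the slot it returns.
pub unsafe trait RepeatableLendGet: LendGet {}

/// Lends `&mut` references; the slot stays filled, so repeats are fine.
pub struct Borrowing<'a>(pub &'a mut [u32]);

unsafe impl<'a> LendGet for Borrowing<'a> {
    type Item<'next> = &'next mut u32 where Self: 'next;

    fn contains(&self, id: usize) -> bool {
        id < self.0.len()
    }

    unsafe fn get<'next>(&'next mut self, id: usize) -> Self::Item<'next> {
        // SAFETY: the caller guarantees `contains(id)`, i.e. `id` is in bounds.
        unsafe { self.0.get_unchecked_mut(id) }
    }
}

unsafe impl<'a> RepeatableLendGet for Borrowing<'a> {}

/// Moves values out without updating `mask`, like the owned `ChangeSet` impl.
pub struct Draining {
    pub mask: Vec<bool>,
    pub slots: Vec<Option<String>>,
}

unsafe impl LendGet for Draining {
    type Item<'next> = String where Self: 'next;

    fn contains(&self, id: usize) -> bool {
        self.mask.get(id).copied().unwrap_or(false)
    }

    unsafe fn get<'next>(&'next mut self, id: usize) -> Self::Item<'next> {
        // SAFETY: the caller guarantees the slot was filled and has not been
        // requested before, so it is still `Some`.
        unsafe { self.slots[id].take().unwrap_unchecked() }
    }
}
// Deliberately no `RepeatableLendGet` impl for `Draining`.

/// Safe random access, only offered when repeats are sound.
pub fn get_checked<'next, L: RepeatableLendGet>(
    lend: &'next mut L,
    id: usize,
) -> Option<L::Item<'next>> {
    if lend.contains(id) {
        // SAFETY: presence was just checked and `L` allows repeated ids.
        Some(unsafe { lend.get(id) })
    } else {
        None
    }
}

fn main() {
    let mut data = [1u32, 2, 3];
    let mut borrowing = Borrowing(&mut data);
    *get_checked(&mut borrowing, 1).unwrap() += 10;
    *get_checked(&mut borrowing, 1).unwrap() += 10; // same id again: fine
    assert_eq!(data[1], 22);

    let mut draining = Draining {
        mask: vec![true],
        slots: vec![Some("moved out".to_string())],
    };
    // `get_checked(&mut draining, 0)` would not compile: no marker impl.
    // SAFETY: id 0 is in the mask and is requested exactly once.
    let value = unsafe { draining.get(0) };
    assert_eq!(value, "moved out");
}
```

Because `Draining` leaves its mask untouched, a safe repeated lookup would reach an empty slot; withholding the marker keeps that case behind `unsafe`.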
@@ -0,0 +1,57 @@
use hibitset::{BitSetAnd, BitSetLike};
`src/join/bit_and.rs` is just factoring some existing code out of the main join module.
/// Calls a closure on each entity in the join.
pub fn for_each(mut self, mut f: impl FnMut(LendJoinType<'_, J>)) {
    self.keys.for_each(|idx| {
        // SAFETY: Since `idx` is yielded from `keys` (the mask), it is
        // necessarily a part of it. `LendJoin` requires that the iterator
        // doesn't repeat indices and we advance the iterator for each `get`
        // call in all methods that don't require `RepeatableLendGet`.
        let item = unsafe { J::get(&mut self.values, idx) };
        f(item);
    })
}
This is the method that can't work without `nougat` (at least with my best efforts).
/// Used by the framework to quickly join components.
pub trait UnprotectedStorage<T>: TryDefault {
    /// The wrapper through which mutable access of a component is performed.
    #[cfg(feature = "nightly")]
-    type AccessMut<'a>: DerefMut<Target = T>
+    type AccessMut<'a>: AccessMut<Target = T>
I switched this from `DerefMut` to a new `AccessMut` trait to allow types that are not automatically mutably dereferenceable. This is for a flagged storage variant that I'm planning that would be more explicit about when a component is marked as mutated. Most cases aren't affected by this unless they are generic over the component type. `AccessMut` also still has `Deref` as a super trait like `DerefMut` does.
/// Used by the framework to mutably access components in contexts where
/// exclusive access to the storage is not possible.
pub trait SharedGetMutStorage<T>: UnprotectedStorage<T> {
The base `UnprotectedStorage` trait means that we can implement `LendJoin` for it. `SharedGetMutStorage` enables implementing `Join`. And finally `DistinctStorage` enables `ParJoin`.
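The difference between the exclusive path and the shared path can be sketched with a toy storage that wraps each component in `UnsafeCell`. `CellStorage` and its methods are hypothetical names for this example and only mirror the idea of `shared_get_mut`, not the actual trait signatures in `specs`.

```rust
use std::cell::UnsafeCell;

/// Toy storage whose components live in `UnsafeCell`s (hypothetical type).
pub struct CellStorage<T>(Vec<UnsafeCell<T>>);

impl<T> CellStorage<T> {
    pub fn new(items: Vec<T>) -> Self {
        Self(items.into_iter().map(UnsafeCell::new).collect())
    }

    /// Exclusive path: needs `&mut self`, so only one component can be
    /// borrowed at a time. This is all a lending join requires.
    pub fn get_mut(&mut self, id: usize) -> &mut T {
        self.0[id].get_mut()
    }

    /// Shared path: takes `&self`, so several components can be borrowed
    /// mutably at once, as a non-lending join needs.
    ///
    /// # Safety
    /// No other reference to the component at `id` may be live while the
    /// returned reference is in use.
    #[allow(clippy::mut_from_ref)]
    pub unsafe fn shared_get_mut(&self, id: usize) -> &mut T {
        // SAFETY: the caller guarantees exclusive access to this slot, and
        // `UnsafeCell` permits mutation behind a shared reference.
        unsafe { &mut *self.0[id].get() }
    }
}

fn main() {
    let mut storage = CellStorage::new(vec![1, 2, 3]);
    {
        let shared = &storage;
        // SAFETY: ids 0 and 2 are distinct, so the references don't alias.
        let (first, last) =
            unsafe { (shared.shared_get_mut(0), shared.shared_get_mut(2)) };
        std::mem::swap(first, last);
    }
    assert_eq!(*storage.get_mut(0), 3);
    assert_eq!(*storage.get_mut(2), 1);
}
```

Holding `first` and `last` simultaneously is what a regular `Iterator` style join does with yielded items, which is why that kind of join needs the shared path.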
src/storage/restrict.rs
Outdated
pub struct PairedStorageWriteShared<'rf, C: Component> {
    index: Index,
    storage: SharedGetOnly<'rf, C, C::Storage>,
}

impl<'rf, 'st, C, S, B, Restrict> PairedStorage<'rf, 'st, C, S, B, Restrict>
With the additional complexity of adding the `LendJoin` case it was much easier to have several separate paired storage types rather than using generic marker parameters.
/// });
/// b.get_mut();
/// ```
fn _dummy() {}
compile_fail test to check for mistakes
impl<T> UnprotectedStorage<T> for NullStorage<T>
where
    T: Default,
`NullStorage` no longer requires `T: Default`, since we can soundly conjure up ZSTs provided that they represent instances that this has previously taken ownership of.
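A minimal sketch of that idea follows, assuming a hypothetical `ZstSlots` type with a `Vec<bool>` presence list; the real `NullStorage` is not shown on this page and may differ in every detail. The storage forgets each inserted value instead of dropping it, and later reads a value back out of a dangling but aligned pointer, which is valid for zero-sized types.

```rust
use std::marker::PhantomData;
use std::mem;
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical storage for zero-sized components; no `T: Default` needed.
pub struct ZstSlots<T> {
    present: Vec<bool>,
    _marker: PhantomData<T>,
}

impl<T> ZstSlots<T> {
    pub fn new(len: usize) -> Self {
        assert_eq!(mem::size_of::<T>(), 0, "only zero-sized types are supported");
        Self { present: vec![false; len], _marker: PhantomData }
    }

    pub fn insert(&mut self, id: usize, value: T) {
        assert!(!self.present[id], "slot already occupied");
        // Take ownership without running `Drop`: the instance now lives
        // "inside" the storage even though it occupies no memory.
        mem::forget(value);
        self.present[id] = true;
    }

    pub fn get(&self, id: usize) -> Option<&T> {
        if self.present[id] {
            // SAFETY: `T` is zero-sized, so a dangling aligned pointer is
            // valid for it, and an instance was forgotten in `insert`.
            Some(unsafe { NonNull::<T>::dangling().as_ref() })
        } else {
            None
        }
    }

    pub fn remove(&mut self, id: usize) -> Option<T> {
        if mem::replace(&mut self.present[id], false) {
            // SAFETY: as in `get`; clearing the flag first means each
            // forgotten instance is conjured back at most once.
            Some(unsafe { NonNull::<T>::dangling().as_ptr().read() })
        } else {
            None
        }
    }
}

impl<T> Drop for ZstSlots<T> {
    fn drop(&mut self) {
        // Drop every instance the storage still owns.
        for id in 0..self.present.len() {
            drop(self.remove(id));
        }
    }
}

static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Zero-sized marker that counts how often it is dropped.
struct Marker;

impl Drop for Marker {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let mut slots = ZstSlots::new(2);
    slots.insert(0, Marker);
    slots.insert(1, Marker);
    assert_eq!(DROPS.load(Ordering::SeqCst), 0); // `insert` does not drop
    drop(slots.remove(0));
    drop(slots); // the remaining instance is dropped with the storage
    assert_eq!(DROPS.load(Ordering::SeqCst), 2);
}
```

The drop counter shows that each inserted instance is dropped exactly once, either on removal or when the storage is cleaned up.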
hard to check such a massive PR. i had a look through all of it and am positive that its an improvement. LGTM
@xMAC94x thanks for taking a look ❤️ I will wait till this weekend in case anyone else is interested in reviewing. If anyone interested needs more time just let me know.
I just wanted to say: thanks for pushing this all the way to completion :) I would have reviewed, but I've been busy with other things recently. Nice work!
Fixes #647
See API changes section for description.
Checklist
Todo
* `LendJoin` uses `nougat` since I started working on it before GATs stabilized. I kept it because I thought it might be useful to add more iterator combinators (some combinators are currently impossible to implement with actual GATs). However, I think it would probably be prudent to remove this and use regular GATs for simplicity. So I wanted to remove `nougat`, but it turns out that I can't implement `for_each` without it (without either running into requiring `Self: 'static` due to the `for<'_>` HRTB or not being able to have lifetimes in the GAT (`LendJoin::Type<'next>`) that aren't `'next` and aren't `'static`).
* Benchmark `specs` using `ecs_bench_suite` (there is maybe a 3% increase in `add_remove`; there was a significant increase (10+%) in deletion times due to `shred::MetaTable` changes, so I introduced a `nightly` feature that enables a more efficient implementation using `ptr_metadata`; `fetch` heavy code shows improvements due to the use of `atomic_refcell` in `shred` instead of the custom atomic cell implementation).
* Publish `hibitset` and `shred` to crates.io and use those instead of git dependencies.

API changes
This is a breaking change.

This PR fixes several soundness issues that I encountered when working on #737. These include:
* `Join` implementation for `specs::storage::Entries` allows creating aliasing `&mut` references to the underlying storage (i.e. when not using the `JoinIter` as a lending/streaming iterator).
* `Join` implementation and API of `specs::storage::RestrictedStorage` similarly can create mutable aliasing references to storages as well as to a specific component.
* `Join` implementation of `Storage` allowed `DerefFlaggedStorage` to create aliasing `&mut` references to the internal events channel.
* `JoinIter::get` allows creating aliasing mutable refs, see "JoinIter::get allows mutable aliasing without the user writing any unsafe code." #647.
* … `&mut Join::Value` references.

Many (but not all) of these issues stem from artificially lengthening the lifetime of the `&mut Join::Value` within `Join::get` implementations. We now avoid this and leverage alternative mechanisms including interior mutability and lending iteration.

In more detail, we:
* Add a `LendJoin` trait for lending (aka streaming) iteration of joined values.
* `Entries` now only implement `LendJoin` (i.e. remove unsound `Join` implementations).
* Implement `LendJoin` for everything that implements `Join` (manually, not a blanket impl since there are differences in the implementation).
* Move `JoinIter::get` to `JoinLendIter::get`.
* Add a `RepeatableLendGet` trait (which `JoinLendIter::get` requires). This is to facilitate destructive implementations of `LendJoin` where getting the same index more than once would be unsound (i.e. literally just the owned implementation of `LendJoin` for `ChangeSet` which removes values as it iterates).
* Change the `ParJoin` trait to not rely on the `Join` implementation so we have additional flexibility to use different implementations and keep `&mut Join::Value` in `Join::get`.
* Make `Join` an `unsafe` trait. The new `LendJoin` trait is also `unsafe` to implement.
* `Storage` no longer implements mutable `Join`s (the non-lending variant) for all `UnprotectedStorage`s. Instead this implementation requires that the storage implements `SharedGetMutStorage`. This trait provides a `shared_get_mut(&self, id: Index)` method. Notice this takes a shared reference. Storages that can soundly implement this wrap their components in `UnsafeCell` to allow constructing a mutable reference from a shared reference.
* Enable `#![deny(unsafe_op_in_unsafe_fn)]` to make it easier to identify and document unsafe operations.

To try to catch any remaining UB, I ran the available tests under Miri. This also identified some issues in dependencies for which I have submitted PRs to fix:
(We need to publish new versions of these to crates.io)
This PR adds Miri to the CI.
Additionally, `specs` exposed a `nightly` cargo feature that enabled additional APIs using GATs. Since GATs are now stabilized, I bumped the MSRV to be able to eliminate this feature and remove a bunch of `cfg` complexity.

Also, I introduced an `AccessMut` trait which is similar to `DerefMut` except it requires explicit use. The associated type `UnprotectedStorage::AccessMut<'_>` now requires `AccessMut` instead of `DerefMut`. This is to facilitate my work in #737 where I am exploring a flagged storage type that makes generating modification events more explicit. A blanket implementation of `AccessMut` for anything implementing `DerefMut` is included.
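A sketch of how such a trait and its blanket implementation might look is given below. The method name `access_mut`, the `Flagged` wrapper, and the `bump` helper are assumptions made for illustration; only the trait name, the `Deref` supertrait, and the existence of a blanket impl for `DerefMut` types come from the description above.

```rust
use std::ops::{Deref, DerefMut};

/// Like `DerefMut`, but mutable access must be requested explicitly.
/// The method name `access_mut` is an assumption of this sketch.
pub trait AccessMut: Deref {
    fn access_mut(&mut self) -> &mut Self::Target;
}

/// Blanket impl: anything that is `DerefMut` is also `AccessMut`.
impl<A: DerefMut> AccessMut for A {
    fn access_mut(&mut self) -> &mut Self::Target {
        DerefMut::deref_mut(self)
    }
}

/// Hypothetical flagged wrapper: readable via `Deref`, but it does not
/// implement `DerefMut`, so every mutation goes through `access_mut`.
pub struct Flagged<'a, T> {
    pub value: &'a mut T,
    pub modified: &'a mut bool,
}

impl<T> Deref for Flagged<'_, T> {
    type Target = T;

    fn deref(&self) -> &T {
        &*self.value
    }
}

impl<T> AccessMut for Flagged<'_, T> {
    fn access_mut(&mut self) -> &mut T {
        // Record the modification at the moment mutable access is requested.
        *self.modified = true;
        &mut *self.value
    }
}

/// Code generic over the wrapper works with both kinds of access type.
pub fn bump<A: AccessMut<Target = u32>>(mut access: A) {
    *access.access_mut() += 1;
}

fn main() {
    // A plain `&mut u32` qualifies through the blanket impl.
    let mut plain = 1u32;
    bump(&mut plain);
    assert_eq!(plain, 2);

    let (mut value, mut modified) = (5u32, false);
    let flagged = Flagged { value: &mut value, modified: &mut modified };
    assert_eq!(*flagged, 5); // reading does not set the flag
    bump(flagged);
    assert!(modified);
    assert_eq!(value, 6);
}
```

Since `Flagged` is a local type without a `DerefMut` impl, its manual `AccessMut` impl does not overlap with the blanket one.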