Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update seize to 0.3 #123

Merged
merged 5 commits into from
Apr 21, 2024
Merged

Update seize to 0.3 #123

merged 5 commits into from
Apr 21, 2024

Conversation

ibraheemdev
Copy link
Collaborator

@ibraheemdev ibraheemdev commented Mar 12, 2024

This change is Reviewable

Copy link

codecov bot commented Mar 12, 2024

Codecov Report

Attention: Patch coverage is 71.42857% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 76.83%. Comparing base (48ff2e6) to head (8a2bf22).

❗ Current head 8a2bf22 differs from pull request most recent head 924ff7d. Consider uploading reports for the commit 924ff7d to get more accurate results

Additional details and impacted files
Files Coverage Δ
src/iter/mod.rs 100.00% <ø> (ø)
src/node.rs 73.51% <100.00%> (-0.04%) ⬇️
src/raw/mod.rs 79.72% <ø> (ø)
src/reclaim.rs 86.00% <100.00%> (ø)
src/map.rs 78.24% <42.85%> (+<0.01%) ⬆️

... and 1 file with indirect coverage changes

@ibraheemdev
Copy link
Collaborator Author

ibraheemdev commented Mar 13, 2024

It looks like the Box argument in TreeBin::new was making miri tag it as unique and making it mad when it was dereferenced again during the tree walk through a prev reference. Changing it to take a raw pointer fixes the failures.

@jonhoo
Copy link
Owner

jonhoo commented Mar 16, 2024

Hmm, I think what we're running into here is actually this property, where the expectation is that a Box argument to a function must be valid (and thus its backing memory not aliased) until it is deallocated. In our case, we don't deallocate it, we just convert it into a raw pointer with Box::into_raw and then we stick that raw pointer into the concurrent data structure so that readers can immediately start reading it. In other words, the memory behind the Box is aliased during the execution of the function, which I think violates the LLVM noalias attribute.

Moving to passing in a raw pointer does avoid this problem since it means noalias isn't added, although it's pretty sad that it makes the whole function now be unsafe as the caller has to guarantee the pointer came from Box::into_raw.

What's super interesting is that using the new tree borrows model -Zmiri-tree-borrows also makes Miri not complain (while keeping the Box in the argument type), and this makes me wonder whether I've either confused myself or whether it's just a missing bit in the current tree borrows implementation?

Would you mind splitting out the seize upgrade from this PR so that we can land that while continuing to dig on the Miri side of things?


I'm going to be bold and tag in @RalfJung and @Vanille-N here, just in case they can shed some light on the situation. Very briefly, the question is: is it okay to have a function that does this:

flurry/src/node.rs

Lines 244 to 246 in 48ff2e6

pub(crate) fn new(bin: Box<Linked<BinEntry<K, V>>>, guard: &Guard<'_>) -> Self {
let mut root = Shared::null();
let bin = Shared::from(Box::into_raw(bin));

and then immediately makes the *mut that comes out of Box::into_raw available to other threads? This Miri failure suggests it isn't, and so does my read of this paragraph in the tree borrows doc (since the Box isn't freed), but with -Zmiri-tree-borrows Miri no longer complains about the code in question.

@Vanille-N
Copy link

In other words, the memory behind the Box is aliased during the execution of the function, which I think violates the LLVM noalias attribute.

Unless I misunderstand, the situation you describe isn't actually "aliased memory". At least it definitely doesn't violate noalias w.r.t. the original Box pointer because the pointer that is being shared is a child of that one and thus there is not a protected tag (the original Box pointer) for which a child (the *mut obtained) and a cousin (doesn't exist) are simultaneously accessed.
noalias does not mean "you cannot duplicate this pointer".

That doesn't answer why SB complains about the code, but at least (unless Box::into_raw does some weird stuff with the pointer ?) I don't think TB is expected to forbid this code.

@jonhoo
Copy link
Owner

jonhoo commented Mar 17, 2024

Thank you, that's helpful! You are indeed right, we never create cousins of the *mut (i.e., other children of the Box), only children of it. In a layman's sense I would certainly describe the memory as aliased the moment we share the *mut, but if LLVM's noalias only applies to the pointer while held by the Box, then we're okay. It just bends my brain that sharing the *mut isn't a violation of noalias on the Box since it points to the same memory 😅

Yeah, no idea why SB and TB disagree here.

@ibraheemdev I think I'd recommend undoing the Box -> *mut change then, and instead adding -Zmiri-tree-borrows to MIRIFLAGS. It's listed as "very unstable", but I think it's better to keep the current code to get back to a clean Miri while accepting some potential for false negatives.

@RalfJung
Copy link
Contributor

RalfJung commented Mar 19, 2024

In our case, we don't deallocate it, we just convert it into a raw pointer with Box::into_raw and then we stick that raw pointer into the concurrent data structure so that readers can immediately start reading it. In other words, the memory behind the Box is aliased during the execution of the function, which I think violates the LLVM noalias attribute.

As long as the other threads are using the pointer that was created by into_raw, it's fine. noalias means there can't be any "outside" aliases, but if the function decides to share its noalias pointer with everyone, then everyone can access it. It's just crucial that they use a pointer that is "derived from" (in the provenance sense) the noalias pointer.

is it okay to have a function that does this:

Yes that seems totally fine, as Neven said.

The error indicates that you are using a pointer that is not derived from that bin argument, and that is the actual issue. If you can reproduce this locally without "disable-isolation", I'd recommend using -Zmiri-track-pointer-tag=1060680,1086853 (IDs will need adjustment after the testcase is isolated) to determine the order in which stuff happens here and where the two involved tags come from. I see atomics, is there any concurrency in this test?

The error makes it look like some unknown old pointer X is used to construct the bin argument at Box type, and then later the original pointer X is passed to as_tree_node and used again, which is incompatible with Box's noalias. To find X, Miri's pointer tracking can be used to determine the parent tag of 1086853, up the chain, until we have a tag that's also an ancestor of 1060680.

TB is more tolerant of read-only aliasing with unique pointer (mutable references and Box), so if this bin pointer is not being written to that would explain why it only errors with SB.

@ibraheemdev
Copy link
Collaborator Author

ibraheemdev commented Mar 19, 2024

Yeah, the issue that we are actually aliasing pointers. The outside is constructing a tree and giving it's children back pointers to the root node. It's then turning the root pointer into a Box and passing it to new. new traverses the tree and accesses to the back pointer to the root, which is aliased, as Miri flags.

@jonhoo I'm not sure we should be switching to tree borrows semantics to avoid an issue like this because once the tree is constructed the pointers are aliased, so we shouldn't really be using Box again.

@jonhoo
Copy link
Owner

jonhoo commented Mar 19, 2024

Ahhh, yes, you're right @ibraheemdev. Although the problem isn't the aliasing (I think we're actually doing that part right), it's the provenance. And concurrency isn't part of the problem. To give a bit more detail for Ralf/Neven since this is interesting (and possibly just a shortcoming of SB):

The heap allocated value is originally constructed here:

flurry/src/map.rs

Lines 2780 to 2781 in 48ff2e6

let new_tree_node =
Shared::boxed(BinEntry::TreeNode(new_tree_node), &self.collector);

This happens as part of constructing a doubly-linked list just prior to the call with the Box argument. That code all deals just in *muts. The doubly-linked list is not shared to other threads at this point.

On the first iteration through, head and tail are both set to point to that allocation, with their provenance set as children of the original allocation. Then, when the next element is added to the linked list, the prev pointer of that element is set to the current tail pointer, which is == head. From a provenance perspective, this is then a child of tail (which is a sibling(?) of head). This assignment happens here:

flurry/src/map.rs

Line 2779 in 48ff2e6

new_tree_node.prev.store(tail, Ordering::Relaxed);

After the linked list construction, we now turn head back into a Box:

flurry/src/map.rs

Lines 2801 to 2806 in 48ff2e6

BinEntry::Tree(TreeBin::new(
// safety: we have just created `head` and its `next`
// nodes and have never shared them
unsafe { head.into_box() },
guard,
)),

From an aliasing perspective, not a problem — we have another raw pointer to it, but as long as it isn't dereferenced, things are fine. The Box has provenance as a child of head.

We then pass that Box to TreeBin::new, which immediately calls Box::into_raw on it again (it only takes an argument of type Box to enforce that the caller has indeed heap-allocated the value). The resulting pointer has provenance as a child of the Box. The function then does "some stuff" to turn the linked list into a tree instead (still no concurrent access). The pointer that's a child of the Box gets assigned as the root. We then walk the linked list (forwards), assigning "left" and "right" pointers using the pointers that come out of next/prev pointers. Meaning the root now has provenance as the child of the Box, but all the other pointers in the tree have provenance from the linked list construction.

Once we now go to use this tree for anything, we hit the problem (at least for SB). I think what happens is that Miri sees the (read) access of the root, and concludes that this is access via the provenance of the Box. So, the provenance of that Box must be "alive" (I don't know the right term). But then it sees another (read) access through the .prev pointer of a node in the tree, that that pointer has provenance from the original head (well, tail really). And since that provenance is "before" the Box provenance, SB concludes that the access to head invalidates the access via Box.

Why TB allows this I'm not entirely sure, but it sounds like you're onto something @RalfJung in that it may have sufficient information about the provenance tree here to conclude that it's okay for these provenances to co-exist when all the accesses are reads?


In terms of where we go from here, I suppose I'm okay switching to the *mut argument type and just documenting clearly in both the method signature and all calls that the argument must be/is a value constructed via Shared::boxed.

@RalfJung
Copy link
Contributor

RalfJung commented Mar 20, 2024

Although the problem isn't the aliasing (I think we're actually doing that part right), it's the provenance.

There's no difference between these. Provenance is how aliasing is defined. So I don't know what you mean by this.

If you have a long-lived Box, then because Box is noalias, all other long-lived pointers that are used while the Box is "live" must be derived from that Box. From your description it sounds like this is indeed not the case.

If there are truly only Box and no references anywhere here, you may may want to add your voice in rust-lang/unsafe-code-guidelines#326 arguing for the noalias on Box to be removed.

Why TB allows this I'm not entirely sure, but it sounds like you're onto something @RalfJung in that it may have sufficient information about the provenance tree here to conclude that it's okay for these provenances to co-exist when all the accesses are reads?

My theory is that in the testcase, everything is read-only. TB allows aliasing of noalias pointers as long as there are no writes. If you have other testcases where a similar pattern occurs, but then after the Box is created

  • The original pointer is used for mutation
  • Then a pointer derived from the Box is used for mutation

then there should be UB with TB as well.

@jonhoo
Copy link
Owner

jonhoo commented Mar 20, 2024

Although the problem isn't the aliasing (I think we're actually doing that part right), it's the provenance.

There's no difference between these. Provenance is how aliasing is defined. So I don't know what you mean by this.

I think it's probably a mistake to use these terms interchangeably. While we may define aliasing in terms of provenance, that doesn't mean they are synonyms. Semi-formally, at least in my head, provenance is "knowledge of ow we got here", while aliasing is "a statement about the provenance graph". But in this case, I was using the less-formal meaning of aliasing, namely "are two pointers pointing to the same address" (which yes, I know, is an overly-simple way of looking at the world 😅). So what I meant was that the problem isn't that we have two *mut to the allocation behind a Box while the Box lives (which was my read of @ibraheemdev's comment), it's that after the whole Box dance, we now have *mut of different (and incompatible as per below) provenances to the underlying memory in our data structure.

If you have a long-lived Box, then because Box is noalias, all other long-lived pointers that are used while the Box is "live" must be derived from that Box. From your description it sounds like this is indeed not the case.

If there are truly only Box and no references anywhere here, you may may want to add your voice in rust-lang/unsafe-code-guidelines#326 arguing for the noalias on Box to be removed.

Yes, I think that's right. The various "states of the world" are, in order:

  1. A Box is created.
  2. The Box is turned into a *mut with Box::into_raw.
  3. The *mut from 2 is stored in two places.
  4. A Box is re-created from the first of the *mut from 3. The second *mut is not dereferenced while this Box lives.
  5. A function is called, and passed the Box from 4 as an argument.
  6. The function immediately turns that Box into a *mut with Box::into_raw.
  7. The *mut from 6 and the second *mut from 3 are both dereferenced (read-only).
  8. (eventually) One of the *mut from 7 is turned back into a Box and dropped. This happens only once we guarantee that none of the other *mut will ever be dereferenced again.

I think the question is whether this means "the Box can't be noalias". I suppose in the execution above, Box::from_raw could technically (and legally) do something like relocate the allocation, which would break this entire flow. Which suggests that the mere existence of a Box implies any alias to the allocation cannot be relied upon any more. Which is indeed what noalias implies and what SB tells us. So I think I'm onboard with that (because I believe that's something that should be legal and safe for Box to do).

I suppose TB in this case allows it because it realizes that Box::into_raw and Box::from_raw don't do that? Maybe?

Why TB allows this I'm not entirely sure, but it sounds like you're onto something @RalfJung in that it may have sufficient information about the provenance tree here to conclude that it's okay for these provenances to co-exist when all the accesses are reads?

My theory is that in the testcase, everything is read-only.

Yep, see above.

@RalfJung
Copy link
Contributor

RalfJung commented Mar 20, 2024 via email

@RalfJung
Copy link
Contributor

RalfJung commented Mar 20, 2024 via email

@jonhoo
Copy link
Owner

jonhoo commented Mar 20, 2024

what I meant was that the problem isn't that we have two *mut to the allocation behind a Box while the Box lives
I would say that is exactly the problem. The point is that the Box is live as long as at least one pointer derived from it is live. That is a fundamental ingredient of both Stacked Borrows and Tree Borrows.

Oh interesting, I hadn't thought about it that way. If that is indeed the rule, then yes, totally agree that we violate noalias since we do have other live pointers to the allocation while a child of the Box is also live 👍

I suppose TB in this case allows it because it realizes that Box::into_raw and Box::from_raw don't do that? Maybe?
No, TB allows it because there are no writes.

Ah, yes, I suppose "relocating" is really just one example of a write, but any write indeed.

src/map.rs Outdated
// safety: we have just created `low` and its `next`
// nodes and have never shared them
let low_bin = unsafe {
BinEntry::Tree(TreeBin::new(Box::into_raw(low.into_box()), guard))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my read of the discussion, I think we shouldn't Box::into_raw(low.into_box()) here (and elsewhere) — we shouldn't be re-constructing a Box at all. Instead, we should grab the raw pointer directly with low.0.into_inner() so that we preserve the right provenance. In fact, we should probably remove fn into_box entirely.

src/map.rs Outdated Show resolved Hide resolved
@jonhoo
Copy link
Owner

jonhoo commented Mar 20, 2024

The point is that the Box is live as long as at least one pointer derived from it is live. That is a fundamental ingredient of both Stacked Borrows and Tree Borrows.

Follow-up question on this: how does this work when you have code like this:

let a = Box::new(42);
let b = Box::into_raw(a);
let c = unsafe { Box::from_raw(b) };
drop(c);

c is live, and is derived from b, which is in turn derived from a. Which by your statement implies that a is live. But that violates noalias since we then really have two "live" Boxes to the same allocation?

@RalfJung
Copy link
Contributor

RalfJung commented Mar 20, 2024

Oh interesting, I hadn't thought about it that way. If that is indeed the rule, then yes, totally agree that we violate noalias since we do have other live pointers to the allocation while a child of the Box is also live 👍

Okay looks we found the cause of the misunderstanding then. :)
rust-lang/unsafe-code-guidelines#450 is an issue tracking whether this is the right call. It's about &mut as that's where this happens most often, but also applies to Box.

c is live, and is derived from b, which is in turn derived from a. Which by your statement implies that a is live. But that violates noalias since we then really have two "live" Boxes to the same allocation?

No, this is fine for the same reason reborrowing is fine with &mut (which in a sense also means there are aliases). nolias is not violated precisely because of the parent-child relationship. It's only an aliasing violation when you have two live pointers that are not in a parent-child relationship. To be more precise, this is asymmetric: writing to a pointer means you can't use any of its noalias children or cousins/uncles any more, but you can still use all of its parents.

@ibraheemdev ibraheemdev force-pushed the seize-0.3 branch 2 times, most recently from b34292c to a04005c Compare April 13, 2024 23:17
@ibraheemdev
Copy link
Collaborator Author

ibraheemdev commented Apr 16, 2024

The new Miri failures are unrelated and caused by a recent nightly regression (rust-lang/rust#124013). Everything passes running locally an older Miri.

Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, this looks good to me, thanks!

One interesting question here is whether we can now remove into_box entirely?

src/node.rs Outdated Show resolved Hide resolved
@jonhoo jonhoo merged commit d05dac5 into jonhoo:main Apr 21, 2024
17 checks passed
jonhoo added a commit that referenced this pull request Apr 21, 2024
@jonhoo
Copy link
Owner

jonhoo commented Apr 21, 2024

Releasing in #124

jonhoo added a commit that referenced this pull request Apr 21, 2024
jonhoo added a commit that referenced this pull request Apr 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants