Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for Weak::into_raw/from_raw & similar #60728

Closed
1 of 3 tasks
vorner opened this issue May 11, 2019 · 32 comments · Fixed by #72288
Closed
1 of 3 tasks

Tracking issue for Weak::into_raw/from_raw & similar #60728

vorner opened this issue May 11, 2019 · 32 comments · Fixed by #72288
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@vorner
Copy link
Contributor

vorner commented May 11, 2019

Changing this to a tracking issue for the feature:

  • Pull request Weak::into_raw #60766
  • ❓ What to do about dangling Weak (created by Weak::new()) and T: ?Sized? These seem to be incompatible.
  • Stabilization

The original proposal

Hello

The Arc has the into_raw and from_raw methods. I think it would be technically possible to have the same on Weak. Obviously, as Weak is non-owning, it would be up to the caller to make sure the pointer is not dangling when used.

Would adding these (and maybe as_raw too ‒ one can get a reference out of Arc, but not from Weak) require an RFC, because the handling of these methods might be even a bit more delicate than the ones on Arc, or is this considered small enough for just a pull request & later stabilization?

Motivation

I've written the arc-swap crate that allows to keep an Arc around but make it point to some other object atomically. It uses the mentioned methods. It would make sense to have weak version of this atomic storage too.

Additionally, the as_raw would make it possible to compare if eg an Arc and Weak point to the same object (which would also come handy in some alternative of the Cache that doesn't hold the object alive needlessly).

@Centril Centril added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-feature-request Category: A feature request, i.e: not implemented / a PR. labels May 11, 2019
@Centril
Copy link
Contributor

Centril commented May 11, 2019

Would adding these (and maybe as_raw too ‒ one can get a reference out of Arc, but not from Weak) require an RFC, because the handling of these methods might be even a bit more delicate than the ones on Arc, or is this considered small enough for just a pull request & later stabilization?

A PR is sufficient; the motivation can be justified in there.

@vorner vorner changed the title Weak::into_raw/from_raw & similar Tracking issue for Weak::into_raw/from_raw & similar May 19, 2019
Centril added a commit to Centril/rust that referenced this issue May 29, 2019
Weak::into_raw

Hello

This is my first shot at rust-lang#60728. I'd like to consult it a bit before moving further.

The biggest question I have is if this API makes sense. My motivation for it is to be able to store the `Weak` in `AtomicPtr`. For that I don't actually need for the pointer to point to the `T`, any pointer (maybe casted to `usize`) would be good enough, but this mirrors what `Arc` does and could be useful for other things too (like comparing if Arc and Weak point to the same thing without playing with the counts), while some opaque pointer wouldn't.

Some secondary questions, if this is deemed desirable are:
* The weak pointer may be dangling if it is created by `Weak::new()`. It would make sense to treat this as NULL, but that is incompatible with `T: ?Sized` ‒ both `new()` and `ptr::null()` are available only for sized types. The current implementation is therefore also available only for sized `T`s. It would be possible to allow `?Sized` if the API would be `fn into_raw(self) -> Option<NonNull<T>>` and `fn from_raw(NonNull<T>)`, but that's different API than `Arc` has. What would be preferred?
* There's a FIXME in my code about what I suspect could be UB. Is it really UB & how to get the pointer correctly? Is manual offsetting of the pointer the only way?
* Am I missing some other necessary thing around the feature gates and such?
* Is the documentation understandable? I know writing docs is not my strongest skill :-|.

Thinks I'd like to do as part of the PR, but are not yet done:
* Turn the referenced issue into tracking issue for the feature flag.
* Once the `sync::Weak` is considered reasonable, I'd do the equivalent for `rc::Weak`.
* This might call for few more tests than what is currently part of the documentation.
Centril added a commit to Centril/rust that referenced this issue May 29, 2019
Weak::into_raw

Hello

This is my first shot at rust-lang#60728. I'd like to consult it a bit before moving further.

The biggest question I have is if this API makes sense. My motivation for it is to be able to store the `Weak` in `AtomicPtr`. For that I don't actually need for the pointer to point to the `T`, any pointer (maybe casted to `usize`) would be good enough, but this mirrors what `Arc` does and could be useful for other things too (like comparing if Arc and Weak point to the same thing without playing with the counts), while some opaque pointer wouldn't.

Some secondary questions, if this is deemed desirable are:
* The weak pointer may be dangling if it is created by `Weak::new()`. It would make sense to treat this as NULL, but that is incompatible with `T: ?Sized` ‒ both `new()` and `ptr::null()` are available only for sized types. The current implementation is therefore also available only for sized `T`s. It would be possible to allow `?Sized` if the API would be `fn into_raw(self) -> Option<NonNull<T>>` and `fn from_raw(NonNull<T>)`, but that's different API than `Arc` has. What would be preferred?
* There's a FIXME in my code about what I suspect could be UB. Is it really UB & how to get the pointer correctly? Is manual offsetting of the pointer the only way?
* Am I missing some other necessary thing around the feature gates and such?
* Is the documentation understandable? I know writing docs is not my strongest skill :-|.

Thinks I'd like to do as part of the PR, but are not yet done:
* Turn the referenced issue into tracking issue for the feature flag.
* Once the `sync::Weak` is considered reasonable, I'd do the equivalent for `rc::Weak`.
* This might call for few more tests than what is currently part of the documentation.
@cramertj
Copy link
Member

cramertj commented Oct 3, 2019

@rust-lang/libs is there any opposition to stabilizing at least .as_raw()? It's hard to imagine wanting that function looking any different than it does today, and it's pretty useful (I don't know another way to deduplicate weak pointers to the same object).

@vorner
Copy link
Contributor Author

vorner commented Oct 4, 2019

When you bring it up, what needs to be done for stabilizing it all? The API follows the one on Arc, so I don't think any of this could look differently.

There's the problem with unsized Ts ‒ currently, they are simply not supported (because of the interaction with dangling & null pointers). But relaxing the restriction later would be backwards compatible change, so I don't think a decision what exactly to do about that has to block stabilization.

@CAD97
Copy link
Contributor

CAD97 commented Nov 21, 2019

Relaxing the restriction of T to ?Sized I don't think is compatible with the current safety requirements. On top of that, I think that as is currently written, it's self-conflicting.

[...]

It takes ownership of one weak count. In case a null is passed, a dangling Weak is returned.

Safety

The pointer must represent one valid weak count. In other words, it must point to T which is or was managed by an (A)Rc and the weak count of that (A)Rc must not have reached 0. It is allowed for the strong count to be 0.

[...]

The safety requirement requires that the pointer point to a valid ArcInner/RcBox (side note: probably should make those naming consistent). But then it also says that it accepts null? The reason for this is that as_raw/into_raw return null when the strong count is 0 when the weak is fake (self.inner() returns None).

But on top of that, this is wrong (I just realized writing this up)! Because from_raw(into_raw(_)) on a dangling Weak turns it into the singleton dangling Weak, the weak count of the original will never be decreased! The correct behavior is for Weak::into_raw to return the dangling pointer such that Weak::from_raw can recover the original pointer to allocation and decrease the weak count. This then is fully compatible with T: ?Sized as it's doing the exact same as the non-weak version.

As for as_raw... it's "fine" for it to return null as a helpful hint (as dereferencing null is a faster error), but I think it's more important for as_raw(_); forget(_) to be equivalent to into_raw(_).

@vorner
Copy link
Contributor Author

vorner commented Nov 24, 2019

It's been a while since I wrote it, so I went back to have a look. I think it is slightly different than how you describe it. If I read through it correctly, there are three kinds of weak pointer values:

  • Fully valid ones, pointing to RcBox with strong_count() > 0.
  • Dead ones, pointing to RcBox, but with strong_count() = 0. They've run the destructor on the T, but the RcBox is still there.
  • Dangling ones. These were created with Weak::new(). They don't have any RcBox allocated for them and their pointer is set to a sentinel value (usize::MAX casted to pointer).

The into_raw()/from_raw() return null() only for the last case. There's no weak count to speak of in that case, so nothing is forgotten to be decreased.

The code needs to special-case the sentinel value somehow anyway, it can't just add the offset to it, because it would overflow and that is AFAIK forbidden thing to do with pointers. It could also return the sentinel value (usize::MAX), but null seems to be as good sentinel value as that other one, with the difference users recognize it better. It also has proper alignment for the type, which other code might rely on (by putting flags into the low bits). The downside is that pointers to unsized types don't have null() and if I remember correctly, this was the reason why they are not currently supported.

I can try changing the sentinel value and see if I can add support for unsized types instead.

As for the documentation, if I just remove all notions of null and keep it to the fact the pointer must come from previous Weak and not specify what value is returned in case it is from Weak::new(), would it be better?

@CAD97
Copy link
Contributor

CAD97 commented Nov 24, 2019

Ah, I admit, I read the code wrong.

Ok, I agree returning null for "fake" weak pointers is probably the correct sentinel. (The use of usize::MAX as a sentinel for the inner pointer only works because the RcBox/ArcInner has align(usize) > 1, but the T doesn't have that guarantee, so usize::MAX is a theoretically valid pointer. I think a T pointer within an RcBox/ArcInner is guaranteed to have at least align(inner) because of the alignment of the container, but it'd be better to not rely on that.)

But the nonexistence of ptr::null for ?Sized types isn't really an issue. The reason it doesn't exist is the inability for it to conjure pointer metadata out of thin air. We have that!

I don't know the exact way std would accomplish this, but "all" that needs to be done is to pull the metadata off of the inner pointer, and put it onto a null pointer. (And then do the same thing in from_raw.)

For docs, I'd just specify that the pointer given to Weak::from_raw must have come from Weak::as_raw or Weak::into_raw.

@vorner
Copy link
Contributor Author

vorner commented Nov 24, 2019

I've tried updating the docs, I hope it's for the better.

But the nonexistence of ptr::null for ?Sized types isn't really an issue. The reason it doesn't exist is the inability for it to conjure pointer metadata out of thin air. We have that!

I don't know the exact way std would accomplish this, but "all" that needs to be done is to pull the metadata off of the inner pointer, and put it onto a null pointer. (And then do the same thing in from_raw.)

It's not the problem with null pointers per se, because Weak::new exists for Sized types only too. It's simply illegal to put null pointer into into_raw for unsized type for that reason. However the sized implementation needs to check for null equality. It would be fine if this turned out to be false every time for unsized types, but as the null() doesn't even exist, it doesn't compile :-(. I think specialization would help here but AFAIK it's not ready to be used yet. Or is it?

But, after clarifying the documentation, is there anything that would prevent adding that specialization/relaxing the bounds in the future in backwards compatible manner?

@jonas-schievink jonas-schievink added B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC and removed C-feature-request Category: A feature request, i.e: not implemented / a PR. labels Nov 24, 2019
vorner added a commit to vorner/rust that referenced this issue Dec 5, 2019
Clarify it is OK to pass a pointer that never owned a weak count (one
from Weak::new) back into it as it was created from it. Relates to
discussion in rust-lang#60728.
Centril added a commit to Centril/rust that referenced this issue Dec 5, 2019
…tolnay

weak-into-raw: Clarify some details in Safety

Clarify it is OK to pass a pointer that never owned a weak count (one
from Weak::new) back into it as it was created from it. Relates to
discussion in rust-lang#60728.

@CAD97 Do you want to have a look at the new docs?
@vorner
Copy link
Contributor Author

vorner commented Apr 6, 2020

Hello? I know this is probably not a high priority feature, but it also seems somewhat small and I'd really like to push it over the finish line somehow. Is this waiting on something from me? I believe the raised concerns about docs were hopefully addressed.

@Amanieu
Copy link
Member

Amanieu commented Apr 11, 2020

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Apr 11, 2020

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Apr 11, 2020
@Amanieu
Copy link
Member

Amanieu commented Apr 11, 2020

I would prefer if we made no guarantee as to what pointer value is generated for a dangling Weak. This could allow for a slightly more efficient implementation since we could just blindly apply the offset of usize::MAX to convert to/from a raw pointer.

@rfcbot concern dangling Weak are represented using null

The convention for returning raw pointers to a value is as_ptr. There is no precedent for as_raw in the standard library.

@rfcbot concern as_raw should be as_ptr

@CAD97
Copy link
Contributor

CAD97 commented Apr 12, 2020

@Amanieu Consider not only that other types use as_ptr, but that Arc/Rc already have both into_raw and from_raw. I don't think any type actually has both the "as_ptr" and "into_raw" operations yet:

  • as_ptr: slice, str, Cell, RefCell, CStr, MaybeUninit, NonNull, Vec
  • into_ptr: (none)
  • from_ptr: CStr
  • as_raw: (unstable) rc::Weak, sync::Weak
  • into_raw: Box, CString, Rc, Arc, (unstable) rc::Weak, sync::Weak
  • from_raw: ExitStatus, ExitStatusExt, Box, CString, Rc, Arc, Waker, (unstable) rc::Weak, sync::Weak

I think sticking to _raw methods for the Rc family is more consistent than using as_ptr here. Consider also: does from_raw(as_ptr()) look like it should be allowed? from_raw(as_raw()) is clearly using the same "raw representation".

There's also the fact that I would think "as_ptr" would be &*rc, i.e. deref but to a pointer, not a ref. as_raw is more subtle because it keeps mutable provenance over the location, and as such can be used for from_raw, unlike a deref. (The standard library got this wrong previously!)


Some small-ish concerns:

Arc/Rc should have as_raw/as_ptr to mirror the API surface of Weak. Note that &*arc is not enough for as_raw, see the link above where the standard library got this wrong.

@rfcbot concern api should be mirrored on Arc/Rc as well

I want to make sure that these functions on Weak are forwards-compatible with supporting T: ?Sized in the future. When the requirements on pointer metadata for invalid (non-derefable) pointers are clear, the standard library should be able to manufacture some arbitrary pointer metadata for the dangling Weak case (at least for some subset of T: ?Sized).

@rfcbot concern forwards-compatible to T: ?Sized

I don't know if the rfcbot commands will work for me (they probably won't), but might as well try.

@Amanieu
Copy link
Member

Amanieu commented Apr 12, 2020

Fair enough, let's leave the semantics of as_raw as they are right now and just rename it to as_ptr for consistency with the rest of libstd. Since this is clearly something that would be useful for Rc/Arc it would make sense to add a method there as well.

@vorner
Copy link
Contributor Author

vorner commented Apr 13, 2020

Yes, I think that's a good idea. Note that you'll need to use wrapping_offset instead of offset to support the dangling pointers.

While implementing it, I've noticed one thing. The current implementation will return a pointer with invalid alignment. Is that Ok?

Since this is clearly something that would be useful for Rc/Arc it would make sense to add a method there as well.

Here, as part of this feature?

@Amanieu
Copy link
Member

Amanieu commented Apr 13, 2020

I think it's fine to return an unaligned pointer (you're not allowed to dereference it anyways). We should document it however.

Here, as part of this feature?

Sure, might as well.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 19, 2020
…anieu

Address concerns of weak-into-raw

This should address the standing concerns in rust-lang#60728 (comment).

I've still left the ability to create a new dangling pointer from `null`, as I feel like this is the natural behaviour to expect, but I'm fine removing that too. I've modified the documentation to allow the `as_ptr` or `into_ptr` to return whatever garbage in case of a dangling pointer. I've also removed the guarantee to be able to do `from_raw(as_ptr)` from the documentation (but it would still work right now).

I've renamed the method and added implementations for `Rc`/`Arc`.

I've also tried if I can just „enable“ unsized types. I believe the current interface is compatible with them. But the inner implementation will be a bit challenging ‒ I can't use the `data_offset` as is used by `Rc` or `Arc` because it AFAIK „touches“ (creates a reference to) the live value of `T` ‒ and in case of `Weak`, it might be completely bogus or already dead ‒ so that would be UB.

`./x.py test tidy` is completely mad on my own system all over the code base :-(. I'll just hope it goes through CI, or will fix as necessary.

Is it OK if I ask @Amanieu for review, as the concerns are from you?

~r @Amanieu
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 19, 2020
…anieu

Address concerns of weak-into-raw

This should address the standing concerns in rust-lang#60728 (comment).

I've still left the ability to create a new dangling pointer from `null`, as I feel like this is the natural behaviour to expect, but I'm fine removing that too. I've modified the documentation to allow the `as_ptr` or `into_ptr` to return whatever garbage in case of a dangling pointer. I've also removed the guarantee to be able to do `from_raw(as_ptr)` from the documentation (but it would still work right now).

I've renamed the method and added implementations for `Rc`/`Arc`.

I've also tried if I can just „enable“ unsized types. I believe the current interface is compatible with them. But the inner implementation will be a bit challenging ‒ I can't use the `data_offset` as is used by `Rc` or `Arc` because it AFAIK „touches“ (creates a reference to) the live value of `T` ‒ and in case of `Weak`, it might be completely bogus or already dead ‒ so that would be UB.

`./x.py test tidy` is completely mad on my own system all over the code base :-(. I'll just hope it goes through CI, or will fix as necessary.

Is it OK if I ask @Amanieu for review, as the concerns are from you?

~r @Amanieu
@vorner vorner mentioned this issue Apr 25, 2020
5 tasks
@vorner
Copy link
Contributor Author

vorner commented May 5, 2020

May I ask what happens now? I believe I've addressed the concerns and it was merged, but they are still listed here. How are they removed?

@Amanieu
Copy link
Member

Amanieu commented May 5, 2020

@rfcbot resolve dangling Weak are represented using null
@rfcbot resolve as_raw should be as_ptr

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label May 5, 2020
@rfcbot
Copy link

rfcbot commented May 5, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label May 5, 2020
@Kixunil
Copy link
Contributor

Kixunil commented May 15, 2020

Hmm, I don't know about that as_ptr. Forcing the user into using into_ptr is a great lint that would've prevented some bug's I experienced in C++ code had C++ have it. @CAD97 use case is legit, though. Does that one use case justify "removing a lint"?

@vorner
Copy link
Contributor Author

vorner commented May 15, 2020

@Kixunil Can you explain what you mean? It was renamed to as_ptr to hint that it should not be interchangeable with the into_raw. The documentation no longer suggests it should be possible to use as_ptr to re-create it with from_raw and then forget it. So I think your concern was already resolved?

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels May 15, 2020
@rfcbot
Copy link

rfcbot commented May 15, 2020

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@Kixunil
Copy link
Contributor

Kixunil commented May 16, 2020

@vorner I mean not having it at all. :) into_raw is much more useful in my experience. If not for @CAD97 use case, I would believe as_ptr to be completely useless.

@vorner
Copy link
Contributor Author

vorner commented May 16, 2020

There are other use cases. Like, hashing it, equivalence checking, etc.

@Kixunil
Copy link
Contributor

Kixunil commented May 16, 2020

@vorner aren't those cases better served by using a reference and converting it? Seems more obvious what it does.

@vorner
Copy link
Contributor Author

vorner commented May 16, 2020

What reference do you have in mind? &Weak<T> is not useful for these cases (where you want to check if eg. an atomic pointer still points to the same thing as the Weak you have cached or if you want to recreate a new one). You don't get &T out of a Weak and have to upgrade it to the Arc. But that is both more expensive and might return None ‒ but you might want to know that the Weak you have is still the one you should have even if it is no longer valid. The as_ptr on Arc is there mostly for symmetry.

But I don't really understand your issue. What kind of misuse are you afraid of? as_ptr doesn't seem to allow more foot guns than into_raw ‒ you still get a raw pointer out of that and can do whatever you like with it. You also need to use unsafe to do anything dangerous with it, which should be a lint enough to stop people from doing stupid things by accident.

I assume that when the FPC is over, I can enjoy submitting a stabilization PR.

@Kixunil
Copy link
Contributor

Kixunil commented May 17, 2020

You don't get &T out of a Weak

Ah, yeah, I missed that. I had strong ones in mind mostly. Might be a good reason to have ptr_eq and hash_ptr methods instead, but pointer is definitely more flexible. But the docs should probably explain the pointer is useless for anything else than comparison.

I was mostly wondering about as_ptr being confused with into_ptr in case of strong versions. A sad thing about these kinds of bugs is when combined with multithread environment they can be really hard to debug.

(Personal experience: code sending a pointer from one thread to another waiting on select over a Unix pipe, the sender forgot to ref the pointer. Took years to find the cause of mysterious crashes.)

@vorner
Copy link
Contributor Author

vorner commented May 17, 2020

Might be a good reason to have ptr_eq and hash_ptr methods instead

That wouldn't allow customizing how you hash or against what you compare the pointer (I'd say ptr_eq should take another Weak as a parameter and in case of storing the pointer out-of-bound, you have *const T, maybe).

But the docs should probably explain the pointer is useless for anything else than comparison.

But that is not true. The pointer is fully usable as long as you make sure by some other means that the lifetime of the object is still valid, same as with any other pointer.

@Kixunil
Copy link
Contributor

Kixunil commented May 17, 2020

Yeah, it's possible to do it by other means, I just can't imagine a real-life situation in which one would have a Weak and know that it's valid without also having Rc/Arc and using a reference to those instead.

That all being said, it's not a huge deal, I guess. Mainly wanted this to be considered.

JohnTitor pushed a commit to JohnTitor/rust that referenced this issue May 28, 2020
JohnTitor added a commit to JohnTitor/rust that referenced this issue May 28, 2020
…tolnay

Stabilization of weak-into-raw

Closes rust-lang#60728.

There are also two removals of `#![feature(weak_into_raw)]` in the `src/tools/miri` submodule. How should I synchronize the changes with there?

* I can ignore it for now and once this gets merged, update the tool, send a pull request to that one and then reference the changes to rustc.
* I could try submitting the changes to miri first, but then the build would fail there, because the attribute would still be needed.

I think the first one is the correct one, extrapolating from the contributing guidelines (even though they speak about breaking the tools and this should not break it, as extra feature should not hurt).
@bors bors closed this as completed in d472f8e May 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants