-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify Box<T>
representation and its use in FFI
#62514
Conversation
r? @dtolnay (rust_highfive has picked a reviewer for you, use r? to override) |
src/liballoc/boxed.rs
Outdated
@@ -63,6 +63,27 @@ | |||
//! T` obtained from `Box::<T>::into_raw` may be deallocated using the | |||
//! [`Global`] allocator with `Layout::for_value(&*value)`. | |||
//! | |||
//! `Box<SomethingSized>` has the same representation as `*mut SomethingSized`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They would always have the same representation, not just if T is Sized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, sorry. I was just so focused on the FFI aspect of it that started this PR in the first place that I phrased that poorly. A fix is on the way. Thank you!
Box<SomethingSized>
and *mut SomethingSized
have the same representationBox<T>
representation and its use in FFI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably change the parameter type of bar
to Option<Box<Foo>>
so that the Rust code can handle C code calling this function with a null pointer, which would lead to UB. I will update the PR tonight.
This officializes what was only shown as a code example in [the unsafe code guidelines](https://rust-lang.github.io/unsafe-code-guidelines/layout/function-pointers.html?highlight=box#use) and follows [the discussion](rust-lang/unsafe-code-guidelines#157) in the corresponding repository. It is also related to [the issue](rust-lang#52976) regarding marking `Box<T>` `#[repr(transparent)]`.
I am on board with documenting this as a guarantee. Let's get a set of lang team eyes as well: |
Ping from triage @joshtriplett, any updates? |
There is a PR to propose guaranteeing this as part of the UCGs RFC, but that PR has not been merged yet: rust-lang/unsafe-code-guidelines#164 |
cc @rust-lang/lang |
rust-lang/unsafe-code-guidelines#164 is more general and does not address |
Sorry for the delayed review. I have a concern here regarding semantics, rather than representation. If you actually pass a (Also, this should not work for any This seems consistent with the language you've used here. I just want to raise the point explicitly and ask if this will actually work in all cases. If it will, then by all means let's specify this, and it seems like a very useful way to specify ownership transfer of a heap object. |
@rfcbot merge Based on discussion in the lang team meeting, we're confident that this will work, and does indeed represent ownership transfer of a heap object across an FFI boundary. |
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
(Tagging @pnkfelix and @withoutboats as they are on leave.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from a grammatical nit, these changes look good! However, we may want to address the point that @gnzlbg raised, where using Rust types for functions defined in C can (in subtle cases where things are compiled with PGO etc) cause UB. Do you want me to take a stab at how to word that?
src/liballoc/boxed.rs
Outdated
@@ -63,6 +63,50 @@ | |||
//! T` obtained from `Box::<T>::into_raw` may be deallocated using the | |||
//! [`Global`] allocator with `Layout::for_value(&*value)`. | |||
//! | |||
//! So long as `T: Sized`, a `Box<T>` is guaranteed to be represented as a | |||
//! single pointer and is also ABI-compatible with C pointers (i.e. the C type | |||
//! `T*`). This means that you have Rust code which passes ownership of a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//! `T*`). This means that you have Rust code which passes ownership of a | |
//! `T*`). This means that you can have Rust code which passes ownership of a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thank you!
src/liballoc/boxed.rs
Outdated
//! single pointer and is also ABI-compatible with C pointers (i.e. the C type | ||
//! `T*`). This means that you have Rust code which passes ownership of a | ||
//! `Box<T>` to C code by using `Box<T>` as the type on the Rust side, and | ||
//! `T*` as the corresponding type on the C side. As an example, consider this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So at this point, @gnzlbg has convinced me that there is one remaining issue that we may want to address here. In particular, it seems to me that it is only safe to use Rust types if all callers to the function obey the Rust restrictions (e.g., non-null, valid, etc). The easiest way to express that is to say that it is safe to use Rust in types in cases where the function is defined in Rust, but not in cases where the function is defined in C. I'm debating how much detail to go into here and where.
@nikomatsakis Yes, if you don't mind. I am sure your wording will be more accurate than mine. Thank you. |
@nikomatsakis I think you correctly understood the issue.
In C, the behavior of programs containing prototypes that do not exactly match the definition is undefined. When using Rust with xLTO, the only thing that should matter are the semantics of LLVM-IR, but AFAIK these semantics are unfortunately not written anywhere and LLVM-IR does not guarantee yet that declarations that are less narrow than definitions are ok. With features like xLTO it can be very hard for declarations to precisely match definitions at the LLVM-IR level, so I also think that allowing these use cases would make sense. Maybe these issues should be brought up with the LLVM community? |
@gnzlbg in my opinion, this feels like an LLVM bug, I do think it'd be good to raise with them. However, for the time being we can at least advise people to only use Rust types for functions that are defined in Rust. |
OK, I pushed some updates notes. I'm inclined to r+ now but I'll give it a bit to see if anyone has any suggestions on my wording. |
23d7e3a
to
fafa489
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM!
@bors r+ rollup |
📌 Commit fafa489 has been approved by |
LGTM as well. The concern about C prototypes also applies to |
Yes, I think so, and |
@gnzlbg do you plan to write a PR, or should we open an issue to not forget about this? |
Clarify `Box<T>` representation and its use in FFI This officializes what was only shown as a code example in [the unsafe code guidelines](https://rust-lang.github.io/unsafe-code-guidelines/layout/function-pointers.html?highlight=box#use) and follows [the discussion](rust-lang/unsafe-code-guidelines#157) in the corresponding repository. It is also related to [the issue](rust-lang#52976) regarding marking `Box<T>` `#[repr(transparent)]`. If the statement this PR adds is incorrect or a more in-depth discussion is warranted, I apologize. Should it be the case, the example in the unsafe code guidelines should be amended and some document should make it clear that it is not sound/supported.
Rollup of 8 pull requests Successful merges: - #62514 (Clarify `Box<T>` representation and its use in FFI) - #66983 (Fix `unused_parens` triggers on macro by example code) - #67215 (Fix `-Z print-type-sizes`'s handling of zero-sized fields.) - #67230 (Remove irelevant comment on `register_dtor`) - #67236 (resolve: Always resolve visibilities on impl items) - #67237 (Some small readability improvements) - #67238 (Small std::borrow::Cow improvements) - #67239 (Make TinyList::remove iterate instead of recurse) Failed merges: r? @ghost
☔ The latest upstream changes (presumably #67246) made this pull request unmergeable. Please resolve the merge conflicts. |
//! pub struct Foo; | ||
//! | ||
//! #[no_mangle] | ||
//! pub extern "C" fn foo_new() -> Box<Foo> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually a function like foo_new()
requires that the returned pointer be destroyed by calling the foo_delete()
function. Returning a Box
is almost always the wrong thing to do here since Box
doesn't enforce that foo_delete
is called when the Box is dropped. So this example is documenting a pattern that is usually the wrong thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansmith this is a long-merged PR, so new comments here are bound to get lost... please open an issue if you think this needs to be tracked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning a Box is almost always the wrong thing to do here since Box doesn't enforce that foo_delete is called when the Box is dropped.
It's no different than "regular" FFI code where one returns raw pointer and requires the caller to call the corresponding _delete
function with the same pointer, but returning & accepting Box better documents semantics.
If you open a new issue, please cc me, this is a pattern close to my heart :)
This officializes what was only shown as a code example in the unsafe code guidelines and follows the discussion in the corresponding repository.
It is also related to the issue regarding marking
Box<T>
#[repr(transparent)]
.If the statement this PR adds is incorrect or a more in-depth discussion is warranted, I apologize. Should it be the case, the example in the unsafe code guidelines should be amended and some document should make it clear that it is not sound/supported.