-
Notifications
You must be signed in to change notification settings - Fork 706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add option to use DST structs for flex arrays #2772
Conversation
|
1f9175c
to
a56d13d
Compare
This doesn't handle the case of a C struct pointing to a DST object (ie thin vs fat pointers). |
187c2e3
to
31b4fc8
Compare
I've updated the API to:
I've been using this API for one of my own projects, and so far it feels pretty comfortable. I also changed the generation of PhantomData fields for type parameters to put them at the start of the structure. This was necessary as putting them at the end interfered with the FAM field which must be last. The side-effect of this is a lot of test fixtures changed in a trivial way. |
9cc3831
to
a19d291
Compare
bindgen/codegen/mod.rs
Outdated
@@ -1734,6 +1747,7 @@ impl<'a> FieldCodegen<'a> for BitfieldUnit { | |||
accessor_kind, | |||
parent, | |||
parent_item, | |||
idx == bfields.len() - 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, shouldn't this be: last_field && idx == ..
? Otherwise, what guarantees that there isn't a non-bitfield field after the bitfields?
bindgen/ir/comp.rs
Outdated
}; | ||
|
||
// XXX correct with padding on end? | ||
match fields.last() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can simplify by using fields.last()?
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant that something like this:
diff --git a/bindgen/ir/comp.rs b/bindgen/ir/comp.rs
index 31c058ea..f6c9e629 100644
--- a/bindgen/ir/comp.rs
+++ b/bindgen/ir/comp.rs
@@ -834,9 +834,9 @@ impl CompFields {
CompFields::Error => return None, // panic?
};
- match fields.last() {
- None | Some(Field::Bitfields(..)) => None,
- Some(Field::DataMember(FieldData { ty, .. })) => ctx
+ match fields.last()? {
+ Field::Bitfields(..) => None,
+ Field::DataMember(FieldData { ty, .. }) => ctx
.resolve_type(*ty)
.is_incomplete_array(ctx)
.map(|item| item.expect_type_id(ctx)),
Should compile and work the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I parsed that ? as a question rather than an operator.
bindgen/codegen/mod.rs
Outdated
fn clone(&self) -> Self { *self } | ||
} | ||
}); | ||
} | ||
|
||
if needs_flexarray_impl { | ||
let prefix = ctx.trait_prefix(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this out to its own function or such? I feel like that'd make it easier to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, happy to do that. But the rest of the function is so enormous - do you think it should all be factored down?
@@ -9,10 +9,10 @@ pub type DoesNotUse_Typedefed<U> = U; | |||
#[repr(C)] | |||
#[derive(Debug, Copy, Clone)] | |||
pub struct DoesNotUse_IndirectUsage<T, U> { | |||
pub member: DoesNotUse_Aliased<T>, | |||
pub another: DoesNotUse_Typedefed<U>, | |||
pub _phantom_0: ::std::marker::PhantomData<::std::cell::UnsafeCell<T>>, | |||
pub _phantom_1: ::std::marker::PhantomData<::std::cell::UnsafeCell<U>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you send the "move phantom fields to the beginning" as a separate PR? That's fine on its own, and would make this PR a lot easier to review (GitHub's UI for this is not my favourite).
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK updated with comments (and still on top of #2783) |
☔ The latest upstream changes (presumably 3b5ce9c) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pvdrz do you recall if we have a way of annotating a feature flag as experimental? This is mostly usable only with nightly rust...
But anyways this looks good to me in principle. Need to do a deeper review of the implementation but didn't look bad on my previous skim :)
bindgen/ir/comp.rs
Outdated
}; | ||
|
||
// XXX correct with padding on end? | ||
match fields.last() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant that something like this:
diff --git a/bindgen/ir/comp.rs b/bindgen/ir/comp.rs
index 31c058ea..f6c9e629 100644
--- a/bindgen/ir/comp.rs
+++ b/bindgen/ir/comp.rs
@@ -834,9 +834,9 @@ impl CompFields {
CompFields::Error => return None, // panic?
};
- match fields.last() {
- None | Some(Field::Bitfields(..)) => None,
- Some(Field::DataMember(FieldData { ty, .. })) => ctx
+ match fields.last()? {
+ Field::Bitfields(..) => None,
+ Field::DataMember(FieldData { ty, .. }) => ctx
.resolve_type(*ty)
.is_incomplete_array(ctx)
.map(|item| item.expect_type_id(ctx)),
Should compile and work the same.
yeah we literally have the |
Does |
Fixed:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks fine. Sorry, last week has been short due to easter holidays :)
☔ The latest upstream changes (presumably fb39a30) made this pull request unmergeable. Please resolve the merge conflicts. |
Please rebase and ping, happy to merge. Hopefully it's just a matter of regenerating test expectations. Sorry, this didn't get into the merge queue because it got configured right after I pressed the "Merge when ready" button :/ |
Do you want me to squash the commits? |
This option uses Rust DST types to model C structures with flexible array members. For example, if you declare: ```c struct record { int len; float values[]; } ``` this means it's a C structure with a well-defined prefix, but a tail of `values` which depends on how much memory we've allocated for it. We can model this in Rust with: ```rust struct record { len: c_int, values: [f32], } ``` which means more or less the same thing - there's a type which has a known prefix, but its suffix is not directly known.
There are some tests which have two incomplete array fields; only the very last one is valid as a flexible array member (in C) or can be a DST field in Rust.
Put the pointer converting methods on the sized prefix types ([T; 0]) since they're the default type and what you're likely to be starting with. Emphasize the ones which work on references since they have safe lifetimes, but also have raw pointer variants, esp for handling uninitialized cases. The flex types have methods which return the sized type along with the length.
Previously it was generating inner `unsafe` blocks for all unsafe functions in conformance with Rust 2024, but now only do it when `--wrap-unsafe-ops` is enabled for consistency with other generated code. Also rename `flex_mut_ref` -> `flex_ref_mut` to make it consistent with `flex_ptr_mut` and general Rust convention.
The conversions between fixed and dynamically sized forms are essentially type-level transforms which should have trivial implementations in terms of generated code, so there's no reason not to make them inline.
Nah, all good, thanks! |
Summary: [PR 2772](rust-lang/rust-bindgen#2772) adds the --flexarray-dst option to handle C Flexible Array Members as Rust dynamically sized types. Still pending review, but discussion has been positive. Full details: Add `--flexarray-dst` option to allow use of a dynamically sized struct to model a C flexible array member. For example, given the C struct: ``` struct msg { unsigned int tag; unsigned int len; char payload[]; }; ``` generate the corresponding Rust type: ```rust #[repr(C)] #[derive(Debug)] pub struct msg<FAM: ?Sized = [::std::os::raw::c_char; 0]> { pub tag: ::std::os::raw::c_uint, pub len: ::std::os::raw::c_uint, pub payload: FAM, } ``` This single definition is used for both Sized (with a zero-length array as the FAM type) and DST (with a slice). This also generates the helper methods to convert a raw pointer into a Rust view of the C type, with a given length. ```rust // Sized implementations impl msg<[::std::os::raw::c_char; 0]> { // Construct a DST reference from a Sized object. // SAFETY: there must be at least `len` initialized elements in the underlying storage pub unsafe fn flex_ref( &self, len: usize, ) -> &msg<[::std::os::raw::c_char]> { unsafe { Self::flex_ptr(self, len) } } // Same, but mutable pub unsafe fn flex_mut_ref( &mut self, len: usize, ) -> &mut msg<[::std::os::raw::c_char]> { unsafe { Self::flex_ptr_mut(self, len).assume_init() } } // Raw pointer variants pub unsafe fn flex_ptr<'unbounded>( ptr: *const Self, len: usize, ) -> &'unbounded msg<[::std::os::raw::c_char]> { unsafe { &*::std::ptr::from_raw_parts(ptr as *const (), len) } } pub unsafe fn flex_ptr_mut<'unbounded>( ptr: *mut Self, len: usize, ) -> ::std::mem::MaybeUninit<&'unbounded mut msg<[::std::os::raw::c_char]>> { unsafe { let mut uninit = ::std::mem::MaybeUninit::< &mut msg<[::std::os::raw::c_char]>, >::uninit(); (uninit.as_mut_ptr() as *mut *mut msg<[::std::os::raw::c_char]>) .write(::std::ptr::from_raw_parts_mut(ptr as *mut (), len)); uninit } } } // DST implementations impl msg<[::std::os::raw::c_char]> { // Memory size & alignment for allocation pub fn layout(len: usize) -> ::std::alloc::Layout { unsafe { let p: *const Self = ::std::ptr::from_raw_parts(::std::ptr::null(), len); ::std::alloc::Layout::for_value_raw(p) } } // return the Sized variant along with the length pub fn fixed(&self) -> (&msg<[::std::os::raw::c_char; 0]>, usize) { unsafe { let (ptr, len) = (self as *const Self).to_raw_parts(); (&*(ptr as *const msg<[::std::os::raw::c_char; 0]>), len) } } pub fn fixed_mut( &mut self, ) -> (&mut msg<[::std::os::raw::c_char; 0]>, usize) { unsafe { let (ptr, len) = (self as *mut Self).to_raw_parts(); (&mut *(ptr as *mut msg<[::std::os::raw::c_char; 0]>), len) } } } ``` Upside: - The flexible array member is directly expressed in a fairly conventional Rust type, with no need to interact with the bindgen-specific `__IncompleteArrayField` type - In particular, normal `size_of` will work, taking into account the dynamically sized extension Problems/TODO: - depends on unstable `ptr_metadata` and `layout_for_ptr` features - `from_ptr(_mut)` return references with unbounded lifetimes - it would be up to the caller to constrain them to the underlying storage's lifetime - I experimented with a variant which also takes a "lifetime witness" parameter `_lt: &'a LT` which can be used to bound the returned lifetime, but I'll need to try it in practice to see if it works. This is a prototype for #2771 Reviewed By: dtolnay Differential Revision: D54605921 fbshipit-source-id: 929ec448b758975ce736c21fde6b805e3d297bb7
Summary: [PR 2772](rust-lang/rust-bindgen#2772) adds the --flexarray-dst option to handle C Flexible Array Members as Rust dynamically sized types. Still pending review, but discussion has been positive. Full details: Add `--flexarray-dst` option to allow use of a dynamically sized struct to model a C flexible array member. For example, given the C struct: ``` struct msg { unsigned int tag; unsigned int len; char payload[]; }; ``` generate the corresponding Rust type: ```rust #[repr(C)] #[derive(Debug)] pub struct msg<FAM: ?Sized = [::std::os::raw::c_char; 0]> { pub tag: ::std::os::raw::c_uint, pub len: ::std::os::raw::c_uint, pub payload: FAM, } ``` This single definition is used for both Sized (with a zero-length array as the FAM type) and DST (with a slice). This also generates the helper methods to convert a raw pointer into a Rust view of the C type, with a given length. ```rust // Sized implementations impl msg<[::std::os::raw::c_char; 0]> { // Construct a DST reference from a Sized object. // SAFETY: there must be at least `len` initialized elements in the underlying storage pub unsafe fn flex_ref( &self, len: usize, ) -> &msg<[::std::os::raw::c_char]> { unsafe { Self::flex_ptr(self, len) } } // Same, but mutable pub unsafe fn flex_mut_ref( &mut self, len: usize, ) -> &mut msg<[::std::os::raw::c_char]> { unsafe { Self::flex_ptr_mut(self, len).assume_init() } } // Raw pointer variants pub unsafe fn flex_ptr<'unbounded>( ptr: *const Self, len: usize, ) -> &'unbounded msg<[::std::os::raw::c_char]> { unsafe { &*::std::ptr::from_raw_parts(ptr as *const (), len) } } pub unsafe fn flex_ptr_mut<'unbounded>( ptr: *mut Self, len: usize, ) -> ::std::mem::MaybeUninit<&'unbounded mut msg<[::std::os::raw::c_char]>> { unsafe { let mut uninit = ::std::mem::MaybeUninit::< &mut msg<[::std::os::raw::c_char]>, >::uninit(); (uninit.as_mut_ptr() as *mut *mut msg<[::std::os::raw::c_char]>) .write(::std::ptr::from_raw_parts_mut(ptr as *mut (), len)); uninit } } } // DST implementations impl msg<[::std::os::raw::c_char]> { // Memory size & alignment for allocation pub fn layout(len: usize) -> ::std::alloc::Layout { unsafe { let p: *const Self = ::std::ptr::from_raw_parts(::std::ptr::null(), len); ::std::alloc::Layout::for_value_raw(p) } } // return the Sized variant along with the length pub fn fixed(&self) -> (&msg<[::std::os::raw::c_char; 0]>, usize) { unsafe { let (ptr, len) = (self as *const Self).to_raw_parts(); (&*(ptr as *const msg<[::std::os::raw::c_char; 0]>), len) } } pub fn fixed_mut( &mut self, ) -> (&mut msg<[::std::os::raw::c_char; 0]>, usize) { unsafe { let (ptr, len) = (self as *mut Self).to_raw_parts(); (&mut *(ptr as *mut msg<[::std::os::raw::c_char; 0]>), len) } } } ``` Upside: - The flexible array member is directly expressed in a fairly conventional Rust type, with no need to interact with the bindgen-specific `__IncompleteArrayField` type - In particular, normal `size_of` will work, taking into account the dynamically sized extension Problems/TODO: - depends on unstable `ptr_metadata` and `layout_for_ptr` features - `from_ptr(_mut)` return references with unbounded lifetimes - it would be up to the caller to constrain them to the underlying storage's lifetime - I experimented with a variant which also takes a "lifetime witness" parameter `_lt: &'a LT` which can be used to bound the returned lifetime, but I'll need to try it in practice to see if it works. This is a prototype for #2771 Reviewed By: dtolnay Differential Revision: D54605921 fbshipit-source-id: 929ec448b758975ce736c21fde6b805e3d297bb7
Add
--flexarray-dst
option to allow use of a dynamically sized struct to model a C flexible array member. For example, given the C struct:generate the corresponding Rust type:
This single definition is used for both Sized (with a zero-length array as the FAM type) and DST (with a slice).
This also generates the helper methods to convert a raw pointer into a Rust view of the C type, with a given length.
Upside:
__IncompleteArrayField
typesize_of
will work, taking into account the dynamically sized extensionProblems/TODO:
ptr_metadata
andlayout_for_ptr
featuresfrom_ptr(_mut)
return references with unbounded lifetimes - it would be up to the caller to constrain them to the underlying storage's lifetime_lt: &'a LT
which can be used to bound the returned lifetime, but I'll need to try it in practice to see if it works.This is a prototype for #2771