Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Trait for !Sized thin pointers #3536

Closed
wants to merge 2 commits into from

Conversation

jmillikin
Copy link

@jmillikin jmillikin commented Nov 29, 2023

```

This pattern is used frequently in zero-copy APIs that transmit structured data
between trust boundaries.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this supposed to be "across" the boundaries?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could also phrase it that way -- the meaning is equivalent. I've usually seen "between" used here, for example "between kernel and userland", or "between host and guest memory".

Copy link

@SOF3 SOF3 Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean "boundary" refers to the line between e.g. kernel and userland, so "between boundaries" would be unreasonable because you are crossing a line not transmitting between two lines. "between boundaries" gave me the impression of "within boundaries" when I first read it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've tried rewording this sentence to be less ambiguous.

@jmillikin jmillikin force-pushed the unsized-thin-pointers branch 2 times, most recently from 55c8ceb to adec0e3 Compare November 29, 2023 05:29
@jmillikin jmillikin force-pushed the unsized-thin-pointers branch from adec0e3 to 6fb7a28 Compare November 29, 2023 05:50
@programmerjake
Copy link
Member

I think we should add:

unsafe impl<T: Sized> DynSized for T { // : Sized just to be explicit
    fn size_of_val(&self) -> usize {
        core::mem::size_of::<T>()
    }
}

@ChayimFriedman2
Copy link

What should we do with std::mem::size_of_val_raw() (currently unstable, so we can change it)? We cannot use DynSized from its signature, since it takes a reference. And we cannot change it to take a raw pointer, since some types (flexible array members, or CStr) depends on accessing the contents to calculate the size. Should we have three type groups, "Sized", "?Sized but size can be calculated without accessing memory (i.e. can be calculated via a raw pointer)" and "?Sized and needs to access the memory to calculate the size"?

Another thing to consider is alignment. This RFC touches only the size aspect, but some DSTs need custom alignment too.

@jmillikin
Copy link
Author

What should we do with std::mem::size_of_val_raw() (currently unstable, so we can change it)? We cannot use DynSized from its signature, since it takes a reference. And we cannot change it to take a raw pointer, since some types (flexible array members, or CStr) depends on accessing the contents to calculate the size. Should we have three type groups, "Sized", "?Sized but size can be calculated without accessing memory (i.e. can be calculated via a raw pointer)" and "?Sized and needs to access the memory to calculate the size"?

I think it depends on what it's supposed to do. IMO the two good options are:

  1. Have it return Option<usize>, and specify the flavors of pointer for which it will/won't return a value.
    • If you want size_of_val_raw::<T: ?Sized>(NonNull::dangling().as_ptr()) to be allowed, then the only possible return value for !Sized thin pointers is None.
  2. Separate the thin and fat pointers into separate functions, and put a <T: Thin + Sized> bound on the thin one.
    • All the existing caveats around exotic fat pointers still apply for the fat-pointer version, but at least this RFC won't make them worse.

The problem with that function as currently specified is it assumes a pointer to a value contains enough information to determine the layout of that value, even if the pointer is dangling into nowhere. That's generally not something that most low-level languages guarantee because it's expensive to have every pointer to a dynamically-sized value haul around its own size.

Another thing to consider is alignment. This RFC touches only the size aspect, but some DSTs need custom alignment too.

!Sized thin pointers to types with static alignment can use existing mechanisms, such as #[repr(align(N))]. They're always the last field of a struct, so all the existing alignment rules should apply as-is.

DSTs with dynamic/unknown alignment are out of scope for this RFC, and are probably better suited for the extern types feature given how many other parts of the language they touch.

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 29, 2023
Copy link

@Skepfyr Skepfyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to name drop #3396 which is steadily turning from an extern types v2 RFC to a MetaSized trait RFC, so it addresses very similar issues to this proposal. I really like the idea behind this though - by forcing all types to implement DynSized you've dodged a lot of the thorniest questions.

Comment on lines +118 to +119
It is an error to `impl DynSized` for a type that is `Sized`. In other words,
the following code is invalid:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? In your header example having a data: [u8] as a trailing member forces that data to not have any struct padding in it (because it's undef), I guess you could use data: MaybeUninit<[u8]>, but wouldn't it be simpler to allow this on any struct? Which I think would allow:

#[repr(C)]
struct Header {
    variant: Variant,
}
#[repr(C)]
struct UnknownVariant {
  header: Header,
}
unsafe impl DynSized for UnknownVariant { .. }
#[repr(C)]
struct VariantA {
  header: Header,
  foo: u8,
}
#[repr(C)]
struct VariantB {
  header: Header,
  bar: f64,
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to allow code like that. The behavior of size_of_val() on an UnknownVariant would be ambiguous.

Restricting DynSized to be !Sized means that there's no question about which definition of "value size" is being requested -- it's the one determinable from the reference.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested to know how you'd expect people to write code like that then. Would you have to duplicate the fields of Header in all the variants?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect code something like this:

#[repr(C)]
struct Header {
    variant_id: u32,
}

#[repr(C)]
struct Request {
    header: Header,
    data: [u8],
}
unsafe impl DynSized for Request { ... }


impl Request {
    fn header(&self) -> &Header { &self.header }
    fn variant(&self) -> Variant<'_> { ... }
}

enum Variant<'a> {
    A(&'a VariantA),
    B(&'b VariantB),
    Unknown,
}

#[repr(C)]
struct VariantA {
  header: Header,
  foo: u8,
}

#[repr(C)]
struct VariantB {
  header: Header,
  bar: f64,
}

The Request struct is !Sized, each of the variant structs are Sized -- in this example you basically have a tagged union where you've reserved the right to have future variants contain [u8].

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I think that works, I suspect in most cases you'd want data: [MaybeUninit<u8>] to deal with padding, but I agree that works.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted to support something like this, maybe something like:

struct Thing {
    // header bits
   variant_id: u8,

   u: Variants
}

union Variants {
    f: f32,
    i: i32,
    t: AnotherThing
}

and treat it a bit like an enum with user defined layout, where the fat pointer metadata would encode which union field is active (with a "non selected" case for when you've yet to determine the variant). Implicit in that is that the dynamic size is header + size of selected variant field, rather than size of the overall union.

(Obviously today that would be treated as a normal struct with a union field, so you'd need some way to indicate this magic DST variant thing.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding, but I don't see how that would work. Consider the case of an IPv4 packet -- the data isn't any particular format, it's just [u8], and the packet header can be used to discover the data length.

Your solution would imply a structure like this:

#[repr(C)]
struct Packet {
  header: PacketHeader,
  u: Variants,
}

union Variants {
  len0: [u8; 0],
  len1: [u8; 1],
  len2: [u8; 2],
  // ...
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that was a digression that doesn't have much to do with this rfc.

`(ptr, usize)` due to being `!Sized`, then they can be reduced to `ptr`.

The `DynSized` trait does not _guarantee_ that a type will have thin pointers,
it merely enables it. This definition is intended to be compatible with RFC
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only (ptr, usize)? What about if you had a trait object at the end, so it was (ptr, vtable)? More generally I think there has to be a separate signal indicating whether the type should have thin pointers, what would happen if I implemented Pointee for a custom DST and just so happened to use usize as my Metadata type, it would be very surprising if that just got dropped because of this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only (ptr, usize)? What about if you had a trait object at the end, so it was (ptr, vtable)?

If the fat pointer is (ptr, vtable) then it's not eligible to be optimized to a thin pointer via the presence of DynSized.

More generally I think there has to be a separate signal indicating whether the type should have thin pointers, what would happen if I implemented Pointee for a custom DST and just so happened to use usize as my Metadata type, it would be very surprising if that just got dropped because of this.

The intention of the wording here is that the optimization is only valid if the type's references are thick pointers due to being !Sized. If they are thick pointers for other reasons, such as the type being a custom DST, then this optimization does not apply.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the optimization is only valid if the type's references are thick pointers due to being !Sized. If they are thick pointers for other reasons, such as the type being a custom DST, then this optimization does not apply.

I don't think this is well defined. It's not just an optimisation it will also affect things like FFI so it needs to be very clear when it is applied. What do you mean by "only valid if the type's references are thick pointers due to being !Sized" even [u8]'s usize is used for more than just size, it's also used for bounds checking.

Copy link
Author

@jmillikin jmillikin Dec 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is well defined. It's not just an optimisation it will also affect things like FFI so it needs to be very clear when it is applied.

I'm using "optimization" in the same sense as the niche optimization that lets Option<&T> be the same size as &T, which is also important for FFI.

The niche optimization is guaranteed for certain type combinations, but the exact rules are not fully specified and it's not guaranteed that any two types with human-discernable niches can be optimized.

the optimization is only valid if the type's references are thick pointers due to being !Sized. If they are thick pointers for other reasons, such as the type being a custom DST, then this optimization does not apply.

What do you mean by "only valid if the type's references are thick pointers due to being !Sized" even [u8]'s usize is used for more than just size, it's also used for bounds checking.

Yes, hence &[u8] cannot be DynSized.

Consider the following example program:

use core::mem;

struct Request {
  len_le: u32,
  _data: [u8],
}

impl Request {
    fn new<'a>(data: &'a [u8]) -> Option<&'a Request> {
        if data.len() < mem::size_of::<u32>() { return None; }
        let data_ptr = data.as_ptr();
        if (data_ptr as usize) % mem::align_of::<u32>() != 0 { return None; }
        let req: &'a Self = unsafe {
            mem::transmute(core::slice::from_raw_parts(data_ptr, 0))
        };
        if data.len() != req.len() { return None; }
        Some(req)
    }

    fn len(&self) -> usize {
        usize::try_from(u32::from_le(self.len_le)).unwrap_or(usize::MAX)
    }

    fn as_bytes(&self) -> &[u8] {
        let len = self.len();
        unsafe {
            let data_ptr = mem::transmute::<&Self, &[u8]>(self).as_ptr();
            mem::transmute(core::slice::from_raw_parts(data_ptr, len))
        }
    }
}

fn main() {
    #[repr(C, align(4))]
    struct Aligned<const N: usize>([u8; N]);
    let req_data = Aligned([9, 0, 0, 0, 1, 2, 3, 4, 5]);
    let request = Request::new(&req_data.0[..]).unwrap();
    
    println!("request.len(): {:?}", request.len());
    println!("request.as_bytes(): {:?}", request.as_bytes());
}

The Request type is !Sized because it contains [u8], and therefore references to it are a (ptr, usize) tuple. There's no need for that extra usize -- the value contains its own size, so there's no need for the references to also have a size. But there's no way to tell the compiler this, no hint or attribute or marker trait to say "this type's size is computed dynamically".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But there is a need for that extra usize, if I do request._data[4] the compiler usually uses that usize to do bounds checking on that 4.

Comment on lines +161 to +166
A `struct` with a field that implements `DynSized` will also implicitly
implement `DynSized`. The implicit implementation of `DynSized` computes the
size of the struct up until the `DynSized` field, and then adds the result of
calling `DynSized::size_of_val()` on the final field.
- This implies it's not permitted to manually `impl DynSize` for a type that
contains a field that implements `DynSize`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I think this is UB as written, consider Mutex<DynSizedType>, if it accessed then inner field without locking the Mutex then you get a data race. If you forced structs that contain a type that might contain a field with a manually implemented DynSized (consider generics) to implement DynSized, you would have to lock the Mutex for all invocations of size_of_val, which seems scary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutex<DynSizedType> doesn't seem like it should be permitted -- if the size of a value can only be determined by inspecting the value, then either it can't be in a mutex, or we have to say that size_of_val(&Mutex<T>) is only allowed for <T: Sized> or <T: ?Sized + FatPointerWithSize>`.

Maybe there's another trait that needs to exist to express the concept that the size of a value can be determined just from its pointer:

trait SizeOfValFromPtr {
    fn size_of_val_from_ptr(&self) -> usize;
}

impl<T: Sized> SizeOfValFromPtr<T> { ... }
impl<T: ?Sized + Pointee<Metadata = usize>> SizeOfValFromPtr<T> { ... }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I might be missing something, but how would a Mutex<DynSizedType> get created? Mutex::new() has a type constraint on Sized, and so do all the auxiliary construction traits like Default or From<T>.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Mutex and [T] you can use unsizing as in this playground, but this doesn't just apply to Mutex: any type that Wrapper<T> that holds its T inline but has safety constraints on accessing that T (e.g. Cell, RefCell) has this issue. Also your ``SizeOfValFromPtr is essentially MetaSized from #3396

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about it, the more I think Mutex<T: DynSized> and Box<T: DynSized> should be disallowed. Any type that depends on its generic parameters having a stable layout cannot be used with a value that stores its own size.

I need to look again at why ?DynSized was rejected before, and whether those reasons apply here, since I'm starting to lean in the direction of a new ? impl-by-default marker trait.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lang team has expressed an unwillingness to add more ? bounds as they're weird and hard to teach. It's possible that decision might be changed with a good enough argument but I'd be surprised.

Alternatively, the API contract for `DynSized` implementations could require
that the result of `size_of_val()` not change for the lifetime of the allocated
object. This would likely be true for nearly all interesting use cases, and
would let `DynSized` values be stored in a `Box`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing written here would prevent DynSized values being stored in Box (either via coercion or from_raw) so this does need to work or be banned somehow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After giving it some thought, Box<T: DynSized> should probably be disallowed. It would be convenient to be able to box a dynamically-sized value as-is, but requiring the size to be immutable for the lifetime of the object would prevent useful operations (for example editing an IPv4 packet in place to drop IP routing options).

Users that need to alloc+free a DynSized would need to use a buffer type that manages the actual total capacity and has a fixed Layout, and that type could go into a Box.

Copy link

@Skepfyr Skepfyr Dec 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what situation could you change the size of an object? You'd have to be very careful to stay inside your allocation. In that situation it feels like you have a separate length and size_of_val() should return the capacity (a bit like Vec, I guess).


The above incompatibility of a redefined `&CStr` exists regardless of this RFC,
but it's worth noting that implementing `DynSized` would be a backwards
incompatible change for existing DSTs.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is it backwards incompatibility specifically? I'm not sure we guarantee the layout of pointers to DSTs atm.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding the question, but the backwards-incompatibility is that the transmutability of DSTs is exposed to user code and depended on heavily by third-party libraries that use type punning for type-state stuff.

It's guaranteed that &[u8] can transmute to &[i8], and that for #[repr(transparent)] struct Foo([u8]) a &Foo has the same layout as &[u8]. If that code were to add impl DynSized for Foo then the layout of its references would change and transmute(&Foo) -> &[u8]) would no longer compile, even though implementing a trait is usually not a breaking change.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point to the documentation that says that a &T and a &Wrapper<T> have the same layout if Wrapper is transparent? I can't find it, but I'm pretty sure I've relied on it in the past so I agree with what you're saying. We might be able to get away with it as not actually a breaking change depending on whether it's documented and how commonly people depend on that. (I suspect more people want &CStr to be thin than depend on it being wide).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://doc.rust-lang.org/reference/type-layout.html#the-transparent-representation guarantees that T and a #[repr(transparent)] wrapper struct Wrapper<T> have the same layout and ABI. So [u8] and Wrapper<[u8]> must be equivalent.

https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamically-sized-types-dsts specifies that a slice reference is a pointer to the start of the slice plus an element count.

A pointer to T and a pointer to Wrapper<T> are equivalent, thus the slice pointers (*T, usize) and (*Wrapper, usize)` are equivalent (regardless of whether the pointer components are in that order).


A potential alternative is to have a special attribute that must be paired with impl DynSized:

#[repr(dyn_sized)] // enables *and requires* a `DynSized` impl
struct Foo([u8]);
unsafe impl DynSized for Foo { ... }

Changing the #[repr] of a type is a clear signal that something special is happening.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs so clearly say that you should not rely on the internal representation of CStr https://doc.rust-lang.org/std/ffi/struct.CStr.html so I'm not sure how much concern we should give to applications that violate this with something like the example behavior.

Though that blurb does seem to contradict itself with the "(the repr(transparent) notwithstanding)" clause - I don't know what that is intended to say if you aren't supposed to rely on the layout of the transparent thing.

@jmillikin
Copy link
Author

I'd like to name drop #3396 which is steadily turning from an extern types v2 RFC to a MetaSized trait RFC, so it addresses very similar issues to this proposal.

My main complaint with that RFC (and extern types in general) is that it conflates "dynamic size" and "dynamic alignment". A dynamically-sized value may have a static alignment, which is what allows it to be used in struct fields. A type with dynamic (or unknown) alignment is much less capable, being basically an opaque handle.

The "extern types" idea is also strongly focused on naming types defined elsewhere, but my use case is much simpler -- I want Rust's existing !Sized semantics without the fat-pointer overhead. It's an optimization, like allowing Option<&i32> to fit into a single register.

@jmillikin
Copy link
Author

jmillikin commented Dec 1, 2023

I'm going to have limited Git access for a while, but wanted to write down some quick thoughts based on how this might fit into the DynSized / extern type cinematic universe.


First, the fundamental semantics of !Sized thin pointers is very similar to a union with lots of array members:

struct Request {
    len: u32,
    data: [u8],
}
impl DynSized for Request { ... }

// IS EQUIVALENT TO

struct Request {
    len: u32,
    data: RequestData,
}
union RequestData {
    len_n0: [u8; 0],
    len_n1: [u8; 1],
    len_n2: [u8; 2],
    // ...
    len_max: [u8; u32::MAX],
}

This implies that the data values should be gated behind unsafe of some sort -- all members of the data array are potentially uninitialized and/or beyond the bounds of the Request allocated object. The only safe operation on RequestData is to get a *u8 by interpreting it as a 0-sized array -- the pointer is well-aligned but potentially dangling.

impl RequestData {
    fn as_ptr(&self) -> *const u8 {
        // SAFETY: 0 is always a valid array length
        (unsafe { self.len_n0 }).as_ptr()
    }
    fn as_mut_ptr(&mut self) -> *mut u8 {
        // SAFETY: 0 is always a valid array length
        (unsafe { self.len_n0 }).as_mut_ptr()
    }
}

If the RequestData type is made generic, then it starts to look like a lang item with special semantics.

The DynSized trait doesn't need to be the trigger for thin-pointer optimization, the presence of this magic marker type in the struct fields could do it (similar to today's DST-via-array-field):

struct Request {
    len: u32,
    // the presence of this type, which must be the final field, makes `Request` into
    // a `!Sized` type with thin pointers.
    data: core::marker::UnsizedArray<u8>,
}
impl core::marker::UnsizedArray<T> {
    fn as_ptr(&self) -> *const T {
        // SAFETY: 0 is always a valid array length
        (unsafe { self.len_n0 }).as_ptr()
    }
    fn as_mut_ptr(&mut self) -> *mut T {
        // SAFETY: 0 is always a valid array length
        (unsafe { self.len_n0 }).as_mut_ptr()
    }
}

It should not be possible to call size_of_val() on an UnsizedArray, because it doesn't have a size (not even a dynamic one). This implies core::mem::size_of_val() needs a new implicit trait bound -- implied by even <T: ?Sized>. Which also means Mutex<T: ?Sized>, Box<T: ?Sized>, etc can't directly store an UnsizedArray or any type containing one.

Let's call it ?FixedLayout for now, because it's implemented by all types that have values with a fixed allocated object layout -- in other words, their values' size and alignment can't change. Every type that currently exists in stable Rust implements FixedLayout (extern types don't, but they're unstable).

// core::marker ?
unsafe trait FixedLayout {
    fn align_of_val_raw(ptr: *const Self) -> usize;
    fn size_of_val_raw(ptr: *const Self) -> usize;
}

unsafe impl<T> FixedLayout for T { ... }     // Sized types have fixed layouts
unsafe impl<T> FixedLayout for [T] { ... }  // so do slices
unsafe impl FixedLayout for str { ... }  // and strings
unsafe impl FixedLayout for dyn * { ... }  // and trait vtables

// compiler magic:
// all types that are fat-pointer DSTs due to a DST field have fixed layout,
// for example `struct Foo([u8])` gets an automatic `FixedLayout` impl

// UnsizedArray does NOT impl FixedLayout

This leaves us to think about core::mem::align_of_val() and core::mem::size_of_val(), which have <T: ?Sized> bounds and therefore can't accept !Sized thin-pointer types unless their signature changes.

Option 1 is to give them a ?FixedLayout bound and do some more compiler magic to wire up a trait DynSized, but this seems like a good opportunity to draw a line in the sand about which type categories are guaranteed to be usable with those functions. After all, extern types are also !FixedLayout, and it'd be nice if we could make size_of_val(&some_extern_value) not compile.

So let's go with option 2, a trait for types that have a dynamic layout. No compiler magic, it's just library code for people who want to write a T: ?Sized + ?FixedLayout + DynLayout function deep in the guts of wasm-bindgen or whatever:

// trait bounds imply `T: FixedLayout`, therefore forbid UnsizedArray
// or any type containing one. Also forbids extern types.
fn align_of_val<T: ?Sized>(val: &T) -> usize;
fn size_of_val<T: ?Sized>(val: &T) -> usize;

// Current text of the RFC calls this `DynSized`.
//
// Types with dynamic size (or dynamic alignment, god help them) can impl this,
// but no special compiler magic happens.
pub trait DynLayout {
    fn align_of_val(&self) -> usize;
    fn size_of_val(&self) -> usize;
}

Okay it's about midnight so thanks for coming to my ted talk I guess. Hopefully some of this is coherent.

@tgross35
Copy link
Contributor

tgross35 commented Dec 2, 2023

Is this meant to somehow interact with MetaSized from extern types v2?

@jmillikin
Copy link
Author

Is this meant to somehow interact with MetaSized from extern types v2?

This is intended to be a subset of extern types and custom DST metadata. Both of those proposals handle !Sized thin pointers as a special case of a much larger change, which is impacting their timelines -- for example extern type needs to support types without a statically-known alignment, which is difficult.

My hope is that !Sized thin pointers is well-scoped enough that it could be implemented and stabilized on its own, without blocking long-term progress toward fully opaque extern types (for C/WASM) or custom pointer metadata (for the fancy dyn vtable stuff).

@jmillikin
Copy link
Author

Recording another idea to investigate if the above design doesn't pan out:

If it's important that the size of a value always be knowable just by having a reference to that value, then another option is to do just-in-time conversions from a thin&() to a fat &T. Placing the dynamic length query close to the reference construction should hopefully let the optimizer eliminate it in cases where it's unnecessary to the current function.

unsafe trait DynSized {
    unsafe fn size_of_val(data_address: core::ptr::NonNull<()>) -> usize;
}

struct ThinRef<'a, T: ?Sized> {
    data_address: &'a (),
    _t: core::marker::PhantomData<&'a T>,
}

The user would write code in terms of references, and the compiler would convert between &T and ThinRef<'a, T> at function call boundaries:

#[repr(C, align(8), thin_dst]
struct Request {
    len: u32,
    data: [u8],
}

unsafe impl DynSized for Request {
    unsafe fn size_of_val(data_address: core::ptr::NonNull<()>) -> usize {
        data_address.as_ptr().cast::<u32>().read() as usize
    }
}

fn intrinsics::dyn_sized_thin_ref_to_ref<'a, T: ?Sized + DynSized>(ThinRef<'a, T>) -> &'a T;
fn intrinsics::dyn_sized_ref_to_thin_ref<'a, T: ?Sized + DynSized>(&'a T) -> ThinRef<'a, T>;

// user writes this
fn handle_request(r: &Request) {
    if r.is_empty() { return; }
    do_something_with_request(r);
}

// converted to this
fn handle_request<'1>(r: ThinRef<'1, Request>) {
    let r: &'1 Request = intrinsics::dyn_sized_thin_ref_to_ref(r);
    if r.is_empty() { return; }
    do_something_with_request(intrinsics::dyn_sized_ref_to_thin_ref(r));
}

A somewhat rough approximation in Playground indicates that this approach produces the expected behavior, i.e. only one register is used for passing around a ThinRef<Request>, and functions that receive it only inspect the length when the dynamically-sized portion of a &Request is inspected.

@Skepfyr
Copy link

Skepfyr commented Dec 2, 2023

Plausibly purely for my own reference, in the terminology of the extern types v2 PR:
FixedLayout == metadata sized + metadata aligned (the same as the MetaSized trait from extern types v2)
DynLayout == dynamically sized (+ at least metadata aligned, this proposal is specifically avoiding adding anything less in the aligned category)
It's also proposing ?Sized + ?FixedLayout + ?DynLayout as the minimum trait bound meaning all types are at least dynamically sized + metadata aligned.

The extra trait has the effect of allowing people to write dynamically sized types, but the higher minimum bound means you can't write entirely opaque types, like extern types. This isn't quite enough to accurately represent CStr though (can you do align_of::<CStr>(), if you can what are its bounds?). Note that does mean that proposal hits this issue:

pub trait Trait {
    fn foo(self: Box<Self>) where Self: FixedLayout;
    fn bar(self: Arc<Self>) where Self: FixedLayout;
}

You have to have the where clause because it's no longer true that all types satisfy the bounds on Box or Arc (T: ?Sized + ?FixedLayout) which would be a breaking change without a separate mitigation.

@AaronKutch
Copy link

Linking this because I think it is closely related and shows a bunch of special cases that could be relevant rust-lang/unsafe-code-guidelines#256

@CAD97
Copy link

CAD97 commented Dec 17, 2023

There's a subtle but severe danger to requiring a read to determine size: shared mutability. I can use size_of_val on &Mutex<CStr>. I can do so while the mutex is locked and I'm holding &mut CStr. If size_of_val needs to read the value, now I have a read of my value aliasing my writes through &mut, i.e. this is unsound at a minimum, and likely quick UB.

So either every type that exposes shared mutability of a generic ?Sized type needs to forbid DynSized types, or shared mutability needs to surpress the "thin pointer optimization." And, oops, UnsafeCell isn't the only shared mutability primitive, *mut T and *const T are shared mutability points as well, so the very thing you're trying to make thin mustn't elide the pointer metadata lest it be used to encapsulate some shared mutability.

At least you thought to make it require unsafe impl

[That] a pointer to a value contains enough information to determine the layout of that value [is] generally not something that most low-level languages guarantee because it's expensive to have every pointer to a dynamically-sized value haul around its own size.

On a PDP11, yeah, passing around single pointers to sentinel terminated lists could often be cheaper than also passing around the length, if you rarely needed the length except to iterate through it. It's also how simple string manipulation can end up being accidentally O(n²). On any somewhat modern machine, registers and stack aren't free, but they're quite cheap, and even heap memory is fairly cheap when in cache. Additionally, while you're "spending" a word more per pointer, you're also "saving" a word per object, so if you're caring this much about memory usage optimization and do some sort of pointer compression scheme (i.e. a u32 indexed arena of pointers), you might even manage to save memory pressure compared to inline lengths.

(C++ doesn't not use fat pointers because of some performance reason, it has simple pointers only because of C compatibility. For a long time, the "correct" and idiomatic way to pass strings around in C++ was cosnt std::string& (i.e. &String), and it took a long time to get std::string_span (i.e. &str).

Rust also uses the length significantly more often than C or C++ would, because array indexing is checked. The optimizer does a pretty decent job at minimizing the cost of that specifically because the slice length is already available on the stack. If the length is on the stack, it's an additional logical memory access requirement for each indexing operation, and the optimizer has a much more interesting time proving that it's guaranteed already proven this check true, so using thin pointers can result in more index checking, more panicking arms, and worse code optimization.

Additionally, you don't even need language support or to lie with &Header references in order to implement thin pointers as a library item (disclaimer: my crate1). All you need to do is use Thin<Ptr<Dst>> where Thin stores ptr::NonNull<()> instead of Ptr<Dst>. Now it's a case of choosing your abstractions, just like every other pointer kind in Rust.

Footnotes

  1. It's due for a bit of a makeover, since I understand the problem space a lot better now than I did when I originally wrote the crate, and am just generally better at designing food Rust APIs, but it's been languishing hoping for feature(ptr_metadata) to become available, and I haven't been writing the kind of code that would use it. But I'm about to be again, so it's likely I'll be fixing up erasable and slice-dst to a more modern implementation standard.

@dead-claudia
Copy link

There's a subtle but severe danger to requiring a read to determine size: shared mutability. I can use size_of_val on &Mutex<CStr>. I can do so while the mutex is locked and I'm holding &mut CStr. If size_of_val needs to read the value, now I have a read of my value aliasing my writes through &mut, i.e. this is unsound at a minimum, and likely quick UB.

@CAD97 Could you elaborate on how this might be unsound?

Specifically, how would size_of_val work on &Mutex<T> for T: !Sized? Only reasonable way I can think of to implement that is size_of_val(&*mutex.lock()), which is safe as it just panics if there's an existing lock.

@tojocky
Copy link

tojocky commented Apr 1, 2024

Any progress on this item? This would greatly convince existing C/C++ low latency apps to consider Rust.

@dead-claudia
Copy link

dead-claudia commented Apr 2, 2024

I'd like to raise a bit of a concern here. It interacts way too much with the pointer metadata RFC, and should really be merged into it IMHO. rust-lang/rust#123353

This RFC proposal fails to encapsulate alignment (which is dynamic for dyn Trait and undefined for extern types), and there's some type safety concerns around sizing.

Edit: ignore all of that. Filed that issue very prematurely.

but it's worth noting that implementing `DynSized` would be a backwards
incompatible change for existing DSTs.

# Rationale and alternatives

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered just having everything automatically implement DynSized except extern types and structs ending with them directly or indirectly except through UnsafeCell<T> (thanks @CAD97)?

In this world:

  • Sized: std::mem::size_of::<T>() works and DynSized returns the result of that.
  • DynSized: std::mem::size_of_val(&t) works and delegates to <T as DynSized>::size_of_val(&t)
  • Extern types must manually implement DynSized if applicable.
  • UnsafeCell<T> doesn't proxy through extern type implementations as it depends on the outer type how it should be accessed. For instance, mutexes could return mutex.lock().unwrap_or_else(|e| e.into_inner()).size_of_val() and read-write locks rwlock.read().unwrap_or_else(|e| e.into_inner()).size_of_val().

@dead-claudia
Copy link

Coming back with a fresh mind:

DSTs with dynamic/unknown alignment are out of scope for this RFC, and are probably better suited for the extern types feature given how many other parts of the language they touch.

@jmillikin This feature is pretty useless without alignment info. You can't generically box a value without knowing its runtime alignment, for one.

However, alignment is just a field access for dyn Trait and it's statically calculable for everything else except extern types (where the alignment could be externally determined).

@dead-claudia
Copy link

@CAD97 (re: rust-lang/rust#123353) As mentioned before, mutexes can simply return mutex.lock().unwrap_or_else(|e| e.into_inner()).size_of_val(). I see no good reason why they can't "just" lock. UnsafeCell<T> is a good callout, though, and I've folded it into the above review accordingly.

@jmillikin
Copy link
Author

jmillikin commented Apr 2, 2024

I believe this is covered elsewhere, but in the current proposal DynSized is a trait for types that do not have a statically-known size. This implies:

  • A statically-sized type is not DynSized.
  • A dynamically-sized type still has a static alignment. It's not a total free-for-all like external types.
  • I do not believe there is any connection to custom pointer metadata.

Regarding the overall proposal, I've been trying to figure out how to get the benefits of unsized thin pointers without breaking MIRI or other semantics based on the assumption of a &T having a defined size for all types. At present I'm starting to come around to the idea of matching C's semantics, where size_of for unsized thin pointers would return a size of the statically-known portion of the type (excluding the DST part).

In any case the goal is to iterate toward a simpler design that solves the concrete problem of DST pointers being too big. I'm not interested in the enormous work implied by custom metadata or fully-opaque external types. Also, introducing unsafe locking semantics into Mutex is not a viable approach.

@dead-claudia
Copy link

I believe this is covered elsewhere, but in the current proposal DynSized is a trait for types that do not have a statically-known size. This implies:

  • A statically-sized type is not DynSized.

  • A dynamically-sized type still has a static alignment. It's not a total free-for-all like external types.

  • I do not believe there is any connection to custom pointer metadata.

Note the section I posted that comment in. 😉

It's intended to be a possible alternative model, where DynSized would instead represent the set of types that has a size known at runtime. In this alternative model, it generalizes Sized, where the size is known both statically and at runtime.

Regarding the overall proposal, I've been trying to figure out how to get the benefits of unsized thin pointers without breaking MIRI or other semantics based on the assumption of a &T having a defined size for all types.

Unfortunately, extern types will break that anyways. They're already in nightly (albeit in a somewhat broken state), so you can't really avoid it.

At present I'm starting to come around to the idea of matching C's semantics, where size_of for unsized thin pointers would return a size of the statically-known portion of the type (excluding the DST part).

This would be helpful, but should be its own separate function IMHO. Maybe something like std::mem::size_of_static::<T>(). Both are useful, just in different contexts.

In any case the goal is to iterate toward a simpler design that solves the concrete problem of DST pointers being too big. I'm not interested in the enormous work implied by custom metadata or fully-opaque external types.

See above regarding extern types.

Custom metadata can provide a way to compute size at runtime, but all you'd be doing is consuming it in a few specific places with my alternative. If you rely on external types, you can focus on only the cases where there isn't runtime size/alignment information, and so your proposal (and work) ends up a lot smaller.

You could even define that alternative trait and the not-auto-generated bits today:

pub unsafe trait DynSized {
    fn size_of_val(&self) -> usize;
}

// Your proposal wouldn't have this, but my alternative would
unsafe impl<T: Sized> DynSized for T {
    fn size_of_val(&self) -> usize {
        std::mem::size_of::<T>()
    }
}

// For `dyn Trait`
unsafe impl<T: std::ptr::Pointee<Metadata=std::ptr::DynMetadata<U>>, U: ?Sized> DynSized for T {
    fn size_of_val(&self) -> usize {
        std::ptr::metadata(self).size()
    }
}

unsafe impl<T> DynSized for [T] {
    fn size_of_val(&self) -> usize {
        self.len() * std::mem::size_of::<T>()
    }
}

unsafe impl DynSized for str {
    fn size_of_val(&self) -> usize {
        self.len()
    }
}

Also, introducing unsafe locking semantics into Mutex is not a viable approach.

The trait itself needs to be unsafe in both of our versions (it implies memory validity from ptr to ptr.add((*ptr).size_of_val())) regardless, but I don't see how else memory safety should be of concern here? Trying to figure out what you mean by "unsafe" here.

@jsgf
Copy link

jsgf commented Apr 2, 2024

So if I understand correctly, the motivation is that a fat pointer is unnecessary if the size is already encoded within the object, so if we define a way of extracting that encoded size then we need only have a thin pointer. I guess that's a fairly common FFI situation, but it still seems a bit narrow.

First thing that comes to mind is that it only applies when the size is encoded in a way that's reachable from &self. In a lot of cases the size might be elsewhere (eg a separate pointer/size list or similar) so this wouldn't work for those cases.

Secondly, DynSized is an unsafe trait. The RFC doesn't really go into what specific safety properties the implementer is required to uphold. For example, is it returning the actual allocation size of the object (capacity, for allocation) or the valid size of the object (length, for bounds checks). Are there limits of what the implementation of size_of_val is allowed to do? Certainly in the common case the expectation is that it's extracting a size from a field or scanning for a \0. But is it allowed to do anything? IO? Take locks? Start threads? Access static or thread-local state? Can it be fallible? Is it required to be a deterministic?

But I think the big problems are those that arise from having to access the representation in order to get the size: when it's valid to access the representation, and whether the returned size is constant? With a standard DST, when you form the fat pointer, you're effectively making a commitment to the size, and that size is stored independently from the representation of the object itself, so neither problem arises.

It seems to me that you could solve this in a bespoke way using the ptr_metadata API - you define your own thin pointer types for your internal use (eg if pointer density is important), and have a method that can use ptr::from_raw_parts to materialize a fat pointer when needed to interact with the rest of the Rust world. It's possibly a little more burdensome than having it magically happen with transparent invocations of the DynSized trait, but it doesn't seem overwhelming.

I just landed changes to bindgen to enable the use of DST to represent Flexible Array Members (rust-lang/rust-bindgen#2772) which generates an API along similar lines for the types it generates (though mostly to convert between the "fixed" and "dynamic" variants of each structure).

@jmillikin
Copy link
Author

So if I understand correctly, the motivation is that a fat pointer is unnecessary if the size is already encoded within the object, so if we define a way of extracting that encoded size then we need only have a thin pointer. I guess that's a fairly common FFI situation, but it still seems a bit narrow.

That's a correct summary, but it's not about FFI. For FFI that wants to pass around handles to an externally-allocated resource, *mut () works fine.

Doubling the size of pointers means that function parameters spill to the stack much more frequently, so using thin pointers where possible can provide a significant performance uplift. I've measured something like 10-20% improvement in some of my own codebases from using wrapper types to emulate thin pointers. There's a lot of performance being left on the table compared to C/C++ for zero-copy use cases like packet processing.

First thing that comes to mind is that it only applies when the size is encoded in a way that's reachable from &self. In a lot of cases the size might be elsewhere (eg a separate pointer/size list or similar) so this wouldn't work for those cases.

Yep, that's true. If a value's size can't be determined from the value itself, then unsized thin pointers aren't practical.

Secondly, DynSized is an unsafe trait. The RFC doesn't really go into what specific safety properties the implementer is required to uphold [...]

I think it's a bit early in the design process to discuss such minutae, when there isn't even consensus on unsized thin pointers being desirable at all.

It seems to me that you could solve this in a bespoke way using the ptr_metadata API - you define your own thin pointer types for your internal use (eg if pointer density is important), and have a method that can use ptr::from_raw_parts to materialize a fat pointer when needed to interact with the rest of the Rust world. It's possibly a little more burdensome than having it magically happen with transparent invocations of the DynSized trait, but it doesn't seem overwhelming.

That's already possible in today's Rust, using struct Thin<'a, T> { inner: &'a () } and similar idioms. The problem is that such wrappers pollute the public API, so instead of struct SomeStructure you also end up with SomeStructurePtr, SomeStructureRefMut, etc. It's quite unpleasant to work with that kind of API, and definitely can't be published for general use.

I just landed changes to bindgen to enable the use of DST to represent Flexible Array Members (rust-lang/rust-bindgen#2772) which generates an API along similar lines for the types it generates (though mostly to convert between the "fixed" and "dynamic" variants of each structure).

Pulling in bindgen and all the rest of its FFI semantics just to get unsized thin pointers doesn't seem great, and also it doesn't work in normal Rust due to the use of unstable nightly features.

@programmerjake
Copy link
Member

This feature is pretty useless without alignment info. You can't generically box a value without knowing its runtime alignment, for one.

actually, I have a proposal that would allow Box<MyTypeWithUnknownAlign, MyTypeDropper>: #3470 (comment) with sample usage #3470 (comment)

@dead-claudia
Copy link

dead-claudia commented Apr 3, 2024

This feature is pretty useless without alignment info. You can't generically box a value without knowing its runtime alignment, for one.

actually, I have a proposal that would allow Box<MyTypeWithUnknownAlign, MyTypeDropper>: #3470 (comment) with sample usage #3470 (comment)

@programmerjake Could you explain how specifically a type would end up having an unknown alignment in that? I can't figure out how one would construct such a type, even for the sake of that proposed ABI.

@programmerjake
Copy link
Member

programmerjake commented Apr 3, 2024

@programmerjake Could you explain how specifically a type would ebd up having an unknown alignment in that? I can't figure out how one would construct such a type, even for the sake of that proposed ABI.

it would be constructed from FFI, basically transmuting a pointer to a C struct with no body to Box<MyType, MyTypeDropper>:

extern "C" {
    pub type MyType;
    fn make_my_type() -> Option<Pin<Box<MyType, MyTypeDropper>>>;
    fn destroy_my_type(p: Option<Pin<Box<MyType, MyTypeDropper>>>);
}

impl MyType {
    pub fn new() -> Pin<Box<MyType, MyTypeDropper>> {
        unsafe { make_my_type() }.unwrap_or_else(|| panic!("out of memory"))
    }
}

pub struct MyTypeDropper;

impl BoxDrop<MyType> for MyTypeDropper {
    fn box_drop(v: Pin<Box<MyType, MyTypeDropper>>) {
        unsafe { destroy_my_type(Some(v)); }
    }
}
// in C header
struct MyType;
struct MyType *make_my_type(void);
void destroy_my_type(struct MyType *);

// in C source
struct MyType {
    // some example fields
    int field;
    char *field2;
    int *field3;
};

struct MyType *make_my_type(void) {
    struct MyType *retval;
    retval = (struct MyType *)calloc(sizeof(MyType));
    if(!retval)
        return NULL;
    retval->field = 3;
    retval->field2 = strdup("a string");
    if(!retval->field2)
        goto fail;
    retval->field3 = (int *)malloc(sizeof(int));
    if(!retval->field3)
        goto fail;
    *retval->field3 = 42;
    return retval;
fail:
    free(retval->field2);
    free(retval->field3);
    free(retval);
    return NULL;
}

void destroy_my_type(struct MyType *p) {
    if(!p)
        return;
    free(p->field2);
    free(p->field3);
    free(p);
}

@jmillikin
Copy link
Author

jmillikin commented Apr 3, 2024

For most FFI use cases, you don't need unsized thin pointers. You're not trying to declare the type in Rust or interact directly with dynamically-sized fields. All you need is a wrapper.

mod c {
  #[repr(transparent)]
  pub(super) struct MyType(core::ffi::c_void);

  extern "C" {
    pub(super) fn make_my_type() -> *mut MyType;
    pub(super) fn destroy_my_type(p: *mut MyType);
  }
}

pub struct MyType {
  raw: *mut c::MyType,
}

impl Drop for MyType {
  fn drop(&mut self) {
    unsafe { c::destroy_my_type(self.raw) }
  }
}

impl MyType {
  pub fn new() -> MyType {
    let raw = unsafe { c::make_my_type() };
    if raw.is_null() {
      panic!("out of memory");
    }
    MyType { raw }
  }
}

To reiterate, the goal of this RFC is to allow Rust code to define types that are dynamically-sized but do not have the overhead of fat pointers. Types defined wholly outside of Rust (such as opaque C structures allocated/deallocated by external code) or types without a static alignment are out of scope.

@jsgf
Copy link

jsgf commented Apr 3, 2024

@jmillikin

That's already possible in today's Rust, using struct Thin<'a, T> { inner: &'a () } and similar idioms. The problem is that such wrappers pollute the public API, so instead of struct SomeStructure you also end up with SomeStructurePtr, SomeStructureRefMut, etc. It's quite unpleasant to work with that kind of API, and definitely can't be published for general use.

I think I'd have to see an example of what you have in mind. I've found it fairly straightforward to use type parameters to encode fixed vs dynamic size states, but I'm not sure that relates to what you're saying here.

Pulling in bindgen and all the rest of its FFI semantics just to get unsized thin pointers doesn't seem great,

Oh, I wasn't suggesting that bindgen was necessary, just pointing to it as an example implementation of using ptr_metadata to manipulate thin pointers, and fatten them on demand. You can have:

struct Packet<PAYLOAD: ?Sized = [u8; 0]> {
    size: u16,
//...
    payload: PAYLOAD
}

and pass that around as a thin pointer/ref, and then when you want to actually use the payload do:

let fatpacket: &Packet<[u8]> = unsafe { &*ptr::from_raw_parts(packet as *const (), packet.size as usize) };

to get the DST variant which exposes the payload using the embedded size.

and also it doesn't work in normal Rust due to the use of unstable nightly features.

Well I'd hope ptr_metadata gets stabilized before whatever arises from this RFC does...

@jmillikin jmillikin closed this Jun 9, 2024
@jmillikin jmillikin deleted the unsized-thin-pointers branch June 9, 2024 07:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.