Enumerating type safety guarantees in MemoryMarshal and friends #41418
Comments
@jkotas Those are both interesting points. There are lots of APIs which fall over if you create them from a poisoned source. But since Re: I'm also curious about |
reminds me |
I think it is in the same category as
I would not read into verifiability rules in ECMA-335 too much. The rules have not been updated with evolution of the runtime. I think it may be better to explain all the reasoning here from the first principles. |
ECMA-335 clearly states "Managed pointers cannot be null" (see §II.14.4.2). It does not take a very strict reading of the specification to see that they are, in theory, completely disallowed. This is likely one of the parts that can be changed when the team figures out how to rev the specification. Maybe this could be added to the ECMA-335 augments doc? |
What standard is being used to mark an API as unsafe? As discussed, (7) does not allow subverting the type system. It merely introduces state corruption, which any number of things can do (for example, struct tearing). If the definition of "unsafe" is expanded to "anything that seems rather dangerous but might be correct", then the set grows considerably and it's often unclear what should be included. Another example would be OS handle access. That can cause "bugs at a distance". |
Oh, definitely. That's why I didn't include FWIW, (7) absolutely does allow subverting the type system. Consider the following application.

    public readonly struct MyStruct
    {
        private readonly sbyte _myByte;

        public MyStruct(sbyte value)
        {
            if (value < 0) { throw new ArgumentOutOfRangeException(); }
            _myByte = value;
        }

        public override string ToString()
        {
            if (_myByte < 0) { Environment.FailFast("What just happened?"); }
            return _myByte.ToString();
        }
    }

    public void MyMethod()
    {
        MyStruct[] arr = ArrayPool<MyStruct>.Shared.Rent(1024);
        foreach (MyStruct value in arr)
        {
            Console.WriteLine(value); // this might Environment.FailFast
        }
    }

Structs of less than one machine word in size cannot be torn through standard multithreaded access. There's no way to get the _myByte field to be negative without having bypassed normal ctor validation. (This is true more generally: even for torn structs, it is never possible for any single field to have a "never legal" value without first having bypassed ctor validation to create a standalone instance.) |
I think it should be the same standard that was used to mark API as SecurityCritical in Silverlight. |
OK, I see what you mean. Essentially random bytes can be "blitted" over a struct. So you're saying, if a struct is no longer able to guard its state that counts as subverting the type system. |
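To make the "blitting" point concrete, here is a minimal sketch that reinterprets a raw byte as a validated struct without running its constructor. The NonNegative type and the BlitDemo helper are hypothetical stand-ins for the MyStruct example above and are not part of any library. The only real API used is MemoryMarshal.Cast.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for MyStruct: the ctor rejects negative values.
public readonly struct NonNegative
{
    private readonly sbyte _value;

    public NonNegative(sbyte value)
    {
        if (value < 0) { throw new ArgumentOutOfRangeException(nameof(value)); }
        _value = value;
    }

    // True when the ctor-enforced invariant still holds for this instance.
    public bool InvariantHolds => _value >= 0;
}

public static class BlitDemo
{
    // Reinterprets one raw byte as a NonNegative. No constructor runs,
    // so the resulting instance can carry a value the ctor would reject.
    public static NonNegative FromRawByte(byte rawByte)
    {
        byte[] raw = { rawByte };
        Span<NonNegative> view = MemoryMarshal.Cast<byte, NonNegative>(raw.AsSpan());
        return view[0];
    }
}
```

Calling BlitDemo.FromRawByte(0x80) in this sketch yields an instance whose backing field is negative, which is the situation the struct author tried to rule out.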
Would you put reflection setters of private members in that category as well? |
Yes, any private reflection is in this category. This issue is specific to MemoryMarshal and friends. The platform as a whole has a lot more unsafe or partially unsafe APIs. |
This could potentially be a bit too broad. I'm pretty sure |
Agree. I meant just the reasons related to type and memory safety. The other reasons like CAS (where |
Since we're talking about An unmanaged type is any value type which does not contain GC-tracked references. (See the C# unmanaged keyword and the API A full-range unmanaged type is any value type where any arbitrary backing bit pattern is legal (verifiably type-safe) for an instance of that type. The set of full-range unmanaged types is a subset of the set of unmanaged types. The difference is best demonstrated through examples.
This matters for two reasons. First, Second, the shared This is a bit nuanced, and I'm trying to figure out a good way to weave these concepts into the docs being created. Otherwise I think our guidance on how to use these APIs correctly will be lacking. (Yes, I know you can tear structs to violate ctor invariants. But that's a rabbit hole I don't want to go down here.) |
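As a rough illustration of the unmanaged versus full-range unmanaged distinction, the sketch below reads the same all-ones bytes as an int and as a DateTime. The FullRangeDemo class is hypothetical. The DateTime half relies on an assumption about the current implementation, namely that DateTime stores its ticks in the low 62 bits of a single 64-bit field, so treat it as illustrative rather than guaranteed.

```csharp
using System;
using System.Runtime.InteropServices;

public static class FullRangeDemo
{
    // int is full-range: every 4-byte pattern is an ordinary int value.
    public static int ReadInt32(ReadOnlySpan<byte> bytes)
        => MemoryMarshal.Read<int>(bytes);

    // DateTime is unmanaged (no GC references) but, assuming the layout
    // described above, not full-range: some bit patterns encode tick
    // counts that the public constructors would never produce.
    public static DateTime ReadDateTime(ReadOnlySpan<byte> bytes)
        => MemoryMarshal.Read<DateTime>(bytes);
}
```

With eight 0xFF bytes as input, ReadInt32 simply returns a valid int, while ReadDateTime returns an instance whose Ticks value exceeds DateTime.MaxValue.Ticks under the stated layout assumption.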
AFAIK, this is totally legal in the CLR. C# requires unsafe code to construct such a bool (and indeed C# will malfunction when encountering such bools with operator |
I realize that with Unsafe, MemoryMarshal, and friends entering wide use, we never formally stated what type safety guarantees (if any) the APIs offer. That is, we never provided a list of what APIs are "safe" and which are "unsafe equivalents" (related: #31354).
This issue is an attempt to enumerate the APIs on Unsafe, MemoryMarshal, and related types from the perspective of type safety / memory safety. Ultimately I think this needs to be tracked somewhere, but whether that's a .md file in this repo or an official doc page I don't really know.
I'm also soliciting feedback on this list. Please let me know if I got something wrong.
I'm not listing APIs which expose raw pointers through their public API surface. APIs which take or return pointers are always assumed unsafe.
System.Runtime.CompilerServices.Unsafe
Add<T>(ref T, int)
Add<T>(ref T, IntPtr)
AddByteOffset<T>(ref T, IntPtr)
AreSame<T>(ref T, ref T)
AsRef<T>(in T)
As<T>(object)
As<TFrom, TTo>(ref TFrom)
ByteOffset<T>(ref T, ref T)
CopyBlock(ref byte, ref byte, uint)
CopyBlockUnaligned(ref byte, ref byte, uint)
InitBlock(ref byte, byte, uint)
InitBlockUnaligned(ref byte, byte, uint)
IsAddressGreaterThan<T>(ref T, ref T)
IsAddressLessThan<T>(ref T, ref T)
IsNullRef<T>(ref T)
NullRef<T>()
ReadUnaligned<T>(ref byte)
SkipInit<T>(out T)
SizeOf<T>()
Subtract<T>(ref T, int)
Subtract<T>(ref T, IntPtr)
SubtractByteOffset<T>(ref T, IntPtr)
Unbox<T>(object)
WriteUnaligned<T>(ref byte, T)
System.Runtime.InteropServices.MemoryMarshal
AsBytes<T>(ReadOnlySpan<T>)
AsBytes<T>(Span<T>)
AsMemory<T>(ReadOnlyMemory<T>)
AsRef<T>(ReadOnlySpan<byte>)
AsRef<T>(Span<byte>)
Cast<TFrom, TTo>(ReadOnlySpan<TFrom>)
Cast<TFrom, TTo>(Span<TFrom>)
CreateFromPinnedArray<T>(T[], int, int)
CreateReadOnlySpan<T>(ref T, int)
CreateSpan<T>(ref T, int)
GetArrayDataReference<T>(T[])
GetReference<T>(ReadOnlySpan<T>)
GetReference<T>(Span<T>)
Read<T>(ReadOnlySpan<byte>)
ToEnumerable<T>(ReadOnlyMemory<T>)
TryGetArray<T>(ReadOnlyMemory<T>, ...)
TryGetMemoryManager<T>(ReadOnlyMemory<T>, ...)
TryGetString(ReadOnlyMemory<char>, ...)
TryRead<T>(ReadOnlySpan<byte>, out T)
TryWrite<T>(Span<byte>, ref T)
Write<T>(Span<byte>, ref T)
System.Runtime.InteropServices.SequenceMarshal
TryGetArray<T>(...)
TryGetReadOnlyMemory<T>(...)
TryGetReadOnlySequenceSegment<T>(...)
TryRead<T>(...)
System.Runtime.InteropServices.CollectionsMarshal
AsSpan<T>(List<T>)
System.GC
AllocateArray<T>(int, bool)
AllocateUninitializedArray<T>(int, bool)
GetPinnableReference pattern
Though GetPinnableReference methods are intended for compiler use within fixed blocks, they're designed to be type-safe when called by hand.
string.GetPinnableReference()
ReadOnlySpan<T>.GetPinnableReference()
Span<T>.GetPinnableReference()
Miscellaneous
ArrayPool<T>.Shared.Rent(int)
MemoryPool<T>.Shared.Rent(int)
Notes
In the below notes, I'm using the terms gcref and managed pointer interchangeably.
(1) Arithmetic operations on gcrefs (such as via Unsafe.Add) are not checked for correctness by the runtime. The resulting gcref may point to invalid memory or to a different object. See ECMA-335, Sec. III.1.5.
(2) It is legal and type-safe to perform comparisons against gcrefs. See ECMA-335, Sec. III.1.5 and Table III.4.
(3) Stripping the "readonly"-ness of a gcref is analogous to using C++'s const_cast operator. It could allow mutation of a value that the caller did not intend to make mutable.
(4) The runtime will not validate that casts performed by these APIs are correct. This is equivalent to C++'s reinterpret_cast operator. Improper casts could result in buffer overruns when accessing the backing value or in incorrect entry points being invoked when calling instance methods.
(5) While it is legal to calculate the absolute offset between two gcrefs, it is unverifiable to do so. See ECMA-335, Sec. III.1.5 and Table III.2.
(6) The runtime does not validate the buffer lengths provided to these APIs. Improper usage could result in buffer overruns.
(7) Use of this API could expose uninitialized memory to the caller. See ECMA-335, Sec. II.15.4.1.3 and Sec. III.1.8.1.1. If the uninitialized memory is projected as a non-primitive struct, the instance's backing fields could contain data which violates invariants that would normally be guaranteed by the instance's ctor.
(8) The sizeof CIL instruction is always safe. See ECMA-335, Sec. III.4.25.
(9) The unbox CIL instruction is intended to return a controlled-mutability managed pointer. However, Unsafe.Unbox returns a fully mutable gcref. This could allow mutation of a boxed readonly struct, which is illegal. See the Unsafe.Unbox docs for more information.
(10) Per ECMA-335, Sec. II.14.4.2, it is not strictly legal for a gcref to point to null. However, all .NET runtimes allow this and treat it in a type-safe fashion, including guarding accesses to null gcrefs by throwing NullReferenceException as appropriate.
(11) This method performs the equivalent of a C++-style reinterpret_cast. This bypasses normal constructor validation, potentially returning values with inconsistent internal state. Projecting unmanaged types as byte buffers may also expose or allow modification of private fields that the type author did not intend, an unsafe reflection equivalent.
(12) The runtime does not perform alignment checks. The caller is responsible for ensuring that any returned refs or spans are properly aligned. Most APIs that accept refs or spans as parameters assume that the references are properly aligned, and they may exhibit undefined behavior if this assumption is violated.
(13) This method handles unaligned data accesses correctly.
(14) This method is safe if TFrom and TTo are integral primitives of the same width. For example, TFrom = int with TTo = uint is safe. Integral primitives are: byte, sbyte, short, ushort, int, uint, long, ulong, nint, nuint, and enums backed by any of these. The caller is responsible for providing a correct TFrom and TTo; the runtime will not validate these type parameters.
(15) The runtime will not validate that the array is pre-pinned. Additionally, since Memory<T> instances are subject to struct tearing, any instances backed by pre-pinned arrays must be used with caution in multithreading scenarios, as calling Memory<T>.Pin on a torn instance backed by a pre-pinned array may result in an access violation.
(16) If called against a zero-length array or buffer, returns a gcref to where the value at index 0 would have been. It is legal to use such a gcref for comparison purposes (see, e.g., Unsafe.IsAddressLessThan), and the gcref will be properly GC-tracked. However, it is illegal to dereference such a gcref. See ECMA-335, Sec. III.1.1.5.2.
(17) Memory<T>'s implementation is currently backed by one of: T[], string, or MemoryManager<T>. However, since Memory<T> is an abstraction, new backing mechanisms may be introduced in the future. Callers must account for the runtime allowing all of TryGet{Array|MemoryManager|String} to return false; and callers must have a fallback code path in order to remain future-proof.
(18) This API may expose the larger buffer beyond the slice bounded by the Memory<T> instance. Callers should take care not to reference data beyond the slice provided to them.
(19) This API will never return a null reference. If called on an empty string, it will return a reference to the null terminator. The return value can always be safely dereferenced.
(20) This API will return a null reference if the underlying span contains no elements. Attempting to dereference it will result in a normal NullReferenceException being thrown. Note also that unlike pinning string instances, the buffer resulting from pinning a ReadOnlySpan<char> or Span<char> reference is not guaranteed to be null-terminated. Consumers must not attempt to read off the end of such buffers.
(21) Improper use of this API could corrupt the state of the associated object. However, it would not be considered a type safety or memory safety violation.
(22) The runtime will not validate that writes to the ref will satisfy covariant type safety constraints. For example, a local of type ref readonly object may actually point to a field typed as string. Removing the readonly constraint and treating it as a mutable ref object may allow assignment of a non-string to the backing string field.
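Two of the notes above lend themselves to a short sketch: note (14), reinterpreting between same-width integral primitives, and note (20), the null gcref returned for an empty span. The NotesDemo class and its method names are hypothetical and exist only for illustration. The calls to MemoryMarshal.Cast, Span<T>.GetPinnableReference, and Unsafe.IsNullRef are the real APIs, assuming .NET 5 or later.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class NotesDemo
{
    // Note (14): int and uint are integral primitives of the same width,
    // so viewing one as the other cannot produce an illegal value.
    public static uint FirstAsUInt32(int[] values)
    {
        Span<uint> reinterpreted = MemoryMarshal.Cast<int, uint>(values.AsSpan());
        return reinterpreted[0];
    }

    // Note (20): GetPinnableReference on an empty span returns a null
    // gcref. It can be tested with Unsafe.IsNullRef without dereferencing.
    public static bool EmptySpanYieldsNullRef()
    {
        Span<int> empty = Span<int>.Empty;
        return Unsafe.IsNullRef(ref empty.GetPinnableReference());
    }
}
```

In this sketch FirstAsUInt32(new[] { -1 }) returns uint.MaxValue, since the same 32 bits are reinterpreted and not converted.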