[API Proposal]: Zero-overhead member access with suppressed visibility checks #81741
Tagging subscribers to this area: @dotnet/area-system-runtime-compilerservices
Issue Details
Background and motivation
A number of existing .NET serializers depend on skipping member visibility checks for data serialization. Examples include System.Text.Json and EF Core. To skip the visibility checks, the serializers typically use dynamically emitted code (Reflection.Emit or Linq.Expressions), with classic reflection APIs as a slow fallback. Neither of these two options is great for source-generated serializers and native AOT compilation. This API proposal introduces a first-class zero-overhead mechanism for skipping visibility checks.
API Proposal
namespace System.Runtime.CompilerServices;
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
public class UnsafeAccessorAttribute : Attribute
{
public UnsafeAccessorAttribute(string name, UnsafeAccessorKind kind)
{
Name = name;
Kind = kind;
}
// Name of the target member and the kind of access requested.
public string Name { get; }
public UnsafeAccessorKind Kind { get; }
}
public enum UnsafeAccessorKind
{
Constructor = 1,
InstanceMethod = 2,
StaticMethod = 3,
FieldGetter = 4,
FieldSetter = 5
};
This attribute will be applied on an extern static method.
API Usage
class UserData {
private UserData() { }
public string Name { get; set; }
}
[UnsafeAccessor(".ctor", UnsafeAccessorKind.Constructor)]
extern static UserData CallPrivateConstructor();
// This API allows accessing backing fields for auto-implemented properties with unspeakable names
[UnsafeAccessor("<Name>k__BackingField", UnsafeAccessorKind.FieldSetter)]
extern static void SetName(UserData ud, string v);
UserData ud = CallPrivateConstructor();
SetName(ud, "Joe");
Alternative Designs
Risks
|
Is there an advantage in providing
public enum UnsafeAccessorKind
{
Constructor = 1,
InstanceMethod = 2,
StaticMethod = 3,
Field = 4,
};
[UnsafeAccessor("<Name>k__BackingField", UnsafeAccessorKind.Field)]
extern static ref int GetName(ref StructType @this);
I'm also curious: does the return type need to match the field type? If not, how would this work for fixed buffers or fields of non-public types? |
Thanks. Applied your suggestion.
The proposal does not work for types that you do not have access to. It is called out in the risks section. |
Do fixed buffers also fall under that, since the type is unspeakable and "hidden" by the compiler? Since fixed buffer fields are annotated with That would extend the proposal to:
public enum UnsafeAccessorKind
{
Constructor = 1,
InstanceMethod = 2,
StaticMethod = 3,
Field = 4,
FixedBufferField = 5,
};
public struct Example
{
private unsafe fixed int _buffer[5];
}
[UnsafeAccessor("_buffer", UnsafeAccessorKind.FixedBufferField)]
static extern Span<int> GetBuffer(ref Example @this);
Alternatively, allowing the element type to be used as the accessor's return type could work. Callers would then need to query the length with reflection and use |
What if my type has a private field, and the type of that field is also private (e.g. |
I was thinking about how non-public and unspeakable types could be handled here, and the idea of using It can be passed through a pointer though. So if
// ExampleAssembly
internal struct CustomString
{
public string ToSerializedString() => throw null;
}
public class UserData
{
private UserData() { }
internal CustomString Name { get; set; }
}
// Generated
[UnsafeAccessor("<Name>k__BackingField", UnsafeAccessorKind.Field)]
extern static unsafe void GetName(UserData @this, TypedReference* outReference);
[UnsafeAccessor("ToSerializedString", UnsafeAccessorKind.InstanceMethod, DeclaringTypeName = "CustomString, ExampleAssembly")]
extern static string ToSerializedString(TypedReference @this);
TypedReference field;
UserData ud = CallPrivateConstructor();
GetName(ud, &field);
string name = ToSerializedString(field);
Something about this feels really nasty though, I'm not sure that I like it. Even if it's only intended to be used by generated code, it seems like "too much". |
Wouldn't this be fragile if used without a source generator? What if we have a class defined like this:
// v1
class UserData {
private UserData() { }
public string Name { get; set; }
}
[UnsafeAccessor("<Name>k__BackingField", UnsafeAccessorKind.Field)]
extern static ref string GetName(UserData @this);
and then change it to
// v2
class UserData {
private string name;
private UserData() { }
public string Name { get => this.name; set => this.name = value; }
}
[UnsafeAccessor("<Name>k__BackingField", UnsafeAccessorKind.Field)]
extern static ref string GetName(UserData @this);
Now it's easy to forget to update |
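The failure mode raised in this comment can be sketched as follows. This is a minimal illustration, assuming a .NET 8 or later runtime and the attribute shape used later in this thread (kind as the constructor argument, Name as a named property). StaleAccessors and StaleAccessorThrows are hypothetical names, and the constructor is left public only to keep the sketch short. Per the proposal text, an accessor whose target field no longer exists is expected to fail with MissingFieldException when it is called, not at compile time.

```csharp
using System;
using System.Runtime.CompilerServices;

// "v2" shape: the auto-property was replaced by an explicit field,
// so the compiler no longer emits <Name>k__BackingField.
public class UserData
{
    private string name = "";
    public string Name { get => name; set => name = value; }
}

public static class StaleAccessors
{
    // Still targets the old compiler-generated field name (hypothetical stale accessor).
    [UnsafeAccessor(UnsafeAccessorKind.Field, Name = "<Name>k__BackingField")]
    public static extern ref string NameField(UserData instance);
}

public static class Program
{
    // Returns true when the stale accessor fails with MissingFieldException.
    public static bool StaleAccessorThrows()
    {
        try
        {
            StaleAccessors.NameField(new UserData()) = "Joe";
            return false;
        }
        catch (MissingFieldException)
        {
            return true;
        }
    }

    public static void Main() => Console.WriteLine(StaleAccessorThrows());
}
```

The code compiles cleanly in both versions of the class, which is the point of the concern: nothing flags the stale name until the accessor runs.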
I initially missed the namespace where this would live, so I am probably less worried about the wrong people misusing it. Anyway, what about multiple constructors and constructors with parameters?
V1. Custom parameter
class UserData {
private UserData(string name) { }
public string Name { get; set; }
}
[UnsafeAccessor(".ctor", UnsafeAccessorKind.Constructor)]
extern static UserData CallPrivateConstructor(string param);
Is this possible?
V2. Multiple constructors
class UserData {
private UserData() { }
private UserData(string name) { }
public string Name { get; set; }
}
[UnsafeAccessor(".ctor", UnsafeAccessorKind.Constructor)]
extern static UserData CallPrivateConstructor(string param);
How to disambiguate between constructors? |
Not sure how feasible or desirable this is, but another alternative would be to extend the "unsafer unsafe" proposal, as I proposed here: dotnet/csharplang#6476 (comment), namely to allow |
/cc @vargaz @SamMonoRT FYI |
Yes, fixed buffers fall under that. I do not think it makes sense to create a special solution for fixed buffers. If we do anything here, it should be a general solution for all inaccessible types.
Yes, there are ways to naturally extend this design to support inaccessible types. It is not necessary for the first iteration. The serializer scenarios that are motivating this proposal do not need it. TypedReference is one of the tools that can be part of the solution. The design you have proposed would only work for fields; it would not work for methods that may have multiple types in their signatures.
Yes, this has the exact same problems as private reflection has today.
The accessor method has to have a signature that matches the target method. It means that accessor method with
This is not feasible. It is a variant of "Allow suppressing visibility checks in Roslyn" and "Expand unsafe accessors in Roslyn" mentioned in the alternative designs. |
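The signature-matching rule in this reply is what answers the constructor questions raised earlier. A minimal sketch, assuming a .NET 8 or later runtime and the attribute shape used later in this thread; UserDataFactory and the "(unset)" default are hypothetical and only there to make the two overloads distinguishable:

```csharp
using System;
using System.Runtime.CompilerServices;

public class UserData
{
    public string Name { get; set; } = "(unset)";

    private UserData() { }
    private UserData(string name) { Name = name; }
}

public static class UserDataFactory
{
    // No parameters: matches the private parameterless constructor.
    [UnsafeAccessor(UnsafeAccessorKind.Constructor)]
    public static extern UserData Create();

    // One string parameter: matches the private UserData(string) constructor.
    [UnsafeAccessor(UnsafeAccessorKind.Constructor)]
    public static extern UserData Create(string name);
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(UserDataFactory.Create().Name);
        Console.WriteLine(UserDataFactory.Create("Joe").Name);
    }
}
```

Disambiguation needs no extra attribute data here: the accessor's own parameter list selects the overload, and its return type names the type being constructed.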
It would be nice (for the runtime - but would require probably a lot of changes in Roslyn) if instead of |
It is "Expand unsafe accessors in Roslyn" in the alternative designs. In the general case, it would require Roslyn to create the token out of thin air without actually seeing the member in the reference assembly. That sounded problematic. |
@jkotas Would these special accessor functions be defined in the assembly containing the type for serialization? I assume so, but I want to make sure I understand the workflow here. |
Not necessarily. For example, EF Core has situations where the type is defined in one assembly and the source generated serializer lives in a different assembly. |
What would the usage for static methods look like? How would the type be specified? Would it require another parameter for the attribute? Should Alternatively, whether the target is static could be inferred from the annotated method signature. |
Good point. Updated the proposal.
I have updated the proposal to say that the first argument has to identify the owning type for both instance and static fields. It will work better with eventual extension for skipping type visibility. |
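A minimal sketch of how the first-argument rule reads for methods, assuming a .NET 8 or later runtime and the kind names that shipped there (UnsafeAccessorKind.Method and UnsafeAccessorKind.StaticMethod). Greeter and GreeterAccessors are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Greeter
{
    private static string Greet(string name) => "Hello, " + name;
    private int Twice(int x) => x * 2;
}

public static class GreeterAccessors
{
    // Static target: the type of the first parameter identifies the owning type;
    // the value passed for it is not used.
    [UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "Greet")]
    public static extern string Greet(Greeter owner, string name);

    // Instance target: the first argument is the receiver.
    [UnsafeAccessor(UnsafeAccessorKind.Method, Name = "Twice")]
    public static extern int Twice(Greeter instance, int x);
}

public static class Program
{
    public static void Main()
    {
        // null is acceptable for the static accessor because the value is ignored.
        Console.WriteLine(GreeterAccessors.Greet(null!, "Joe"));
        Console.WriteLine(GreeterAccessors.Twice(new Greeter(), 21));
    }
}
```

The remaining parameters and the return type have to match the target's signature, which is how the runtime picks the right member.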
Why is this one not
[UnsafeAccessor(".ctor", UnsafeAccessorKind.Constructor)]
extern UserData CallPrivateConstructor() |
Alright. If I am reading this correctly this looks like a convenient "macro" for pattern matching, in the JIT for optimizations and in the trimmer for precise trimming, because it is going to be identical to simply using reflection. Is that fair? |
Oversight - fixed.
Yep. |
This will make the static calls look awkward as they would need to pass in |
This is not meant to look pretty. A lot of this looks awkward. For example, setting the field by calling a method that returns a byref for the field looks awkward too.
I think that the most straightforward design for inaccessible types would be to use
If the owning type was to be specified by some other means, we would need to duplicate the UnsafeAccessorType descriptor there. |
I think this design won't work for accessing members of static classes. For example, if I wanted to access the field
[UnsafeAccessor(UnsafeAccessorKind.Field)]
extern static ref TextWriter s_out(Console @this);
But this won't compile:
Though I guess the |
@jkotas, is there any other feedback needed here or is this available to be marked "api-ready-for-review"? It sounds like there is no language work intended to be done here and so this won't require any cross-team coordination, is that right? |
Would it make sense to allow defining a
class UserData
{
private UserData() {}
public string Name { get; set; }
}
[UnsafeAccessor(UnsafeAccessorKind.Constructor)]
extern static void CallPrivateConstructor(UserData @this);
UserData ud = (UserData)RuntimeHelpers.GetUninitializedObject(typeof(UserData));
CallPrivateConstructor(ud); |
If It could be done like this:
[UnsafeAccessor(UnsafeAccessorKind.NonVirtualMethod, Name = ".ctor")]
extern static void CallPrivateConstructor(UserData @this); |
@AaronRobinsonMSFT We may want to add a test for this case to #86932. |
It does. |
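The "constructor as a method" idea discussed above can be sketched like this. It is an illustration under assumptions: a .NET 8 or later runtime, and UnsafeAccessorKind.Method (the kind name that shipped) standing in for the NonVirtualMethod name suggested in the comment. UserDataAccessors and RunConstructor are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

public class UserData
{
    public string Name { get; set; }

    private UserData() { Name = "from ctor"; }
}

public static class UserDataAccessors
{
    // Treats the constructor as an ordinary instance method named ".ctor"
    // and runs it on an object that already exists.
    [UnsafeAccessor(UnsafeAccessorKind.Method, Name = ".ctor")]
    public static extern void RunConstructor(UserData instance);
}

public static class Program
{
    public static void Main()
    {
        // Allocate without running any constructor, then run the private one explicitly.
        var ud = (UserData)RuntimeHelpers.GetUninitializedObject(typeof(UserData));
        Console.WriteLine(ud.Name is null);
        UserDataAccessors.RunConstructor(ud);
        Console.WriteLine(ud.Name);
    }
}
```

This separates allocation from initialization, which is the scenario the question was about; for ordinary construction the Constructor kind is the simpler choice.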
Yes, ref string keeps ud alive. It does not keep it pinned. (It would be pinned in Mono due to conservative stack scanning.) |
@jkotas I am going to set this to .NET 8 so we don't forget to move this to .NET 9 when the tag is available. |
Question about the use case for this: the unsafe accessors are supposed to be generated by source generators right? How will source generators discover private members that need unsafe accessors, if the private members don't show up in ref assemblies? |
This isn't about discovery. Source generators (Roslyn and otherwise) often generate source that needs to be private for reason X or Y. In this case those source generators can continue to do what they were designed to do and create private APIs. Discovery isn't the scenario here and shouldn't be performed by the source generator for the ref scenario. This is about providing a mechanism that is akin to the private reflection scenario but with lower overhead. Whether the API is present or not is not the domain of this API, only that it fails in the same way as if private reflection were used. To that end, no verification should be performed in any analyzer to validate that the target is there if that analyzer consumes ref assemblies. |
Thanks, that makes sense! What I was missing is that this isn't designed to be a general solution for all reflection-based serializers, but only for those which don't have to do discovery at runtime in the first place. And from our conversation it sounds like there are cases for example in EF.Core where the schema is known at compilation time, so discovery isn't a problem. |
Source generators that wish to discover private members in referenced assemblies need to instruct their users to disable use of reference assemblies for compilation, e.g. by setting |
The core of the change is that `UnsafeAccessor` creates a code dependency from the accessor method to the target specified by the attribute. The trimmer needs to follow this dependency and preserve the target. Additionally, because the trimmer operates at the IL level, it needs to make sure that the target will keep its name and some other properties intact (so that the runtime implementation of the `UnsafeAccessor` can still work).
Implementation choices:
* The trimmer will mark the target as "accessed via reflection"; this is a simple way to make sure that the name and other properties of the target are preserved. This could be optimized in the future, but the savings are probably not that interesting.
* The implementation ran into a problem when trying to precisely match the signature overload resolution. Due to Cecil issues and the fact that Cecil's resolution algorithm is not extensible, it was not possible to match the runtime's behavior without adding a lot more complexity (currently it seems we would have to reimplement method resolution in the trimmer). So, to simplify the implementation, the trimmer will mark all methods of a given name. This means it will mark more than necessary. This is fixable by adding more complexity to the code base if we think there's a good reason for it.
* Due to the above choices, there are some behavioral differences:
  * The trimmer will always warn if the target has data flow annotations. There's no way to "fix" this in the code without a suppression.
  * The trimmer will produce different warning codes even if there is a true data flow mismatch. This is because it treats the access as "reflection access", which produces different warning codes from direct access.
  * These differences are fixable, but it was not deemed necessary right now.
* We decided that the analyzer will not react to the attribute at all, and thus will not produce any diagnostics around it.
The guiding reason to keep the implementation simple is that we don't expect the unsafe accessor to be used by developers directly; instead we assume that the vast majority of its usages will be from source generators. So, developer UX is not as important.
Test changes:
* Adds directed tests for the marking behavior
* Adds tests to verify that `Requires*` attributes behave correctly
* Adds tests to verify that data flow annotations behave as expected (described above)
* The tests are effectively a second validation of the NativeAOT implementation as they cover NativeAOT as well.
Fixes in CoreCLR/NativeAOT:
This change fixes one bug in the CoreCLR/NativeAOT implementation: an unsafe accessor on an instance method of a value type must use a "by-ref" parameter for the `this` parameter. Without the "by-ref", the accessor is considered invalid and will throw.
This change also adds some tests to the CoreCLR/NativeAOT test suite.
Part of #86161. Related to #86438. Feature design in #81741.
Is this still being worked on for .NET 8, or can we move it out to .NET 9/future? |
This work was completed in .NET 8. |
Generic support across all runtimes is being tracked in #89439. |
Background and motivation
A number of existing .NET serializers depend on skipping member visibility checks for data serialization. Examples include System.Text.Json and EF Core. To skip the visibility checks, the serializers typically use dynamically emitted code (Reflection.Emit or Linq.Expressions), with classic reflection APIs as a slow fallback. Neither of these two options is great for source-generated serializers and native AOT compilation. This API proposal introduces a first-class zero-overhead mechanism for skipping visibility checks.
API Proposal
This attribute will be applied on an extern static method. The implementation of the extern static method annotated with this attribute will be provided by the runtime based on the information in the attribute and the signature of the method that the attribute is applied to. The runtime will try to find the matching method or field and forward the call to it. If the matching method or field is not found, the body of the extern method will throw MissingFieldException or MissingMethodException.
For UnsafeAccessorKind.{Static}Method and UnsafeAccessorKind.{Static}Field, the type of the first argument of the annotated extern method identifies the owning type. The value of the first argument is treated as the @this pointer for instance fields and methods. The first argument must be passed as ref for instance fields and methods on structs. The value of the first argument is not used by the implementation for static fields and methods.
The generic parameters of the extern static method are a concatenation of the type and method generic arguments of the target method. For example, extern static void Method1<T1, T2>(Class1<T1> @this) can be used to call Class1<T1>.Method1<T2>(). The generic constraints of the extern static method must match the generic constraints of the target type, field or method.
Return type is considered for the signature match. modreqs and modopts are not considered for the signature match.
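The first-argument rules above can be illustrated with fields. This is a sketch under assumptions: a .NET 8 or later runtime, with Point, Registry and FieldAccessors as hypothetical names. It shows a struct instance field, where the first argument is passed by ref, and a static field, where the first argument only identifies the owning type.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Point
{
    private int _x;
    public Point(int x) { _x = x; }
    public int X => _x;
}

public class Registry
{
    private static int s_count = 0;
    public static int Count => s_count;
}

public static class FieldAccessors
{
    // Instance field on a struct: the receiver is passed by ref,
    // so the returned ref points into the caller's copy.
    [UnsafeAccessor(UnsafeAccessorKind.Field, Name = "_x")]
    public static extern ref int X(ref Point p);

    // Static field: the parameter type names the owning type; its value is ignored.
    [UnsafeAccessor(UnsafeAccessorKind.StaticField, Name = "s_count")]
    public static extern ref int Count(Registry owner);
}

public static class Program
{
    public static void Main()
    {
        Point p = new Point(1);
        FieldAccessors.X(ref p) = 42;
        Console.WriteLine(p.X);

        FieldAccessors.Count(null!) += 1;
        Console.WriteLine(Registry.Count);
    }
}
```

Because field accessors return a ref, the same accessor serves as both getter and setter.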
API Usage
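A minimal usage sketch, assuming a .NET 8 or later runtime and the attribute shape that appears in the discussion (kind as the constructor argument, Name as a named property, fields exposed through a ref return). UserDataAccessors and its method names are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

public class UserData
{
    private UserData() { }
    public string Name { get; set; } = "";
}

public static class UserDataAccessors
{
    // Constructor accessor: the return type names the type to construct.
    [UnsafeAccessor(UnsafeAccessorKind.Constructor)]
    public static extern UserData Create();

    // Field accessor: returns a ref to the compiler-generated backing field
    // of the auto-implemented Name property.
    [UnsafeAccessor(UnsafeAccessorKind.Field, Name = "<Name>k__BackingField")]
    public static extern ref string NameField(UserData instance);
}

public static class Program
{
    public static void Main()
    {
        UserData ud = UserDataAccessors.Create();
        UserDataAccessors.NameField(ud) = "Joe";
        Console.WriteLine(ud.Name); // prints "Joe"
    }
}
```

The accessors are plain static methods, so a source generator can emit them next to the serialization code without any reflection calls at run time.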
Alternative Designs
typeof(MyType).GetMethod("set_IntProperty", BindingFlags.Public | BindingFlags.Instance).Invoke(BindingFlags.DoNotWrapExceptions , ptr, new object[] { (object)intValue });, optimize it to p.IntProperty = intValue;.
Risks
UnsafeAccessorType in the discussion below for the potential design).