Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GC.KeepAlive() to prevent early collection of method parameters #719

Closed
jonpryor opened this issue Sep 17, 2020 · 2 comments · Fixed by #725
Closed

Use GC.KeepAlive() to prevent early collection of method parameters #719

jonpryor opened this issue Sep 17, 2020 · 2 comments · Fixed by #725
Assignees
Labels
bug Component does not function as intended generator Issues binding a Java library (generator, class-parse, etc.)

Comments

@jonpryor
Copy link
Member

@brendanzagaeski has diagnosed an issue wherein object references are collected when we think they shouldn't be.

Consider the following generator-emitted binding code:

        public unsafe bool HideSoftInputFromWindow (Android.OS.IBinder? windowToken, [global::Android.Runtime.GeneratedEnum] Android.Views.InputMethods.HideSoftInputFlags flags)
        {
            const string __id = "hideSoftInputFromWindow.(Landroid/os/IBinder;I)Z";
            try {
                JniArgumentValue* __args = stackalloc JniArgumentValue [2];
/* 1 */          __args [0] = new JniArgumentValue ((windowToken == null) ? IntPtr.Zero : ((global::Java.Lang.Object) windowToken).Handle);
/* 2 */          __args [1] = new JniArgumentValue ((int) flags);
/* 3 */          var __rm = _members.InstanceMethods.InvokeAbstractBooleanMethod (__id, this, __args);
                return __rm;
            } finally {
            }
        }

Depending on "various factors", it is possible that Mono's GC will decide that windowToken -- used on line (1) -- is eligible for collection on line (2), when it must be kept alive until after line (3) completes execution.

If windowToken isn't kept alive, then it could be finalized before/during the InvokeAbstractBooleanMethod() invocation, which could invalidate/change windowToken.Handle, and cause lots of head scratching, confusion, and app crashes.

The fix? We need to emit a GC.KeepAlive() for all reference types until after the Invoke* invocation has completed:

        public unsafe bool HideSoftInputFromWindow (Android.OS.IBinder? windowToken, [global::Android.Runtime.GeneratedEnum] Android.Views.InputMethods.HideSoftInputFlags flags)
        {
            const string __id = "hideSoftInputFromWindow.(Landroid/os/IBinder;I)Z";
            try {
                JniArgumentValue* __args = stackalloc JniArgumentValue [2];
                __args [0] = new JniArgumentValue ((windowToken == null) ? IntPtr.Zero : ((global::Java.Lang.Object) windowToken).Handle);
                __args [1] = new JniArgumentValue ((int) flags);
                var __rm = _members.InstanceMethods.InvokeAbstractBooleanMethod (__id, this, __args);
/* FIX */       global::System.GC.KeepAlive (windowToken);
                return __rm;
            } finally {
            }
        }
@jpobst jpobst added bug Component does not function as intended generator Issues binding a Java library (generator, class-parse, etc.) labels Sep 17, 2020
@jpobst jpobst changed the title Need to use GC.KeepAlive() Use GC.KeepAlive() to prevent early collection of method parameters Sep 17, 2020
jonpryor pushed a commit that referenced this issue Sep 21, 2020
)

Context: #719

@brendanzagaeski has been investigating a Xamarin.Android app crash:

	JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3d86
	    from android.view.View crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(android.view.LayoutInflater, android.view.ViewGroup, android.os.Bundle)
	…
	  at crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(Native method)
	  at crc64720bb2db43a66fe9.FragmentContainer.onCreateView(FragmentContainer.java:41)

This had been a head-scratcher, but we had a GREF log, so all should
be clear, right?

	09-10 17:56:48.280 10123 10123 I monodroid-gref: +g+ grefc 1141 gwrefc 0 obj-handle 0x79/I -> new-handle 0x3d86/G from thread '(null)'(1)
	09-10 17:56:48.294 10123 10123 I monodroid-gref: +w+ grefc 1140 gwrefc 2 obj-handle 0x3d86/G -> new-handle 0x1e3/W from thread 'finalizer'(10123)
	09-10 17:56:48.294 10123 10123 I monodroid-gref: -g- grefc 1139 gwrefc 2 handle 0x3d86/G from thread 'finalizer'(10123)

The GREF log *wasn't* immediately clear: sure, the GREF was turned
into a Weak-GREF, and the Weak-GREF was then collected, but none of
this explained *how* were were using this deleted GREF.

(We were at this state of affairs for months: we "know" we're using
a deleted GREF, but we don't know *how* or *why*.  It was a very hard
to hit bug.)

Eventually we had a ["that's funny"][0] event: *sure*, the GREF is
deleted, but:

 1. Why is it being deleted by the finalizer?
 2. …14ms *after construction*?

A [Garbage Collection][1] refresher may be in order, but in short:

 1. All `Java.Lang.Object` subclasses are "bridged" objects.

 2. During a GC, all "collectable" bridged objects are gathered.
    A collectable object is one in which nothing in the managed
    GC references the object.

 3. Once the GC is complete, all gathered collectable bridged objects
    are passed to a "cross references" callback.

    The callback is called *outside* the "scope" of a GC; "the world"
    is *not* frozen, other threads may be executing.

 4. The "cross references" callback is the
    `MonoGCBridgeCallbacks::cross_references` field provided provided
    to [`mono_gc_register_bridge_callbacks()`][2].
    
    In a Xamarin.Android app, the "cross references" callback will
    "toggle" a JNI Global Reference to a JNI Weak Global Reference,
    invoke a Java-side GC, and then try to obtain a
    JNI Global Reference from the JNI Weak Global Reference.
    If a non-`NULL` pointer is returned, the bridged object is kept
    alive.  If a `NULL` pointer is returned, the bridged object will
    be considered dead, and will be added to the Finalization Queue
    (as `Java.Lang.Object` has a finalizer).

Thus, it seemed "funny" that within 14ms an instance was created,
GC'd, and determined to be garbage, *especially* when we *knew* that
this instance was being passed to Java, which we expected to retain
the instance.  (Yet wasn't…?)

After much banging of heads, and the yeoman's work of creating a
simplified and consistently reproducible test case, we *think* we
know the cause of the crash.

Consider our normal Managed-to-Java marshaling code, e.g.

	partial class MaterialButton  {
	    public unsafe MaterialButton (global::Android.Content.Context context)
	    	: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
	    {
	        const string __id = "(Landroid/content/Context;)V";

	        if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
	            return;

	        try {
	            JniArgumentValue* __args = stackalloc JniArgumentValue [1];
	/* 1 */     __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
	/* 2 */     var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
	            SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
	/* 3 */     _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
	        } finally {
	        }
	    }
	}

At (1), `context.Handle` is -- a JNI GREF -- is stored into a
`JniArgumentValue* __args` value, and at (3) `__args` is passed into
JNI, which will presumably "Do Something" with that handle.

However, nothing ties `context.Handle` to `context`, so from
(2) onward, `context` *may* be eligible for garbage collection.

See also Chris Brumme's [Lifetime, GC.KeepAlive, handle recycling][3]
blog article.  It's about .NET Framework, but the same fundamental
multithreading concepts apply.

The fix is to *ensure* that `context` is kept alive
*for as long as* `context.Handle` will be used, i.e. across the
JNI `_members.InstanceMethods.FinishCreateInstance()` call:

	partial class MaterialButton  {
	    public unsafe MaterialButton (global::Android.Content.Context context)
	    	: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
	    {
	        const string __id = "(Landroid/content/Context;)V";

	        if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
	            return;

	        try {
	            JniArgumentValue* __args = stackalloc JniArgumentValue [1];
	/* 1 */     __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
	/* 2 */     var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
	            SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
	/* 3 */     _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
	        } finally {
	/* 4 */     global::System.GC.KeepAlive (context);
	        }
	    }
	}

This should prevent e.g. `context` from being prematurely GC'd,
which in turn should prevent the `JNI DETECTED ERROR` message.

Update `tools/generator` to emit `GC.KeepAlive()` statements for
every parameter type which isn't a value type (`enum`, `int`, `string`,
etc.).  `string` is considered a value type because we always send a
"deep copy" of the string contents, so it won't matter if the `string`
instance is GC'd immediately.

[0]: https://quoteinvestigator.com/2015/03/02/eureka-funny/
[1]: https://docs.microsoft.com/xamarin/android/internals/garbage-collection
[2]: http://docs.go-mono.com/?link=xhtml%3adeploy%2fmono-api-gc.html
[3]: https://docs.microsoft.com/archive/blogs/cbrumme/lifetime-gc-keepalive-handle-recycling
jonpryor added a commit to jonpryor/xamarin-android that referenced this issue Sep 21, 2020
Fixes: dotnet/java-interop#682
Fixes: dotnet/java-interop#717

Context: dotnet/java-interop#719

Changes: dotnet/java-interop@a807961...79d9533

  * dotnet/java-interop@79d95334: [generator] Use GC.KeepAlive for reference type method parameters. (dotnet#722)
  * dotnet/java-interop@1a19ec04: [Xamarin.Android.Tools.Bytecode] Hide Kotlin nested types inside (dotnet#723)
  * dotnet/java-interop@24a9abdb: [Xamarin.Android.Tools.ApiXmlAdjuster] Find app.android.IntentService (dotnet#718)
jonpryor added a commit to dotnet/android that referenced this issue Sep 22, 2020
Fixes: dotnet/java-interop#682
Fixes: dotnet/java-interop#717

Context: dotnet/java-interop#719

Changes: dotnet/java-interop@a807961...79d9533

  * dotnet/java-interop@79d95334: [generator] Use GC.KeepAlive for reference type method parameters. (#722)
  * dotnet/java-interop@1a19ec04: [Xamarin.Android.Tools.Bytecode] Hide Kotlin nested types inside (#723)
  * dotnet/java-interop@24a9abdb: [Xamarin.Android.Tools.ApiXmlAdjuster] Find app.android.IntentService (#718)
jpobst added a commit that referenced this issue Sep 22, 2020
Context: #719

@brendanzagaeski has been investigating a Xamarin.Android app crash:

 JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3d86
     from android.view.View crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(android.view.LayoutInflater, android.view.ViewGroup, android.os.Bundle)
 …
   at crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(Native method)
   at crc64720bb2db43a66fe9.FragmentContainer.onCreateView(FragmentContainer.java:41)

This had been a head-scratcher, but we had a GREF log, so all should
be clear, right?

 09-10 17:56:48.280 10123 10123 I monodroid-gref: +g+ grefc 1141 gwrefc 0 obj-handle 0x79/I -> new-handle 0x3d86/G from thread '(null)'(1)
 09-10 17:56:48.294 10123 10123 I monodroid-gref: +w+ grefc 1140 gwrefc 2 obj-handle 0x3d86/G -> new-handle 0x1e3/W from thread 'finalizer'(10123)
 09-10 17:56:48.294 10123 10123 I monodroid-gref: -g- grefc 1139 gwrefc 2 handle 0x3d86/G from thread 'finalizer'(10123)

The GREF log *wasn't* immediately clear: sure, the GREF was turned
into a Weak-GREF, and the Weak-GREF was then collected, but none of
this explained *how* were were using this deleted GREF.

(We were at this state of affairs for months: we "know" we're using
a deleted GREF, but we don't know *how* or *why*.  It was a very hard
to hit bug.)

Eventually we had a ["that's funny"][0] event: *sure*, the GREF is
deleted, but:

 1. Why is it being deleted by the finalizer?
 2. …14ms *after construction*?

A [Garbage Collection][1] refresher may be in order, but in short:

 1. All `Java.Lang.Object` subclasses are "bridged" objects.

 2. During a GC, all "collectable" bridged objects are gathered.
    A collectable object is one in which nothing in the managed
    GC references the object.

 3. Once the GC is complete, all gathered collectable bridged objects
    are passed to a "cross references" callback.

    The callback is called *outside* the "scope" of a GC; "the world"
    is *not* frozen, other threads may be executing.

 4. The "cross references" callback is the
    `MonoGCBridgeCallbacks::cross_references` field provided provided
    to [`mono_gc_register_bridge_callbacks()`][2].

    In a Xamarin.Android app, the "cross references" callback will
    "toggle" a JNI Global Reference to a JNI Weak Global Reference,
    invoke a Java-side GC, and then try to obtain a
    JNI Global Reference from the JNI Weak Global Reference.
    If a non-`NULL` pointer is returned, the bridged object is kept
    alive.  If a `NULL` pointer is returned, the bridged object will
    be considered dead, and will be added to the Finalization Queue
    (as `Java.Lang.Object` has a finalizer).

Thus, it seemed "funny" that within 14ms an instance was created,
GC'd, and determined to be garbage, *especially* when we *knew* that
this instance was being passed to Java, which we expected to retain
the instance.  (Yet wasn't…?)

After much banging of heads, and the yeoman's work of creating a
simplified and consistently reproducible test case, we *think* we
know the cause of the crash.

Consider our normal Managed-to-Java marshaling code, e.g.

 partial class MaterialButton  {
     public unsafe MaterialButton (global::Android.Content.Context context)
      : base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
     {
         const string __id = "(Landroid/content/Context;)V";

         if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
             return;

         try {
             JniArgumentValue* __args = stackalloc JniArgumentValue [1];
 /* 1 */     __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
 /* 2 */     var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
             SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
 /* 3 */     _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
         } finally {
         }
     }
 }

At (1), `context.Handle` is -- a JNI GREF -- is stored into a
`JniArgumentValue* __args` value, and at (3) `__args` is passed into
JNI, which will presumably "Do Something" with that handle.

However, nothing ties `context.Handle` to `context`, so from
(2) onward, `context` *may* be eligible for garbage collection.

See also Chris Brumme's [Lifetime, GC.KeepAlive, handle recycling][3]
blog article.  It's about .NET Framework, but the same fundamental
multithreading concepts apply.

The fix is to *ensure* that `context` is kept alive
*for as long as* `context.Handle` will be used, i.e. across the
JNI `_members.InstanceMethods.FinishCreateInstance()` call:

 partial class MaterialButton  {
     public unsafe MaterialButton (global::Android.Content.Context context)
      : base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
     {
         const string __id = "(Landroid/content/Context;)V";

         if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
             return;

         try {
             JniArgumentValue* __args = stackalloc JniArgumentValue [1];
 /* 1 */     __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
 /* 2 */     var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
             SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
 /* 3 */     _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
         } finally {
 /* 4 */     global::System.GC.KeepAlive (context);
         }
     }
 }

This should prevent e.g. `context` from being prematurely GC'd,
which in turn should prevent the `JNI DETECTED ERROR` message.

Update `tools/generator` to emit `GC.KeepAlive()` statements for
every parameter type which isn't a value type (`enum`, `int`, `string`,
etc.).  `string` is considered a value type because we always send a
"deep copy" of the string contents, so it won't matter if the `string`
instance is GC'd immediately.

[0]: https://quoteinvestigator.com/2015/03/02/eureka-funny/
[1]: https://docs.microsoft.com/xamarin/android/internals/garbage-collection
[2]: http://docs.go-mono.com/?link=xhtml%3adeploy%2fmono-api-gc.html
[3]: https://docs.microsoft.com/archive/blogs/cbrumme/lifetime-gc-keepalive-handle-recycling
jonpryor pushed a commit that referenced this issue Sep 24, 2020
)

Fixes: #719

@brendanzagaeski has been investigating a Xamarin.Android app crash:

	JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3d86
	    from android.view.View crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(android.view.LayoutInflater, android.view.ViewGroup, android.os.Bundle)
	…
	  at crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(Native method)
	  at crc64720bb2db43a66fe9.FragmentContainer.onCreateView(FragmentContainer.java:41)

This had been a head-scratcher, but we had a GREF log, so all should
be clear, right?

	09-10 17:56:48.280 10123 10123 I monodroid-gref: +g+ grefc 1141 gwrefc 0 obj-handle 0x79/I -> new-handle 0x3d86/G from thread '(null)'(1)
	09-10 17:56:48.294 10123 10123 I monodroid-gref: +w+ grefc 1140 gwrefc 2 obj-handle 0x3d86/G -> new-handle 0x1e3/W from thread 'finalizer'(10123)
	09-10 17:56:48.294 10123 10123 I monodroid-gref: -g- grefc 1139 gwrefc 2 handle 0x3d86/G from thread 'finalizer'(10123)

The GREF log *wasn't* immediately clear: sure, the GREF was turned
into a Weak-GREF, and the Weak-GREF was then collected, but none of
this explained *how* were were using this deleted GREF.

(We were at this state of affairs for months: we "know" we're using
a deleted GREF, but we don't know *how* or *why*.  It was a very hard
to hit bug.)

Eventually we had a ["that's funny"][0] event: *sure*, the GREF is
deleted, but:

 1. Why is it being deleted by the finalizer?
 2. …14ms *after construction*?

A [Garbage Collection][1] refresher may be in order, but in short:

 1. All `Java.Lang.Object` subclasses are "bridged" objects.

 2. During a GC, all "collectable" bridged objects are gathered.
    A collectable object is one in which nothing in the managed
    GC references the object.

 3. Once the GC is complete, all gathered collectable bridged objects
    are passed to a "cross references" callback.

    The callback is called *outside* the "scope" of a GC; "the world"
    is *not* frozen, other threads may be executing.

 4. The "cross references" callback is the
    `MonoGCBridgeCallbacks::cross_references` field provided provided
    to [`mono_gc_register_bridge_callbacks()`][2].
    
    In a Xamarin.Android app, the "cross references" callback will
    "toggle" a JNI Global Reference to a JNI Weak Global Reference,
    invoke a Java-side GC, and then try to obtain a
    JNI Global Reference from the JNI Weak Global Reference.
    If a non-`NULL` pointer is returned, the bridged object is kept
    alive.  If a `NULL` pointer is returned, the bridged object will
    be considered dead, and will be added to the Finalization Queue
    (as `Java.Lang.Object` has a finalizer).

Thus, it seemed "funny" that within 14ms an instance was created,
GC'd, and determined to be garbage, *especially* when we *knew* that
this instance was being passed to Java, which we expected to retain
the instance.  (Yet wasn't…?)

After much banging of heads, and the yeoman's work of creating a
simplified and consistently reproducible test case, we *think* we
know the cause of the crash.

Consider our normal Managed-to-Java marshaling code, e.g.

	partial class MaterialButton  {
	    public unsafe MaterialButton (global::Android.Content.Context context)
	    	: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
	    {
	        const string __id = "(Landroid/content/Context;)V";

	        if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
	            return;

	        try {
	            JniArgumentValue* __args = stackalloc JniArgumentValue [1];
	/* 1 */     __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
	/* 2 */     var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
	            SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
	/* 3 */     _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
	        } finally {
	        }
	    }
	}

At (1), `context.Handle` is -- a JNI GREF -- is stored into a
`JniArgumentValue* __args` value, and at (3) `__args` is passed into
JNI, which will presumably "Do Something" with that handle.

However, nothing ties `context.Handle` to `context`, so from
(2) onward, `context` *may* be eligible for garbage collection.

See also Chris Brumme's [Lifetime, GC.KeepAlive, handle recycling][3]
blog article.  It's about .NET Framework, but the same fundamental
multithreading concepts apply.

The fix is to *ensure* that `context` is kept alive
*for as long as* `context.Handle` will be used, i.e. across the
JNI `_members.InstanceMethods.FinishCreateInstance()` call:

	partial class MaterialButton  {
	    public unsafe MaterialButton (global::Android.Content.Context context)
	    	: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
	    {
	        const string __id = "(Landroid/content/Context;)V";

	        if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
	            return;

	        try {
	            JniArgumentValue* __args = stackalloc JniArgumentValue [1];
	/* 1 */     __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
	/* 2 */     var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
	            SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
	/* 3 */     _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
	        } finally {
	/* 4 */     global::System.GC.KeepAlive (context);
	        }
	    }
	}

This should prevent e.g. `context` from being prematurely GC'd,
which in turn should prevent the `JNI DETECTED ERROR` message.

Update `tools/generator` to emit `GC.KeepAlive()` statements for
every parameter type which isn't a value type (`enum`, `int`, `string`,
etc.).  `string` is considered a value type because we always send a
"deep copy" of the string contents, so it won't matter if the `string`
instance is GC'd immediately.

[0]: https://quoteinvestigator.com/2015/03/02/eureka-funny/
[1]: https://docs.microsoft.com/xamarin/android/internals/garbage-collection
[2]: http://docs.go-mono.com/?link=xhtml%3adeploy%2fmono-api-gc.html
[3]: https://docs.microsoft.com/archive/blogs/cbrumme/lifetime-gc-keepalive-handle-recycling
@brendanzagaeski
Copy link
Contributor

Release status update

A new Preview version of Xamarin.Android has now been published that includes the fix for this item. The fix is not yet included in a Release version. I will update this item again when a Release version is available that includes the fix.

Fix included in Xamarin.Android SDK version 11.1.0.15.

Fix included on Windows in Visual Studio 2019 version 16.8 Preview 4. To try the Preview version that includes the fix, check for the latest updates in Visual Studio Preview.

Fix included on macOS in Visual Studio 2019 for Mac version 8.8 Preview 4. To try the Preview version that includes the fix, check for the latest updates on the Preview updater channel.

jonpryor pushed a commit to dotnet/android that referenced this issue Oct 20, 2020
Fixes: dotnet/java-interop#461
Fixes: dotnet/java-interop#682
Fixes: dotnet/java-interop#717
Fixes: dotnet/java-interop#719
Fixes: dotnet/java-interop#728

Changes: dotnet/java-interop@ac914ce...b991bb8

  * dotnet/java-interop@b991bb86: [generator] Revert change to use auto-properties in EventArgs classes (#736)
  * dotnet/java-interop@ee50d89b: Bump to xamarin/xamarin-android-tools/master@f2af06f2 (#733)
  * dotnet/java-interop@a0b895c1: [build] Suppress NuGet warnings (#730)
  * dotnet/java-interop@8b1b0507: [generator] Fix parsing of complex generic types (#729)
  * dotnet/java-interop@ee7afeed: [generator] Prevent generating duplicate EventArgs classes (#726)
  * dotnet/java-interop@1f21f38c: [generator] Use GC.KeepAlive for reference type method parameters. (#725)
  * dotnet/java-interop@5136ef98: [Xamarin.Android.Tools.Bytecode] Hide Kotlin nested types inside (#723)
  * dotnet/java-interop@53d60513: [jnimarshalmethod-gen] Fix registration on Windows (#721)
  * dotnet/java-interop@5a834d42: [jnimarshalmethod-gen] Avoid creating AppDomains (#720)
  * dotnet/java-interop@a76edb8c: [Xamarin.Android.Tools.ApiXmlAdjuster] Find app.android.IntentService (#718)
  * dotnet/java-interop@6cde0877: [Java.Interop] Emit a reference assembly for Java.Interop.dll (#716)
  * dotnet/java-interop@b858dc59: [generator] Provide line/col numbers for api.xml warnings (#715)
  * dotnet/java-interop@9be92a04: [ci] Don't kick off CI for documentation only changes. (#712)
  * dotnet/java-interop@03c22722: [jnimarshalmethod-gen] Fix type resolution crash (#706)
@brendanzagaeski
Copy link
Contributor

Release status update

A new Release version of Xamarin.Android has now been published that includes the fix for this item.

Fix included in Xamarin.Android SDK version 11.1.0.17.

Fix included on Windows in Visual Studio 2019 version 16.8. To get the new version that includes the fix, check for the latest updates or install the most recent release from https://visualstudio.microsoft.com/downloads/.

Fix included on macOS in Visual Studio 2019 for Mac version 8.8. To get the new version that includes the fix, check for the latest updates on the Stable updater channel.

jonpryor pushed a commit to dotnet/android that referenced this issue Dec 18, 2020
Context: dotnet/java-interop#719
Context: dotnet/java-interop@1f21f38
Context: https://docs.microsoft.com/archive/blogs/cbrumme/lifetime-gc-keepalive-handle-recycling

We discovered the cause of a latent potential crash within
Xamarin.Android apps:

	JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3d86
	    …

The cause of the "use of deleted global reference" was that the GC
collected an instance after it was provided to Java, but before Java
could keep the instance alive in a manner which would cause our GC
bridge to keep it alive, akin to:

	CallIntoJava (new JavaLangObjectSubclass ().Handle);

In the above example code, if the `JavaLangObjectSubclass` instance
is collected by the GC *after* the `.Handle` property is accessed
but *before* `CallIntoJava()` is executed, then the `.Handle` value
will refer to a "deleted global reference" and cause the app crash.

The appropriate fix is to ensure that the GC *won't* collect it,
usually by calling [`GC.KeepAlive()`][0]:

	var value = new JavaLangObjectSubclass ();
	CallIntoJava (value.Handle);
	GC.KeepAlive (value);

An important aspect of the problem is that much of the relevant code
introducing this scenario is within *binding assemblies*:

	partial class Activity {
	    public virtual unsafe Android.App.PendingIntent? CreatePendingResult (
	            int requestCode,
	            Android.Content.Intent data,
	            [global::Android.Runtime.GeneratedEnum] Android.App.PendingIntentFlags flags)
	    {
	        const string __id = "createPendingResult.(ILandroid/content/Intent;I)Landroid/app/PendingIntent;";
	        try {
	            JniArgumentValue* __args = stackalloc JniArgumentValue [3];
	            __args [0] = new JniArgumentValue (requestCode);
	            __args [1] = new JniArgumentValue ((data == null) ? IntPtr.Zero : ((global::Java.Lang.Object) data).Handle);
	            __args [2] = new JniArgumentValue ((int) flags);
	            var __rm = _members.InstanceMethods.InvokeVirtualObjectMethod (__id, this, __args);
	            return global::Java.Lang.Object.GetObject<Android.App.PendingIntent> (__rm.Handle, JniHandleOwnership.TransferLocalRef);
	        } finally {
	        }    
	    }
	}

Here, the issue is that `data.Handle` is passed into Java code, but
it's possible that `data` may be collected *after* `data.Handle` is
accessed but before
`_members.InstanceMethods.InvokeVirtualObjectMethod()` is invoked.

dotnet/java-interop@1f21f38c introduced a `generator` fix for this
this problem, inserting an appropriate `GC.KeepAlive()`:

	partial class Activity {
	    public virtual unsafe Android.App.PendingIntent? CreatePendingResult (
	            int requestCode, Android.Content.Intent data,
	            [global::Android.Runtime.GeneratedEnum] Android.App.PendingIntentFlags flags)
	    {
	        const string __id = "createPendingResult.(ILandroid/content/Intent;I)Landroid/app/PendingIntent;";
	        try {
	            JniArgumentValue* __args = stackalloc JniArgumentValue [3];
	            __args [0] = new JniArgumentValue (requestCode);
	            __args [1] = new JniArgumentValue ((data == null) ? IntPtr.Zero : ((global::Java.Lang.Object) data).Handle);
	            __args [2] = new JniArgumentValue ((int) flags);
	            var __rm = _members.InstanceMethods.InvokeVirtualObjectMethod (__id, this, __args);
	            return global::Java.Lang.Object.GetObject<Android.App.PendingIntent> (__rm.Handle, JniHandleOwnership.TransferLocalRef);
	        } finally {
	            GC.KeepAlive (data);
	        }    
	    }
	}

The problem is that a fix within *generator* is meaningless until all
relevant binding assemblies are *also* updated: fixing the issue
within `Mono.Android.dll` only fixes `Mono.Android.dll`!

We don't -- and can't! -- expect the entire ecosystem to rebuild and
republish binding assemblies, which means a `generator` fix isn't a
complete fix!

To help fix the *ecosystem*, update the *linker* to insert appropriate
`GC.KeepAlive()` invocations into marshal methods.  This allows fixing
the "problem domain" -- premature instance collection -- without
requiring that the entire ecosystem be rebuilt.

Instead, it "merely" requires that all *applications* be rebuilt.

Add a new `$(AndroidAddKeepAlives)` MSBuild property which controls
whether or not the linker inserts the `GC.KeepAlive()` calls.
`$(AndroidAddKeepAlives)` defaults to True for Release configuration
apps, and is False by default for "fast deployment" Debug builds.
`$(AndroidAddKeepAlives)` can be overridden to explicitly enable or
disable the new linker steps.

The impact on release build of XA forms template (flyout variety) is
cca 137ms on @radekdoulik's machine.

  * Disabled:

        12079 ms  LinkAssemblies                             1 calls

  * Enabled:

        12216 ms  LinkAssemblies                             1 calls

Co-authored-by: Radek Doulik <[email protected]>

[0]: https://docs.microsoft.com/dotnet/api/system.gc.keepalive
@github-actions github-actions bot locked and limited conversation to collaborators Apr 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Component does not function as intended generator Issues binding a Java library (generator, class-parse, etc.)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants