Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak in master branch (automation related?) #8240

Open
BAndysc opened this issue Jun 1, 2022 · 16 comments
Open

Memory leak in master branch (automation related?) #8240

BAndysc opened this issue Jun 1, 2022 · 16 comments

Comments

@BAndysc
Copy link
Contributor

BAndysc commented Jun 1, 2022

Describe the bug
The latest master branch (tested 11.0.4 and 0.10.999-cibuild0020870-beta cea6bc2) has a memory leak, probably related to Automation code.

Note: this is a regression from the current stable 0.10.x version (which doesn't have automation related modules).

To Reproduce
The repro is very simple - just add a button, press it and the button will not be freed anymore.

  1. Clone the repo: https://github.com/BAndysc/AvaloniaMenuLeak
  2. Checkout branch leak2
  3. Run the app
  4. The top text shows managed used memory, in the beginning it is less than 3 MBs
  5. Press the button Add control
  6. This will spawn a control, that has a reference to 1GB byte array, so obviously the ram usage will rise up to 1 GB
  7. Press Remove Control
  8. GC will free the control and datacontext almost instantly, and it will go down to 3 MBs, so far so good.
  9. Press Add control again
  10. This time click the button 'Click me'
  11. Press Remove control
  12. Notice this time the memory usage doesn't drop. The control is not freed!

Screenshots
dotMemory screenshot:
image

Desktop:

  • OS: Mac
  • Version: 0.10.999-cibuild0020870-beta (it works fine on 0.10.14)
@BAndysc BAndysc added the bug label Jun 1, 2022
@robloo
Copy link
Contributor

robloo commented Jun 1, 2022

Possibly related to #7751

@maxkatz6
Copy link
Member

maxkatz6 commented Jun 1, 2022

Pretty weird, I used dotMemory as well to fix previous memory leak, and I used latest master (well, couple of days old now). And there was only one memory leak with old sample code.

@grokys I will take a look tomorrow, but if you know something about automation what might be a reason here, it can help.

@robloo while I am on it, I can fix that one too. Thanks for reminding.

@BAndysc
Copy link
Contributor Author

BAndysc commented Jun 1, 2022

Pretty weird, I used dotMemory as well to fix previous memory leak, and I used latest master (well, couple of days old now). And there was only one memory leak with old sample code.

if you mean my sample code, then the previous one (menu related) didn't have this button, I've discovered it accidentally, while trying to reproduce another leak

@robloo
Copy link
Contributor

robloo commented Jun 2, 2022

@maxkatz6 If you want to close #7751 great! It's otherwise still on my list to complete before 11.0 but it's one of the lowest priority.

@maxkatz6
Copy link
Member

maxkatz6 commented Jun 3, 2022

Yeah, reproduced using Accessibility Insights to trigger automation.
Got three weak references that live after button was detached:
image

@flexxxxer
Copy link
Contributor

I think the reason for this memory leak is that the method responsible for removing one, two or three weak references is simply not called: I'm talking about Avalonia.Win32.Automation.Automation.AutomationNode.Release method:
image

I was told by @maxkatz6 that @grokys may know where this method can/should be called. Pinged the second one, hopefully he will reply and help with fix.

@timunie
Copy link
Contributor

timunie commented Sep 1, 2023

Hey @flexxxxer I just talked to @grokys and asked him if he has any idea. Unfortunately, this seems to be hard to debug as e.g. "dotmemory can't really help with COM leaks". So it may take quite a while for us to find the root cause and track this down. Thank you for your hands off on this :-)

@grokys
Copy link
Member

grokys commented Sep 1, 2023

Thanks @timunie . To address the usage of Release - I believe this should be called by the win32 automation APIs but I may be mistaken, need to find a little time to actually look into it (I may have this completely wrong rn, can't really remember offhand how it all works).

@flexxxxer
Copy link
Contributor

I have idea how to fix that (where call Release), will try implement it in few days

@BAndysc
Copy link
Contributor Author

BAndysc commented Apr 18, 2024

This problem still happens with latest master (11.2.999-cibuild0047389-alpha), but only on macOS

@BAndysc
Copy link
Contributor Author

BAndysc commented Apr 18, 2024

I am trying to understand the code, it would be great if anybody could go through this and help me understand it.

Native AvnAccessibilityElement holds a pointer to IAvnAutomationPeer.

@implementation AvnAccessibilityElement
{
IAvnAutomationPeer* _peer;
AutomationNode* _node;
NSMutableArray* _children;

It is saved into a field here:

- (AvnAccessibilityElement *)initWithPeer:(IAvnAutomationPeer *)peer
{
self = [super init];
_peer = peer;
_node = new AutomationNode(self);
_peer->SetNode(_node);
return self;
}

And initWithPeer is called from acquire:

+ (AvnAccessibilityElement *)acquire:(IAvnAutomationPeer *)peer
{
if (peer == nullptr)
return nil;
auto instance = peer->GetNode();
if (instance != nullptr)
return dynamic_cast<AutomationNode*>(instance)->GetOwner();
if (peer->IsRootProvider())
{
auto window = peer->RootProvider_GetWindow();
if (window == nullptr)
{
NSLog(@"IRootProvider.PlatformImpl returned null or a non-WindowBaseImpl.");
return nil;
}
auto holder = dynamic_cast<INSWindowHolder*>(window);
auto view = holder->GetNSView();
return [[AvnRootAccessibilityElement alloc] initWithPeer:peer owner:view];
}
else
{
return [[AvnAccessibilityElement alloc] initWithPeer:peer];
}
}

acquire is called from few places. I.e. from accessibilityHitTest, afaik called by the OS when the user press a pointer

- (id)accessibilityHitTest:(NSPoint)point
{
auto clientPoint = [[_owner window] convertPointFromScreen:point];
auto localPoint = [_owner translateLocalPoint:ToAvnPoint(clientPoint)];
auto hit = [self peer]->RootProvider_GetPeerFromPoint(localPoint);
return [AvnAccessibilityElement acquire:hit];
}

hit comes from the RootProvider_GetPeerFromPoint call, which goes to c# world via microcom generated code (code reformated for visibility, i.e. removed try{}catch{}):

static void* RootProvider_GetPeerFromPoint(void* @this, AvnPoint point)
{
    IAvnAutomationPeer __target = (IAvnAutomationPeer)global::MicroCom.Runtime.MicroComRuntime.GetObjectFromCcw(new IntPtr(@this));
    var __result = __target.RootProvider_GetPeerFromPoint(point);
    return global::MicroCom.Runtime.MicroComRuntime.GetNativePointer(__result, true);
}

Lastly, let's see MicroComRuntime.GetNativePointer (also removed parts that are not important)

public static void* GetNativePointer<T>(T obj, bool owned = false) where T : IUnknown
{
    if (obj is IMicroComShadowContainer container)
    {
        container.Shadow ??= new MicroComShadow(container);
        void* ptr = null;
        var res = container.Shadow.GetOrCreateNativePointer(typeof(T), &ptr);
        if (owned)
            container.Shadow.AddRef((Ccw*)ptr);
        return ptr;
    }
    throw new ArgumentException("Unable to get a native pointer for " + obj);
}

So it creates MicroComShadow if needed, then increment a reference counter (AddRef). Wait a minute. So if this method increases the reference counter, somewhere it should be decremented. But where? I don't see in the code when those objects can be deallocated. MicroComShadow is the object that keeps the reference to automation classes, which eventually holds references to view classes. Please correct me if am wrong.

@Gillibald
Copy link
Contributor

https://github.com/AvaloniaUI/Avalonia/blob/master/src/Avalonia.Native/AvnAutomationPeer.cs is the managed part. So if that is cleaned up it should also clean up the native part.

@BAndysc
Copy link
Contributor Author

BAndysc commented Apr 18, 2024

@Gillibald

As far as I understand, the problem is that the Native wrapper - MicroComShadow keeps a strong reference to AvnAutomationPeer, while MicroComShadow itself if rooted, thus never collected, neither AvnAutomationPeer nor Control nor DataContext in result, leading to massive leaks.

image

https://github.com/kekekeks/MicroCom/blob/2f698e682efba38feeaf4e84ca9a39052fe6cab3/src/MicroCom.Runtime/MicroComShadow.cs#L15-L20

internal IMicroComShadowContainer Target { get; }
internal MicroComShadow(IMicroComShadowContainer target)
{
    Target = target;
    Target.Shadow = this;
}

I am not exactly why MicroComShadow gets rooted (it makes sense since it has to be accessible from the unmanaged code, but I am not sure when it exactly it happens).

@maxkatz6
Copy link
Member

I always was suspicious of these IAvnString usages:
https://github.com/AvaloniaUI/Avalonia/blob/master/src/Avalonia.Native/AvnAutomationPeer.cs#L32-L40

You can clearly see that AvnString is created on C# side, but never disposed.

For file picker code I had a hack like this, ensuring that all of these strings will be disposed:

public IAvnStringArray GetAppleUniformTypeIdentifiers(int index)
{
return EnsureDisposable(new AvnStringArray(_types![index].AppleUniformTypeIdentifiers ?? Array.Empty<string>()));
}
protected override void Destroyed()
{
foreach (var disposable in _disposables)
{
disposable.Dispose();
}
}
private T EnsureDisposable<T>(T input) where T : IDisposable
{
_disposables.Add(input);
return input;
}

Which are all disposed together after file picked:
using var fileTypes = new FilePickerFileTypesWrapper(options.FileTypeChoices, options.DefaultExtension);

As well as on ObjC side all strings (ideally) should be released as soon as possible. For example, right when they are converted to NSStrings:

NSString* type_description = GetNSStringAndRelease(filters->GetName(i));

@maxkatz6
Copy link
Member

But yes, whole AutomationPeer leaking is a bigger object/problem than some strings.

@Tenjim
Copy link

Tenjim commented Aug 26, 2024

This issue is also on windows. We are using this with UIAutomation and we have that here. There is any news about it ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants