02. UIA tree

Descolada edited this page Oct 23, 2023 · 8 revisions

What is the tree

The UIAutomation framework represents the entire desktop, with all its windows and controls, as one big tree of elements. At the top is the root element: the Desktop itself. The Desktop element's direct children are the windows (Notepad, Chrome, etc.); the windows in turn have child elements (toolbars, main content, buttons, etc.), and so on.

The UIAutomation tree is not a fixed structure. For example, if a window closes, its element is destroyed along with all its child elements. When a webpage reloads or navigates to a new page, the tree for the browser window is updated with the changes. Sometimes parts of the tree are not available right away and are only constructed when the client needs them (for example, when the mouse hovers over the element).

Accessing the whole tree is usually undesirable because it contains thousands of elements, and searching through it for our elements of interest might take a very long time. Usually we work with a smaller part of the tree, for example a single window element, and for large, complex windows an even smaller subtree.

Accessing the tree

There are multiple ways of getting elements from the tree, and here we go over some of the more important ones. To access the tree, we need to use the main UIA class. Read more about this in the "UIA class" section of this Wiki.

GetRootElement

UIA.GetRootElement()

Calling UIA.GetRootElement will get us the root, or Desktop element. This contains the whole desktop. If there are multiple desktops, then only the currently visible one is accessible.
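
As a minimal sketch, the root element can be fetched and inspected like this. The `#Include <UIA>` line assumes UIA.ahk has been placed in a Lib folder (adjust the path to your setup), and `Dump()` is used here on the understanding that it returns a short text description of the element itself.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

; Get the root (Desktop) element of the currently visible desktop
rootEl := UIA.GetRootElement()

; Show a description of the root element only.
; Dumping or searching the whole tree from here can be very slow.
MsgBox "Root element: " rootEl.Dump()
```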

ElementFromHandle

UIA.ElementFromHandle(hwnd:="", cacheRequest:=0, activateChromiumAccessibility:=500)

ElementFromHandle is usually called with only the hwnd argument. Its parameters are:

  1. hwnd can be a WinTitle, a window handle (returns the UIA element for the main window), or a control handle (returns the UIA element for that control). To get the element for the Notepad window, we might use notepadEl := UIA.ElementFromHandle("ahk_exe notepad.exe")
  2. cacheRequest: optionally a cache request object. Read more in the Caching section of this Wiki.
  3. activateChromiumAccessibility: some windows are actually web applications based on Chromium, the same thing Chrome and Edge browsers are based upon. Sometimes accessibility for UIA isn't automatically activated, so the ElementFromHandle might need to send additional signals to the window. If activateChromiumAccessibility is set, then it tries to activate UIA if it isn't enabled. The value itself will determine how long to wait for activation confirmation. If a VarRef is passed, then that variable will be set to the Chromium Document element (only if the activation was done).
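
A short sketch of the typical usage described above, passing only a WinTitle. It assumes UIA.ahk is in a Lib folder and that running `notepad.exe` opens a window matching `ahk_exe notepad.exe`; the 5-second timeout is an arbitrary choice for illustration.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

Run "notepad.exe"
; Wait up to 5 seconds for the Notepad window to appear
if !WinWait("ahk_exe notepad.exe",, 5) {
    MsgBox "Notepad window did not appear."
    ExitApp
}

; Only the hwnd argument is passed; here it is a WinTitle string
notepadEl := UIA.ElementFromHandle("ahk_exe notepad.exe")
MsgBox "Got element: " notepadEl.Dump()
```

A window handle works the same way: the value returned by `WinWait` or `WinExist` could be passed instead of the WinTitle string.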

ElementFromPoint

UIA.ElementFromPoint(x?, y?, cacheRequest:=0, activateChromiumAccessibility:=500)

ElementFromPoint does what the name suggests: gets an element from screen coordinates. If no coordinates are provided, the current mouse position is used. Optionally a cache request can be provided, and activateChromiumAccessibility works the same as in ElementFromHandle.

NOTE: By default, AutoHotkey's MouseGetPos returns coordinates relative to the active window. If you pass its output to ElementFromPoint, first switch to screen coordinates with CoordMode "Mouse", "Screen".
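
The following sketch combines the note above with ElementFromPoint. The F1 hotkey and the 3-second tooltip are arbitrary choices for illustration, and the include path assumes UIA.ahk is in a Lib folder.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

; Make MouseGetPos report screen coordinates instead of window-relative ones
CoordMode "Mouse", "Screen"

; Hypothetical hotkey: press F1 to inspect the element under the mouse
F1::
{
    MouseGetPos &x, &y
    el := UIA.ElementFromPoint(x, y)
    ToolTip "Element under mouse: " el.Dump()
    SetTimer () => ToolTip(), -3000 ; hide the tooltip after 3 seconds
}
```

Since the current mouse position is used when no coordinates are given, calling `UIA.ElementFromPoint()` without arguments should give the same result here.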

GetFocusedElement

UIA.GetFocusedElement(cacheRequest:=0)

Retrieves the element that has the input focus, like a textbox or a button. The element might still have child elements as well.
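
A minimal sketch of reading the focused element on demand. The F2 hotkey is a hypothetical choice, and the include path assumes UIA.ahk is in a Lib folder.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

; Hypothetical hotkey: press F2 to show which element has the input focus
F2::
{
    focusedEl := UIA.GetFocusedElement()
    ; A tooltip is used instead of MsgBox so the focus is not moved away
    ToolTip "Focused element: " focusedEl.Dump()
    SetTimer () => ToolTip(), -3000 ; hide the tooltip after 3 seconds
}
```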

ElementFromChromium

UIA.ElementFromChromium(winTitle:="", activateChromiumAccessibility:=500, cacheRequest:=0)

Retrieves the Chromium element from a Chromium app (web-browser based applications such as Skype, Discord etc). activateChromiumAccessibility works the same as in ElementFromHandle (sets the timeout, and if a VarRef is provided then it's set to the Chromium document element).
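
As a rough sketch, assuming a Chrome window is already open and UIA.ahk is in a Lib folder, the Chromium element could be retrieved as follows. The 1000 ms value simply replaces the default activation timeout of 500 for illustration.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

if !WinExist("ahk_exe chrome.exe") {
    MsgBox "No Chrome window found."
    ExitApp
}

; Second argument: how long to wait for accessibility activation
chromiumEl := UIA.ElementFromChromium("ahk_exe chrome.exe", 1000)
MsgBox "Chromium element: " chromiumEl.Dump()
```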

ElementFromIAccessible

UIA.ElementFromIAccessible(IAcc, childId:=0)

Gets an element from a MSAA/Acc object (the predecessor for UIA). This only works with Acc objects where the window natively implements IAccessible, which excludes most Win32 windows (like Notepad, Windows Explorer etc). The use of this method is discouraged, since UIA usually has more reliable ways of getting elements available.

Moving around in the tree

Moving from one element to another inside the tree can be done in multiple ways:

FindElement/WaitElement/ElementExist

One way is to use the FindElement/WaitElement/ElementExist methods (which all start from an Element object) with an appropriate condition, e.g. Element.FindElement({Name:"Something"}). Read more about Condition objects and Element methods in this Wiki.
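
The three methods might be used as in the sketch below. It assumes an English-language Notepad window is open with a menu item named "File", and that UIA.ahk is in a Lib folder. The behavior noted in the comments (FindElement throwing when nothing matches, ElementExist and WaitElement returning a false value instead, and WaitElement taking a timeout in milliseconds as its second argument) reflects my understanding of the library rather than anything stated on this page, so verify it against the Element methods section.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

notepadEl := UIA.ElementFromHandle("ahk_exe notepad.exe")
; Condition object: both properties must match (assumes an English UI)
cond := {Name:"File", Type:"MenuItem"}

; ElementExist: check without raising an error if nothing matches
if fileEl := notepadEl.ElementExist(cond)
    fileEl.Highlight()

; WaitElement: keep looking for up to 3000 ms (assumed timeout parameter)
if !notepadEl.WaitElement(cond, 3000)
    MsgBox "Element did not appear in time."

; FindElement: assumed to throw if the element is not found
try
    MsgBox "Found: " notepadEl.FindElement(cond).Dump()
catch
    MsgBox "Element not found."
```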

Array notation

"Array notation" can be used to traverse the tree level by level. This can be done with condition objects, comma-separated paths, or both. Example 1: Element[1,2] would return the Element's first child's second child. Example 2: Element[{Type:"Button"}, {Name:"Something"}] would return the first "Something"-named element of the Element's first Button-type child.

ElementFromPath/WaitElementFromPath/ElementFromPathExist

Traversal with ElementFromPath/WaitElementFromPath/ElementFromPathExist looks very similar to Array notation. The previous examples would look like this: Element.ElementFromPath("1,2") and Element.ElementFromPath({Type:"Button"}, {Name:"Something"})
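
The sketch below shows the same traversal written with array notation and with ElementFromPath. It assumes a Notepad window is open and UIA.ahk is in a Lib folder. The paths are illustrative only: whether a second child of the first child exists, or whether the window has a MenuBar containing MenuItems, depends on the Notepad version, so each lookup is wrapped in try.

```autohotkey
#Requires AutoHotkey v2
#Include <UIA> ; assumption: UIA.ahk is located in a Lib folder

el := UIA.ElementFromHandle("ahk_exe notepad.exe")

; Numeric path: first child's second child, written both ways
try {
    viaArray := el[1, 2]
    viaPath := el.ElementFromPath("1,2")
    MsgBox "Array notation: " viaArray.Dump() "`nElementFromPath: " viaPath.Dump()
} catch {
    MsgBox "Path 1,2 does not exist in this window."
}

; Condition path: first MenuBar-type child's first MenuItem-type element
try {
    menuItem := el[{Type:"MenuBar"}, {Type:"MenuItem"}]
    menuItem.Highlight()
} catch {
    MsgBox "No MenuBar/MenuItem path found in this window."
}
```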

WalkTree

This is a simple method of using TreeWalkers to traverse the tree. See more in the section about TreeWalkers.

TreeWalkers

Using TreeWalkers is explained in the TreeWalker section in this Wiki. These rarely need to be used.