Building a Unified View Tree Analyser for Android (Views and Compose)
Introduction
Working with Android UIs today often means navigating a landscape where traditional XML Views and modern Jetpack Compose live side-by-side. This hybrid environment presents unique challenges, especially when you need to programmatically understand the structure of your entire screen. I recently took on a project for a client who needed an SDK capable of traversing any screen, capturing the position, size, and ID of every UI component, and sending that data to a server. This immediately highlighted a crucial requirement: the SDK had to work seamlessly across both XML and Compose.
The reality of the Android ecosystem is this fragmentation. Many apps still contain legacy XML code while gradually adopting Compose. Clients require tools that function across their entire codebase, regardless of the UI toolkit used. This unified approach is essential for various use cases, such as accessibility analysis or generating runtime UI analytics, where a complete picture of the screen is necessary.
The Challenge: Two Distinct UI Worlds
In the traditional Android View system, everything is a View. Your XML layouts are inflated into a hierarchy of objects that inherit from View, with special ViewGroups acting as containers and layout managers. Traversing this tree using methods like ViewGroup.getChildAt is straightforward.
Jetpack Compose does things differently. While we write @Composable functions, at runtime, Compose builds a tree of internal nodes (like LayoutNodes and SemanticsNodes) that the system uses for layout, drawing, and semantics. This runtime tree is distinct from the View hierarchy.
The core issue stems from this fundamental difference in modelling. The View tree and the Compose tree don’t naturally connect. Specifically, a ComposeView, while being a View itself, acts like a black box to the traditional View traversal process: anything rendered inside the ComposeView using Compose is hidden from standard getChildAt calls. Conversely, Compose's internal mechanisms don't inherently traverse down into traditional Views embedded within a Compose layout (though this latter case is less central to the problem at hand).
This boundary between the View hierarchy and the Compose tree within a ComposeView was the primary technical hurdle we needed to overcome to create a truly unified analyser.
Now, handling the traditional Android Views part was relatively straightforward. Starting from the root of the view hierarchy, typically the decorView of the window, we can reliably get the entire tree structure. This is done by iterating through the children of each View, and recursively traversing into any ViewGroup encountered. This process allows us to walk the entire XML-inflated tree.
However, it’s crucial to ensure that the layout pass has completed before attempting to read properties like size and position. To guarantee this, we need to enqueue our traversal logic to run after the layout is finished, often by using decorView.post { ... }. Once the layout is ready, we can traverse the tree and, voilà, the structure and basic properties of your XML views are captured.
Here’s a simplified look at the analyzeViewElement function that handles traversing the traditional View hierarchy. This function takes the current View and a callback (onMatch) that we use to report the data we extract for each element.
private fun analyzeViewElement(root: View, onMatch: (JsonObject) -> Unit) {
    // Extract data and add properties to the json object for the current view
    onMatch(
        JsonObject().apply {
            // add properties to the json (e.g., id, type, visibility, frame)
        }
    )
    // Recursively call for children if it's a ViewGroup
    if (root is ViewGroup) {
        for (i in 0 until root.childCount) {
            analyzeViewElement(root.getChildAt(i), onMatch)
        }
    }
}
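To make the visit order concrete, here is a minimal, framework-free sketch of the same depth-first walk. Node, walk, and visitOrder are hypothetical stand-ins for View/ViewGroup and analyzeViewElement, not Android APIs; the only point is that each element is reported before its children.

```kotlin
// Hypothetical stand-in for View/ViewGroup: a named node with optional children.
data class Node(val name: String, val children: List<Node> = emptyList())

// Pre-order walk: report the node first, then recurse into its children,
// mirroring the order in which analyzeViewElement invokes onMatch.
fun walk(node: Node, depth: Int = 0, onMatch: (depth: Int, name: String) -> Unit) {
    onMatch(depth, node.name)
    for (child in node.children) {
        walk(child, depth + 1, onMatch)
    }
}

// Convenience wrapper that collects the visit order as indented strings.
fun visitOrder(root: Node): List<String> {
    val out = mutableListOf<String>()
    walk(root) { depth, name -> out += "  ".repeat(depth) + name }
    return out
}
```

Because parents are always emitted before their children, a consumer of the output can rebuild the hierarchy from the order alone, provided the depth (or a parent ID) is included with each element.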
Exploring Solutions (And Why They Weren’t Quite Right)
With the challenge clearly defined (traversing both the View and Compose trees from the View side), I began exploring different technical approaches to access the Compose tree’s structure and layout information at runtime.
Idea 1: Compose Testing API
My initial thought was to see if there was any close equivalent to the view tree. The Android developer documentation points to the “semantics tree,” which is used during testing and by accessibility services to understand the meaning and structure of UI components. This seemed promising, as it’s where properties like contentDescription are stored, and it felt like the Compose equivalent of the View tree I was already traversing.
The printToString() method available in instrumented tests provides a detailed dump of this semantics tree.
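import android.util.Log
import androidx.compose.ui.test.junit4.createAndroidComposeRule
import androidx.compose.ui.test.onRoot
import androidx.compose.ui.test.printToString
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class ComposeAnalyser {
    @get:Rule
    val composeTestRule = createAndroidComposeRule<MainActivity>()

    @Test
    fun analyseComposeUi() {
        try {
            composeTestRule.waitForIdle()
            val rootInteraction = composeTestRule.onRoot(useUnmergedTree = true)
            val root = rootInteraction.fetchSemanticsNode()
            // ViewTreeAnalyser.analyseComposeRoot(root, "Home Screen")
            // printToString() is an extension on SemanticsNodeInteraction, not on SemanticsNode
            Log.d("ComposeAnalyser", rootInteraction.printToString())
        } catch (e: Exception) {
            Log.e("ComposeAnalyser", "Error analysing compose UI: ${e.message}")
        }
        // Hack to keep the activity alive for manual inspection (not for automated runs)
        try {
            Thread.sleep(Long.MAX_VALUE) // Sleep forever
        } catch (_: InterruptedException) {
        }
    }
}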
However, the documentation didn’t mention how to access the semantics tree outside of an instrumented testing environment. The core issue is that these testing APIs and the underlying test infrastructure (@RunWith, createAndroidComposeRule) are specifically designed to run in a separate test process or environment provided by the AndroidX Test framework. This environment is not active in a regular application build.
While I could create a test to extract the data, it was highly inconvenient for the SDK’s purpose. It would require the client to integrate the SDK to launch a test just to get the UI data, pass the Activity context to create a ComposeTestRule, and potentially add complex hacks (like the Thread.sleep shown above to keep the screen visible for inspection) to handle the test lifecycle. This was a very big "no" for a runtime SDK.
Idea 2: Hijacking the Composition and Hooking an Applier
Since the testing API wasn’t suitable for runtime analysis, I next considered whether I could hook into Compose’s internal rendering pipeline. A recommendation led me to look into the Google Maps Compose SDK, which uses custom composition and appliers.
Heading back to the Android docs and source code, I learned that Appliers are used by the Compose runtime to track insertions and removals of nodes during the Composition phase. Composition is where Compose figures out what UI elements should exist based on your @Composable functions and state.
The problem here is that Composition happens before the Layout phase, where elements are measured and placed. The Applier itself doesn’t hold the final size and position properties I was looking for.
Apart from not having the necessary data, hooking into and potentially modifying Compose’s internal Composition process and Applier is complex and involves accessing internal, unstable APIs, similar to the reflection issues I wanted to avoid. This approach seemed overly complicated for the goal and still wouldn’t provide the layout data directly.
Idea 3: Using Modifiers (onGloballyPositioned)
At this point, the only reliable public API I found for getting layout information at runtime was through modifiers like onGloballyPositioned. This modifier executes a lambda after the composable it's attached to has been measured and placed, providing its LayoutCoordinates, which contain size and position.
The semantics tree (from Idea 1) also seemed like the right place to get semantic IDs and other properties. To connect these, I realised I would need a way to associate an element in the semantics tree with its layout information obtained via a modifier. This led to the idea of using a custom modifier that takes a unique ID and uses onGloballyPositioned to capture the layout data for that ID.
I convinced the client to allow custom modifiers for tagging elements, which was a step forward. My approach was to build a custom modifier, let’s call it layoutInfoLogger, that uses onGloballyPositioned to get the coordinates.
data class ComposableLayoutInfo(
    val id: String,
    val size: IntSize,
    val position: Offset
)

// CompositionLocal to pass a collector lambda down the tree
val LocalLayoutInfoCollector = compositionLocalOf<(ComposableLayoutInfo) -> Unit> {
    { /* Default: do nothing */ }
}

// Custom modifier that logs/collects layout info
fun Modifier.layoutInfoLogger(id: String): Modifier = composed {
    // CompositionLocal.current can only be read in a composable scope, so read it here
    // (composed provides one) rather than inside the onGloballyPositioned callback
    val collector = LocalLayoutInfoCollector.current
    onGloballyPositioned { coordinates ->
        val size: IntSize = coordinates.size
        val position: Offset = coordinates.positionInRoot() // Position relative to the root composable
        collector(ComposableLayoutInfo(id, size, position))
    }
}
While this modifier successfully captured the data for the element it was attached to, the challenge lay in collating this information from multiple scattered modifiers across the UI tree and knowing when the collection was complete for all relevant elements. I would need a central data structure to receive the information.
To manage this collection centrally without passing callbacks through every layer, I explored using a CompositionLocal to provide a collector lambda down the tree. A parent composable could host a data structure (like a MutableMap) and provide a lambda via CompositionLocalProvider that the layoutInfoLogger modifier could call.
@Composable
fun CollectLayoutInfo(
    content: @Composable (collectedInfo: SnapshotStateMap<String, ComposableLayoutInfo>) -> Unit
) {
    val collectedInfo = remember { mutableStateMapOf<String, ComposableLayoutInfo>() }
    val collector: (ComposableLayoutInfo) -> Unit = { info ->
        collectedInfo[info.id] = info
    }
    CompositionLocalProvider(LocalLayoutInfoCollector provides collector) {
        content(collectedInfo)
    }
}
This pattern worked for collecting data, but the problem of knowing when all the expected data had been collected remained. I would still need a way for the collector (the CollectLayoutInfo composable) to know the total number or specific IDs of elements it was expecting data from. Relying on arbitrary workarounds like waiting a fixed amount of time was unreliable and fragile. Furthermore, while I could control whether the modifier collected data (e.g., using a CompositionLocal or flag), the core issue of coordinating completion from scattered modifiers persisted.
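To illustrate the bookkeeping this would require, here is a minimal sketch of a collector that is told up front which IDs to expect and signals completion once every one of them has reported. ExpectedIdCollector, report, and missingIds are hypothetical names introduced for this example; this is not the code that shipped in the SDK.

```kotlin
// Hypothetical collector that knows which IDs it expects and fires once all are seen.
class ExpectedIdCollector<T>(
    expectedIds: Set<String>,
    private val onComplete: (Map<String, T>) -> Unit
) {
    private val pending = expectedIds.toMutableSet()
    private val collected = mutableMapOf<String, T>()
    private var completed = false

    // Record (or overwrite) the latest value for an ID; layout callbacks may repeat.
    fun report(id: String, value: T) {
        collected[id] = value
        pending.remove(id)
        if (pending.isEmpty() && !completed) {
            completed = true
            onComplete(collected.toMap())
        }
    }

    // IDs that have not reported yet, useful for diagnosing a collection that never completes.
    val missingIds: Set<String> get() = pending.toSet()
}
```

In this sketch, the collector lambda provided through the CompositionLocal would simply call report(info.id, info). The cost is visible in the constructor: someone has to maintain the set of expected IDs for every screen, which is exactly the integration burden described above.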
Idea 4: Reflection on Compose Internals (Current Implementation)
While working through the other implementations, it became clear that they either required the client to run tests (Idea 1) or add significant boilerplate (Idea 3: applying modifiers everywhere and wrapping sections with a collector composable and defining expected IDs). We wanted to minimise the integration effort for the client.
Before finally settling on Idea 3, I decided to scour the Compose source code to see if I could find a better way, even if it meant using reflection. Going back to Idea 1, the Semantics tree seemed promising, but the issue was accessing it outside of testing. I realised that the ComposeView itself, being a View, likely held a reference to the Compose tree's root structure internally. Could I use reflection to access that from the View side?
Luckily, I found that I could, indeed, get access to the internal SemanticsOwner of an AndroidComposeView via reflection. The SemanticsOwner holds the root SemanticsNode. This meant I could start my traversal from the decorView, walk the View tree, and when I hit an AndroidComposeView, use reflection to jump into the Compose world and walk the semantics tree directly from there.
This approach didn’t require hooking up a separate Compose process (like testing or a custom composition) or requiring the client to add special modifiers everywhere just for tree analysis (though we still needed a modifier for custom IDs). It integrated seamlessly into the existing View traversal logic. This came out as a clear winner in terms of client integration effort.
The downsides to this approach are significant and must be acknowledged:
- Reflection Fragility: Accessing internal fields (semanticsOwner) and potentially internal classes (AndroidComposeView) via reflection is highly unstable. These are not public APIs and can be changed or removed in any Compose update without notice, which will break the reflection code.
- Obfuscation: In release builds that have gone through R8/ProGuard, the internal field and class names will be obfuscated (renamed), causing the reflection to fail. This approach might effectively be limited to debug builds unless complex and risky R8 keep rules are added.
Although the underlying structure of SemanticsOwner holding the root SemanticsNode is fundamental and unlikely to change drastically, the specific field names used to access it via reflection are internal implementation details. We need to be careful and test thoroughly with Compose updates. Despite these risks, for the specific requirements of this SDK, the reduced client effort made this the chosen implementation path.
The Hybrid Approach: Bridging the Gap
Given the limitations of the other approaches, I decided to refine my final idea: start with the standard Android View tree traversal and find a way to “jump” into the Compose tree when an AndroidComposeView is encountered.
During my initial exploration of the View tree, I observed elements whose class names included both “ComposeView” and “AndroidComposeView”. This was a bit confusing at first. I knew ComposeView is the public class we add to our layouts and that it's a ViewGroup. My traversal showed AndroidComposeView often appearing as a child of the ComposeView.
Upon consulting the Compose UI source code for AbstractComposeView (which ComposeView extends), I found the setContent method implementation confirms this relationship. When setContent { ... } is called on a ComposeView, it checks for an existing child AndroidComposeView and, if one doesn't exist, it creates a new AndroidComposeView instance and adds it as a child View to the ComposeView.
This clarified that the ComposeView acts as a ViewGroup container in the traditional View hierarchy, and its primary child (added by the framework when Compose content is set) is the AndroidComposeView. This AndroidComposeView child is the actual internal View implementation that hosts and manages the Compose UI tree, including holding the reference to the SemanticsOwner and other Compose internals I needed to access.
Therefore, to access the Compose tree from the View side, my strategy needed to be:
- Traverse the standard Android View hierarchy.
- When a View is encountered, check if it is the specific AndroidComposeView instance that hosts the Compose content.
- If it is, use reflection on this AndroidComposeView instance to get its internal SemanticsOwner.
Since AndroidComposeView is an internal implementation class and not intended for direct use or casting via public APIs, I had to identify it by matching its class name string.
Here’s how I incorporated this check into the View traversal logic:
private fun analyzeViewElement(view: View, onElementAnalyzed: (JsonObject) -> Unit) {
    if (view is ViewGroup) {
        for (i in 0 until view.childCount) {
            analyzeViewElement(view.getChildAt(i), onElementAnalyzed)
        }
    }
    // Check if the current view is the internal AndroidComposeView host
    if (view::class.java.name == "androidx.compose.ui.platform.AndroidComposeView") {
        analyzeComposeView(view, onElementAnalyzed)
    }
}

// analyzeComposeView function will take the AndroidComposeView instance
// and use reflection to get its SemanticsOwner and traverse the SemanticsNode tree.
private fun analyzeComposeView(androidComposeView: View, onElementAnalyzed: (JsonObject) -> Unit) {
    // ... reflection code to get SemanticsOwner from androidComposeView ...
    // ... then traverse SemanticsNode children ...
}
By checking the class name string, I could identify the specific View instance that serves as the host for the Compose UI and proceed to analyse its internal Compose tree structure using reflection. This hybrid approach allowed me to traverse the unified View and Compose hierarchy from the root decorView, correctly handling the boundary presented by ComposeView and its AndroidComposeView child.
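The control flow of this hybrid walk can be modelled without any Android dependency. In the sketch below, FakeView, FakeComposeHost, and SemNode are hypothetical stand-ins (not Android or Compose classes), and a plain type check takes the place of the reflection step; the aim is only to show where the traversal switches from one tree to the other.

```kotlin
// Hypothetical stand-ins: these are NOT Android or Compose classes.
val composeHostClassName = "androidx.compose.ui.platform.AndroidComposeView"

class SemNode(val label: String, val children: List<SemNode> = emptyList())

open class FakeView(val className: String, val children: List<FakeView> = emptyList())

// Models the internal host view: no View children of interest, only a semantics root.
class FakeComposeHost(val semanticsRoot: SemNode) : FakeView(composeHostClassName)

fun analyzeView(view: FakeView, report: (String) -> Unit) {
    report("view:" + view.className.substringAfterLast('.'))
    view.children.forEach { analyzeView(it, report) }
    // Same string comparison as in the article; the type check stands in for reflection.
    if (view.className == composeHostClassName && view is FakeComposeHost) {
        analyzeSemantics(view.semanticsRoot, report)
    }
}

fun analyzeSemantics(node: SemNode, report: (String) -> Unit) {
    report("compose:" + node.label)
    node.children.forEach { analyzeSemantics(it, report) }
}
```

Feeding this a FrameLayout that contains a TextView and a ComposeView reports the View entries first and then, once the host is reached, the semantics nodes beneath it, so both toolkits end up in a single ordered stream.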
Diving into the Implementation
Following the previous section, where we identified the AndroidComposeView as the bridge into the Compose world, we now dive into the code that performs the actual analysis of the Compose tree. This process involves accessing internal Compose structures via reflection and then recursively traversing the Compose semantics tree.
Our entry point into the Compose analysis is the analyzeComposeView function, which receives the AndroidComposeView instance found during our View tree traversal. This function's primary job is to obtain the root of the Compose tree hosted by this specific View. As discussed, this requires accessing the internal SemanticsOwner.
Since the SemanticsOwner field is internal to AndroidComposeView and not exposed publicly, we must use reflection. This involves getting the AndroidComposeView class, finding the semanticsOwner field by name, and making it accessible.
// Inside analyzeComposeView(view: View, ...)
// 'view' is the AndroidComposeView instance

// Attempt to get the internal 'semanticsOwner' field via reflection
val androidComposeViewClass = Class.forName("androidx.compose.ui.platform.AndroidComposeView")
val semanticsOwnerField = androidComposeViewClass
    .getDeclaredField("semanticsOwner") // Find the private field
    .apply { isAccessible = true } // Bypass access controls (risky!)

// Get the value of the field from the current view instance
val semanticsOwner = semanticsOwnerField.get(view) as? SemanticsOwner

// If successful, start traversing the Semantics tree
if (semanticsOwner != null) {
    analyzeSemanticsNode(semanticsOwner.rootSemanticsNode, view.context, onElementAnalyzed)
}
// ... include try/catch for reflection errors ...
Important Warning: As shown in the snippet, accessing the semanticsOwner field requires reflection and using isAccessible = true. This bypasses normal access controls and relies on the internal name and structure of the AndroidComposeView class, which is not a stable public API. This code is highly fragile and is likely to break with future Compose UI library updates or in release builds with R8/ProGuard obfuscation (unless specific, risky keep rules are applied). We include error handling to prevent crashes, but analysis of that Compose subtree will be skipped if reflection fails.
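The error handling mentioned in the warning could take roughly the following shape. This is a sketch under assumptions, not the SDK's actual code: readFieldOrNull is a hypothetical helper, and Host is a stand-in class with a private field so the example runs on a plain JVM without Android.

```kotlin
// Hypothetical stand-in for a class with a non-public field we do not control.
class Host {
    @Suppress("unused")
    private val secretOwner: String = "owner-instance"
}

// Reads a declared field by name, searching superclasses too.
// Returns null instead of throwing when the field is missing or inaccessible.
fun readFieldOrNull(target: Any, fieldName: String): Any? {
    var cls: Class<*>? = target.javaClass
    while (cls != null) {
        try {
            val field = cls.getDeclaredField(fieldName)
            field.isAccessible = true
            return field.get(target)
        } catch (e: NoSuchFieldException) {
            cls = cls.superclass // not declared here, try the parent class
        } catch (e: ReflectiveOperationException) {
            return null // e.g. IllegalAccessException
        } catch (e: RuntimeException) {
            return null // e.g. SecurityException when access is denied
        }
    }
    return null
}
```

Applied to the snippet above, the call would look like readFieldOrNull(view, "semanticsOwner") as? SemanticsOwner, with a null result meaning that the Compose subtree is skipped rather than crashing the host app. A renamed or removed field then degrades the output instead of throwing.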
If we successfully obtain the SemanticsOwner, we then call the analyzeSemanticsNode function, passing it the SemanticsOwner.rootSemanticsNode. This function is responsible for recursively traversing the Compose semantics tree, similar in concept to how we traversed the View tree, but operating on SemanticsNode objects.
// Function signature for the recursive SemanticsNode traversal
private fun analyzeSemanticsNode(
    semanticsNode: SemanticsNode,
    context: Context, // Needed for density if converting units (though we use pixels)
    onElementAnalyzed: (JsonObject) -> Unit // Callback to report data
) {
    // ... code to extract data from this semanticsNode ...

    // Recursively call for children SemanticsNodes
    semanticsNode.children.forEach { child ->
        analyzeSemanticsNode(child, context, onElementAnalyzed)
    }
}
Inside analyzeSemanticsNode, for each SemanticsNode, we extract the necessary properties. The ID is obtained from our custom ViewTagKey (if the modifier was used), or we fall back to a generated ID. The element's type can often be inferred from SemanticsProperties.Role. We also extract other useful semantic information like Text, Content Description, and whether it's clickable or focusable from the node's config.
// Inside analyzeSemanticsNode, extracting data

// Get window-relative position and size in pixels
val xInWindow = semanticsNode.positionInWindow.x.roundToInt()
val yInWindow = semanticsNode.positionInWindow.y.roundToInt()
val widthPx = semanticsNode.size.width
val heightPx = semanticsNode.size.height

// Create JSON object for this node
onElementAnalyzed(
    JsonObject().apply {
        addProperty("id", semanticsNode.config.getOrNull(ViewTagKey) ?: "no-id-${semanticsNode.hashCode()}")
        addProperty("type", semanticsNode.config.getOrNull(SemanticsProperties.Role)?.toString())
        addProperty("text", semanticsNode.config.getOrNull(SemanticsProperties.Text)?.joinToString(""))
        // ... add other properties like contentDescription ...

        // Add frame data (position and size in pixels)
        add(
            "frame",
            JsonObject().apply {
                addProperty("x", xInWindow)
                addProperty("y", yInWindow)
                addProperty("width", widthPx)
                addProperty("height", heightPx)
            }
        )
    }
)
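The "tagged ID or generated fallback" lookup used for the id property can be shown in isolation. The sketch below uses hypothetical stand-ins (PropKey, NodeConfig, ViewTag, elementId) that only mirror the shape of a typed semantics key and its configuration; they are not the Compose SemanticsPropertyKey or SemanticsConfiguration classes.

```kotlin
// Hypothetical stand-ins that mirror the shape of a typed semantics key and config.
class PropKey<T>(val name: String)

class NodeConfig {
    private val values = mutableMapOf<PropKey<*>, Any?>()

    // Store a value under a typed key, as a tagging modifier would.
    operator fun <T> set(key: PropKey<T>, value: T) {
        values[key] = value
    }

    // Return the stored value, or null when the key was never set on this node.
    @Suppress("UNCHECKED_CAST")
    fun <T> getOrNull(key: PropKey<T>): T? = values[key] as T?
}

// Stand-in for the article's ViewTagKey.
val ViewTag = PropKey<String>("ViewTag")

// Prefer the explicit tag; otherwise derive a placeholder from a per-node number.
fun elementId(config: NodeConfig, nodeNumber: Int): String =
    config.getOrNull(ViewTag) ?: "no-id-$nodeNumber"
```

Untagged nodes still receive an ID, but it is only as stable as the number it is derived from. The snippet above uses hashCode(), which generally differs between runs, so a backend should probably treat "no-id-" values as opaque rather than as identifiers to match across sessions.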
This recursive process, starting from the root semantics node obtained via reflection, allows us to traverse the entire Compose UI tree hosted within the AndroidComposeView, collect data from each SemanticsNode, and report it via the callback.
For the complete implementation of the ViewTreeAnalyzer object, including error handling and the full JSON construction, please refer to the full code listing at the end of this article.
Handling Size and Position Consistently
Now that the traversal logic was in place for both View and Compose trees, a critical step was ensuring consistency in the reported size and position data for each element. This data, which we’re structuring into a “frame” object in our JSON output, needed to be comparable regardless of whether the element was a traditional View or a Compose node.
Since the ultimate goal was to capture a screenshot and send it along with the JSON data to the backend, I decided to use the application window as the consistent point of reference for each element’s position. This ensures that the (x, y) coordinates in the JSON are directly mapped to the pixel locations within the screenshot image of the window.
For traditional XML views, calculating the position relative to the window is straightforward using the View.getLocationInWindow method. This method populates an array with the view's top-left coordinates in window coordinates (pixels).
Here’s how I used View.getLocationInWindow to get the window-relative position for a traditional View:
// inside analyzeViewElement(view: View, ...)

// Calculate window-relative position for traditional Views
val locationInWindow = IntArray(2)
view.getLocationInWindow(locationInWindow) // Get window-relative position in pixels
val xInWindow = locationInWindow[0]
val yInWindow = locationInWindow[1]

// ... continue in analyzeViewElement ...

// Report frame in pixels relative to the application window
add(
    "frame",
    JsonObject().apply {
        addProperty("x", xInWindow) // Use window-relative x (pixels)
        addProperty("y", yInWindow) // Use window-relative y (pixels)
        addProperty("width", view.width)
        addProperty("height", view.height)
    }
)
For Compose nodes, the SemanticsNode object provides a similar property, positionInWindow, which returns the node's top-left offset relative to the window in pixels. The size is available directly from semanticsNode.size:
// Inside analyzeSemanticsNode(semanticsNode: SemanticsNode, ...)

// Use positionInWindow for window-relative coordinates in pixels
val xInWindow = semanticsNode.positionInWindow.x.roundToInt()
val yInWindow = semanticsNode.positionInWindow.y.roundToInt()
val width = semanticsNode.size.width
val height = semanticsNode.size.height

// Report frame in pixels relative to the application window
add(
    "frame",
    JsonObject().apply {
        addProperty("x", xInWindow) // Use window-relative x (pixels)
        addProperty("y", yInWindow) // Use window-relative y (pixels)
        addProperty("width", width)
        addProperty("height", height)
    }
)
By consistently using window-relative coordinates, the “frame” data in the JSON output accurately represents the location and dimensions of each element, as shown in a screenshot of the application window.
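As a small illustration of what "comparable" means here, the sketch below normalises both sources into one frame type. Frame, frameFromView, frameFromCompose, and fitsIn are hypothetical names introduced for this example; the inputs are plain numbers standing in for the values returned by getLocationInWindow and positionInWindow.

```kotlin
import kotlin.math.roundToInt

// Hypothetical frame type shared by both traversal paths; all values are window pixels.
data class Frame(val x: Int, val y: Int, val width: Int, val height: Int)

// View path: getLocationInWindow fills an IntArray(2) with [x, y].
fun frameFromView(locationInWindow: IntArray, width: Int, height: Int): Frame =
    Frame(locationInWindow[0], locationInWindow[1], width, height)

// Compose path: positions arrive as floats, so round them to whole pixels.
fun frameFromCompose(x: Float, y: Float, width: Int, height: Int): Frame =
    Frame(x.roundToInt(), y.roundToInt(), width, height)

// True when the frame lies fully inside a screenshot of the given size.
fun Frame.fitsIn(screenshotWidth: Int, screenshotHeight: Int): Boolean =
    x >= 0 && y >= 0 && x + width <= screenshotWidth && y + height <= screenshotHeight
```

A check like fitsIn is an optional extra rather than part of the analyser described here; it can help flag elements that are scrolled partly off-screen before their frames are overlaid on the screenshot.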
Putting it Together (Code Structure)
Having explored the challenges and the hybrid approach, we’ve combined all the discussed techniques (View traversal, reflection into AndroidComposeView, and SemanticsNode analysis with consistent window-relative positioning) into a single ViewTreeAnalyzer object.
Here is the complete code for the analyser:
Risks, Limitations, and Future Hopes
The major risk of this approach is the use of reflection to access internal APIs. This makes the code inherently fragile. While the SemanticsOwner type itself is public, the field within AndroidComposeView that holds an instance of it is internal. Accessing this specific internal field via reflection is what makes the code fragile, as its name or structure can change at any time without prior notice.
This code is primarily intended for use in debug builds, where no obfuscation will occur. However, if you intend to use it in release builds, you must ensure you add appropriate ProGuard/R8 keep rules for the AndroidComposeView class and its semanticsOwner field to prevent them from being renamed or removed during optimization. Failing to do so will break the reflection functionality.
I do hope that future versions of Jetpack Compose will provide more stable and public APIs for this kind of deep UI introspection. As we transition to a new approach for building UIs, it’s essential that developers retain the ability to perform powerful runtime analysis, similar to what was possible with traditional XML-based applications.
Conclusion
In this article, we’ve navigated the complexities of building a unified UI analyser for Android, capable of inspecting both traditional XML Views and modern Jetpack Compose UIs. We explored various approaches, from leveraging testing APIs to custom modifiers, ultimately settling on a hybrid strategy that combines standard View traversal with targeted reflection to bridge the gap into the Compose tree.
This solution allows us to programmatically extract crucial information like element IDs, types, sizes, and consistent window-relative positions, providing a comprehensive snapshot of your screen’s components regardless of their underlying UI toolkit. While the use of reflection introduces fragility and requires careful maintenance, it offers a powerful way to achieve deep UI introspection when public APIs are not available.
As the Android ecosystem continues its exciting evolution towards Compose, the need for robust tools that understand both paradigms will only grow. I hope that future iterations of Jetpack Compose will provide more stable and public APIs for such advanced UI analysis, empowering developers to build even more sophisticated tools.
Thanks for reading