@whateverneveranywhere Thank you!
Yeah, that staleness gap is exactly where we got burned early too. The planner picks something like “click element 47” and by the time it runs, the page has rerendered and 47 is now a completely different button.
What we do in OpenClick is basically two layers.
Within a batch: every AX action (click, type, etc.) re-resolves its target right before execution using a fresh AX snapshot, not the one the planner saw. We never rely on element indices. Everything is matched using more stable signals like __ax_id, name, or role + frame.
If an action is likely to change state, we force an AX refresh before the next step, since that’s where things usually drift. Pixel coords are only a fallback for things like canvas or WebGL where AX is useless.
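A rough sketch of that re-resolution step, in case it helps. This is not OpenClick's actual code: `AXNode`, `resolve`, and the field names are made up for illustration, and the fallback order (id, then role + name, then role + frame) is just one reasonable reading of the above.

```python
from dataclasses import dataclass
from typing import Optional

Frame = tuple[int, int, int, int]  # x, y, width, height


@dataclass(frozen=True)
class AXNode:
    # Hypothetical shape of one accessibility node in a snapshot.
    ax_id: Optional[str]
    role: str
    name: str
    frame: Frame


def resolve(target: AXNode, fresh: list[AXNode]) -> Optional[AXNode]:
    """Find `target` (as the planner saw it) in a fresh snapshot, ignoring list position."""
    # 1. Stable id, when the app exposes one.
    if target.ax_id is not None:
        for node in fresh:
            if node.ax_id == target.ax_id:
                return node
    # 2. Role + name; break ties by distance from the old frame origin.
    named = [n for n in fresh if n.role == target.role and n.name == target.name]
    if named:
        tx, ty = target.frame[:2]
        return min(named, key=lambda n: abs(n.frame[0] - tx) + abs(n.frame[1] - ty))
    # 3. Role + identical frame, accepted only when unambiguous.
    framed = [n for n in fresh if n.role == target.role and n.frame == target.frame]
    return framed[0] if len(framed) == 1 else None


# The planner saw Submit at index 1; after a rerender index 1 is Cancel.
planned = [
    AXNode("a1", "button", "Cancel", (10, 10, 80, 30)),
    AXNode("a2", "button", "Submit", (100, 10, 80, 30)),
]
fresh = [
    AXNode("a0", "group", "Banner", (0, 0, 400, 40)),
    AXNode("a1", "button", "Cancel", (10, 60, 80, 30)),
    AXNode("a2", "button", "Submit", (100, 60, 80, 30)),
]
hit = resolve(planned[1], fresh)
```

When `resolve` returns `None`, the caller would either fall back to pixel coords or hand control back to the planner, rather than guessing.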
Between batches: we take a fresh screenshot and AX snapshot, then run a verifier model that checks whether what we intended actually happened. If not, or only partially, we replan with the new state plus a short critique of what drifted.
So we don’t really trust plans beyond a single batch, and we keep batches small (usually 3–5 actions) for that reason.
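The outer loop looks roughly like the sketch below. Again, this is illustrative only: `run_batches`, `Verdict`, and the callback signatures are hypothetical names, and the planner, executor, and verifier are passed in as plain callables so the control flow stays visible.

```python
from dataclasses import dataclass
from typing import Callable

State = dict   # stand-in for "screenshot + AX snapshot"
Action = str   # stand-in for one planned action


@dataclass
class Verdict:
    ok: bool
    critique: str = ""  # short note on what drifted; empty when ok


def run_batches(
    goal: str,
    plan: Callable[[str, State, str], list[Action]],
    execute: Callable[[Action], None],
    observe: Callable[[], State],
    verify: Callable[[list[Action], State, State], Verdict],
    batch_size: int = 4,
    max_batches: int = 10,
) -> bool:
    state = observe()
    critique = ""
    for _ in range(max_batches):
        # Plan from the latest state and keep only a small batch of it.
        batch = plan(goal, state, critique)[:batch_size]
        if not batch:  # planner has nothing left to do
            return True
        for action in batch:
            execute(action)
        fresh = observe()  # never reuse the pre-batch state
        verdict = verify(batch, state, fresh)
        # On a miss or partial success, feed the critique into the next plan.
        critique = "" if verdict.ok else verdict.critique
        state = fresh
    return False  # gave up after max_batches
```

The point of truncating to `batch_size` is that anything the planner proposed past the first few actions is thrown away and re-planned against fresh state anyway.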
Honestly, the hardest cases now aren’t AX drift, but apps that expose AX unevenly or lazily. Gmail is a classic. Message rows can be weird, so we sometimes force an AX refresh right before clicking them. Otherwise you get cases where a coord click “works” but the row never actually activates.
Curious to hear what approach you ended up taking here.



