CLI Use

When an agent runs sootsim do tap-id loginButton, a visible cursor animates to the element, the touch fires, and the canvas updates. There is no screenshot OCR, no XCUITest, and no visual reasoning: the agent uses the same surface as a human, just faster.

The realtime cursor view

Every do command renders a visible agent cursor directly into the canvas. The cursor walks to the target node, the press visually depresses, and the screen updates in place.

terminal

sootsim do tap-text "Sign in"
sootsim do tap-id createThread
sootsim do scroll feed-list 0 500
sootsim do type "hello world"

What this gets you:

Watch a remote or CI agent work in real time on a single sim window. Useful for answering “why didn’t this tap take?” without re-running the flow.
Recordings (sootsim record) capture the cursor too, so flow replays look like a person used the app.
One canonical interaction surface — the same path a human’s pointer takes, the same hit testing, the same gesture pipeline.

Per-session claim leases

Multiple agents can attach to the bridge at once without stepping on each other:

Reads (describe, tree, screenshot, get *) always pass through.
Writes (do tap, do type, close) are gated by a per-session lease.
Leases expire after 10 minutes of inactivity.
Same-session sockets coexist — one agent spawning parallel CLI calls works without self-disconnects.

terminal

sootsim list                    # see connected tabs and their session ids
sootsim claim tab-2             # take the write lease on tab-2
sootsim claim tab-2 --force     # boot the incumbent if locked

Two Claude Codes in two terminals, two Cursors in two IDEs, a CI test alongside a human-driven session — all coexist without fighting over the bridge.
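The lease rules above can be modelled in a few lines. This is a minimal sketch, not sootsim's implementation: the LeaseTable class and its method names are hypothetical, and it assumes a lease is taken only by an explicit claim and that each permitted write resets the inactivity timer.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

LEASE_TTL_SECONDS = 10 * 60  # leases expire after 10 minutes of inactivity


@dataclass
class LeaseTable:
    """Hypothetical in-memory model of per-tab write leases."""

    clock: Callable[[], float] = time.monotonic
    # tab id -> (session id holding the lease, time of last activity)
    leases: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def holder(self, tab: str) -> Optional[str]:
        """Return the session holding an unexpired lease on tab, if any."""
        entry = self.leases.get(tab)
        if entry is None:
            return None
        session, last_active = entry
        if self.clock() - last_active >= LEASE_TTL_SECONDS:
            del self.leases[tab]  # lease lapsed through inactivity
            return None
        return session

    def claim(self, tab: str, session: str, force: bool = False) -> bool:
        """Take the lease: succeeds if free, already ours, or forced."""
        current = self.holder(tab)
        if current not in (None, session) and not force:
            return False
        self.leases[tab] = (session, self.clock())
        return True

    def may_read(self, tab: str, session: str) -> bool:
        """Reads always pass through, whoever holds the lease."""
        return True

    def may_write(self, tab: str, session: str) -> bool:
        """Writes need the lease; a permitted write refreshes the timer."""
        if self.holder(tab) != session:
            return False
        self.leases[tab] = (session, self.clock())
        return True
```

Because the lease is keyed by session id rather than by socket, several sockets that share one session id all pass may_write, which is the same-session coexistence described above.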

Agent-aware auto-settle

Every write polls for layout stability before returning, so the agent sees the post-animation state rather than a mid-transition one.

Caller	Settle budget	Why
Agent	1200 ms	Gives animations room to finish
Human	350 ms	Keeps interactive use snappy
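One way to read the budgets in the table is as the deadline of a polling loop. The sketch below is an illustration under assumptions, not sootsim's code: the wait_for_settle function, the 50 ms polling interval, and the "two identical snapshots in a row" stability rule are all hypothetical. Only the 1200 ms and 350 ms budgets come from the table.

```python
import time
from typing import Callable, Hashable

SETTLE_BUDGET_MS = {"agent": 1200, "human": 350}  # budgets from the table


def wait_for_settle(
    snapshot: Callable[[], Hashable],
    caller: str = "human",
    poll_ms: int = 50,       # assumed polling interval
    stable_polls: int = 2,   # assumed: consecutive unchanged polls required
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the layout is stable, False if the budget runs out."""
    deadline = clock() + SETTLE_BUDGET_MS[caller] / 1000
    previous = snapshot()
    unchanged = 0
    while clock() < deadline:
        sleep(poll_ms / 1000)
        current = snapshot()
        # Count consecutive polls where the layout did not change.
        unchanged = unchanged + 1 if current == previous else 0
        if unchanged >= stable_polls:
            return True
        previous = current
    return False
```

The clock and sleep parameters exist only so the loop can be exercised with a fake clock. A real caller would pass a snapshot function that returns something comparable, such as a hash of the layout tree.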

Sootsim auto-detects agent environments via CLAUDECODE, CURSOR_TRACE_ID, and a few other env vars. If your tool isn’t detected automatically, set SOOTSIM_AGENT=1.
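The detection rule can be sketched as a small predicate over the environment. This is a guess at the shape of the check, not the actual logic: the function names are hypothetical, only the variables named in the text are included, and treating an empty value as "unset" is an assumption.

```python
import os
from typing import Mapping, Optional

# Only the variables the text names. The "few other env vars" are left out
# because the source does not list them.
AGENT_ENV_VARS = ("CLAUDECODE", "CURSOR_TRACE_ID")


def is_agent(env: Optional[Mapping[str, str]] = None) -> bool:
    """True if the environment looks like an agent session."""
    env = os.environ if env is None else env
    if env.get("SOOTSIM_AGENT") == "1":  # explicit override from the text
        return True
    # Assumption: any non-empty value counts as "set".
    return any(env.get(name) for name in AGENT_ENV_VARS)


def caller_kind(env: Optional[Mapping[str, str]] = None) -> str:
    """Map the environment to the caller column of the settle table."""
    return "agent" if is_agent(env) else "human"
```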

No sleep chaining, no do settle “just to be safe” — every write already participates.

The accessibility DOM mirror

Canvas nodes are mirrored into a hidden DOM tree as real semantic elements — <button>, <a>, <input>, <textarea>, <h2>, <img> — with full ARIA (role, label, state, value, hint), data-testid, and text content.

Any MCP-aware browser tool (Claude Code, Cursor, Codex, Chrome MCP) treats the sim like a website — clicks the same elements, reads the same labels, types into the same inputs.

Two modes:

Shallow — flat DOM, one element per hit-testable node. Default.
Deep — nested interactive proxies. Scrollable <div>, real <input>/<textarea>, <button> that forwards to canvas.
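To make the shallow mode concrete, here is a rough sketch of flattening a node tree into one element per hit-testable node. Everything in it is illustrative: the CanvasNode type, its field names, and the role-to-tag mapping are assumptions made for the example, not sootsim's data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CanvasNode:
    """Hypothetical canvas node; field names are illustrative only."""

    role: str
    label: str = ""
    test_id: Optional[str] = None
    hit_testable: bool = False
    children: List["CanvasNode"] = field(default_factory=list)


# Assumed mapping from node role to the semantic tag used in the mirror.
ROLE_TO_TAG = {
    "button": "button",
    "link": "a",
    "textfield": "input",
    "textarea": "textarea",
    "header": "h2",
    "image": "img",
}


def shallow_mirror(root: CanvasNode) -> List[Dict[str, str]]:
    """Flatten the tree depth-first: one element per hit-testable node."""
    elements: List[Dict[str, str]] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.hit_testable:
            element = {
                "tag": ROLE_TO_TAG.get(node.role, "div"),
                "role": node.role,
                "aria-label": node.label,
            }
            if node.test_id:
                element["data-testid"] = node.test_id
            elements.append(element)
        # Push children reversed so they are visited in document order.
        stack.extend(reversed(node.children))
    return elements
```

A deep mirror would keep the nesting instead of flattening it, so that scroll containers and inputs become real nested proxies.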

Mirror updates are throttled to 100 ms by default; calling window.__sootsimA11y.active() lowers the interval to 30 ms for high-frequency agent loops.
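The effect of the two intervals can be illustrated with a simple time-based throttle. This is a conceptual model only, written in Python for consistency with the other sketches: the MirrorThrottle class is hypothetical and says nothing about how the browser-side mirror is actually scheduled.

```python
import time
from typing import Callable, Optional

DEFAULT_INTERVAL_MS = 100  # default throttle from the text
ACTIVE_INTERVAL_MS = 30    # interval after active() is called


class MirrorThrottle:
    """Hypothetical throttle: allow at most one sync per interval."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_ms = DEFAULT_INTERVAL_MS
        self._clock = clock
        self._last_sync: Optional[float] = None

    def active(self) -> None:
        """Switch to the shorter interval for high-frequency loops."""
        self.interval_ms = ACTIVE_INTERVAL_MS

    def should_sync(self) -> bool:
        """True if enough time has passed since the last allowed sync."""
        now = self._clock()
        if self._last_sync is not None:
            elapsed_ms = (now - self._last_sync) * 1000
            if elapsed_ms < self.interval_ms:
                return False
        self._last_sync = now
        return True
```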

describe, find, do — the daily loop

terminal

# 1. ground the agent in current state
sootsim describe --verbose

# 2. locate the target
sootsim find --testid createThread

# 3. act
sootsim do tap-id createThread

# 4. verify
sootsim describe --verbose
sootsim get errors 5

# 5. record the flow for a PR preview
sootsim record --mode combined --duration 8 --open
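Steps 1 to 4 of the loop can also be driven from a script. The sketch below uses only the commands shown above. The helper names are hypothetical, and it assumes that sootsim exits non-zero when a command fails and that describe prints a textual description to stdout; neither is confirmed here.

```python
import subprocess
from typing import Callable, Dict, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(args: Sequence[str]) -> str:
    """Run one sootsim command and return its stdout.

    Raises CalledProcessError on a non-zero exit (assumed failure signal).
    """
    result = subprocess.run(
        ["sootsim", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def tap_and_verify(test_id: str, run: Runner = run_cli) -> Dict[str, object]:
    """Describe, find, tap, then describe again and collect recent errors."""
    before = run(["describe", "--verbose"])  # 1. ground
    run(["find", "--testid", test_id])       # 2. locate
    run(["do", "tap-id", test_id])           # 3. act
    after = run(["describe", "--verbose"])   # 4. verify
    errors = run(["get", "errors", "5"])
    return {"changed": before != after, "errors": errors}
```

The run parameter lets the flow be dry-run with a stub in place of the real CLI. Calling tap_and_verify("createThread") with the default runner needs sootsim on the PATH and a connected sim.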

Companion reads:

sootsim tree — compact tree
sootsim find --testid loginButton — locate a node by test ID
sootsim get errors 5 — recent console errors
sootsim get requests 5 — recent network calls
sootsim debug state — shell state, keyboard state, scroll node, hit-test, gesture state

Env vars

Variable	Effect
`SOOTSIM_AGENT=1`	Force agent mode (longer auto-settle, verbose JSON where applicable)
`SOOTSIM_SESSION_ID=<id>`	Stable session key so same-session sockets coexist
`SOOTSIM_UPLOAD_ORIGIN=<url>`	Override the upload host for `sootsim upload` and `flow --preview`
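When a tool spawns sootsim itself, these variables can be set on the child process environment instead of the shell. This is a minimal sketch: the agent_env helper is hypothetical, and using a random hex string as the session id is an assumption, since the text does not specify an id format.

```python
import os
import uuid
from typing import Dict, Mapping, Optional


def agent_env(
    session_id: Optional[str] = None,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy the environment and add the agent-mode variables."""
    env = dict(os.environ if base is None else base)
    env["SOOTSIM_AGENT"] = "1"
    # Assumption: any stable string works as a session key.
    env["SOOTSIM_SESSION_ID"] = session_id or uuid.uuid4().hex
    return env
```

Building the environment once and passing it as env= to every subprocess.run call gives all parallel CLI calls the same session id, which is what lets same-session sockets coexist.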
