CLI Use
When an agent runs sootsim do tap-id loginButton, you see a visible cursor
animate to the element, the touch fires, and the canvas updates. No
screenshot OCR, no XCUITest, no visual reasoning — same surface as a human,
just faster.
The realtime cursor view
Every do command renders a visible agent cursor directly into the canvas.
The cursor walks to the target node, the press visually depresses, and the
screen updates in place.
What this gets you:
- Watch a remote or CI agent work in real time on a single sim window. Useful for “why didn’t this tap take?” without re-running.
- Recordings (sootsim record) capture the cursor too, so flow replays look like a person used the app.
- One canonical interaction surface: the same path a human’s pointer takes, the same hit testing, the same gesture pipeline.
Per-session claim leases
Multiple agents can attach to the bridge at once without stepping on each other:
- Reads (describe, tree, screenshot, get *) always pass through.
- Writes (do tap, do type, close) are gated by a per-session lease.
- Leases expire after 10 minutes of inactivity.
- Same-session sockets coexist — one agent spawning parallel CLI calls works without self-disconnects.
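The lease rules above can be modeled in a few lines. This is only an illustrative sketch, not sootsim's implementation: the `LeaseTable` class is hypothetical, it assumes that only writes refresh the inactivity timer, and it assumes a refused write simply returns `False`.

```python
import time

LEASE_TTL_S = 10 * 60  # leases expire after 10 minutes of inactivity
READS = {"describe", "tree", "screenshot", "get"}  # read verbs named in the list above


class LeaseTable:
    """Toy model of a per-session write lease (hypothetical, for illustration)."""

    def __init__(self, now=time.monotonic):
        self._now = now          # injectable clock so expiry can be exercised in tests
        self._holder = None      # session id that currently holds the write lease
        self._last_write = 0.0   # time of the holder's most recent write

    def allow(self, session_id: str, command: str) -> bool:
        verb = command.split()[0]
        if verb in READS:
            return True  # reads always pass through
        t = self._now()
        expired = self._holder is not None and t - self._last_write >= LEASE_TTL_S
        if self._holder is None or expired or self._holder == session_id:
            self._holder = session_id  # claim or refresh the lease
            self._last_write = t
            return True
        return False  # another session holds a live lease


clock = [0.0]
table = LeaseTable(now=lambda: clock[0])
table.allow("agent-a", "do tap")   # agent-a claims the lease
table.allow("agent-b", "do tap")   # refused while agent-a's lease is live
table.allow("agent-b", "tree")     # reads are never gated
```

Because the lease is keyed on the session rather than the socket, parallel calls that share a session ID all pass the `self._holder == session_id` check.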
Two Claude Codes in two terminals, two Cursors in two IDEs, a CI test alongside a human-driven session — all coexist without fighting over the bridge.
Agent-aware auto-settle
Every write polls layout stability before returning, so the agent sees post-animation state instead of mid-transition state.
| Caller | Settle budget | Why |
|---|---|---|
| Agent | 1200 ms | Gives animations room to finish |
| Human | 350 ms | Keeps interactive use snappy |
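A stability poll with a caller-specific budget might look roughly like the sketch below. The budgets come from the table; everything else is an assumption made for illustration, including the function name `wait_for_settle`, the 50 ms poll interval, and the rule that two identical consecutive snapshots mean "settled".

```python
import time
from typing import Callable, Hashable

SETTLE_BUDGET_MS = {"agent": 1200, "human": 350}  # budgets from the table above


def wait_for_settle(
    snapshot: Callable[[], Hashable],
    caller: str = "agent",
    poll_ms: int = 50,  # assumed poll interval, not documented
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once two consecutive snapshots match, False if the budget runs out."""
    deadline = clock() + SETTLE_BUDGET_MS[caller] / 1000
    previous = snapshot()
    while clock() < deadline:
        sleep(poll_ms / 1000)
        current = snapshot()
        if current == previous:
            return True  # layout stopped changing between polls
        previous = current
    return False  # budget exhausted while the layout was still moving
```

The `clock` and `sleep` parameters exist only so the loop can be driven by a fake clock in tests; in normal use the defaults apply.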
Sootsim auto-detects agent environments via CLAUDECODE, CURSOR_TRACE_ID,
and a few other env vars. If your tool isn’t detected automatically, set
SOOTSIM_AGENT=1.
No sleep chaining, no do settle “just to be safe” — every write already
participates.
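As a rough sketch of the detection described in this section, the check could be written as follows. It covers only the variables named here (the other env vars are not listed, so they are left out), and it assumes that any non-empty value of CLAUDECODE or CURSOR_TRACE_ID counts as a match. The function names are hypothetical.

```python
import os
from typing import Mapping

# Only the markers named in the text; the remaining ones are not documented here.
AGENT_MARKERS = ("CLAUDECODE", "CURSOR_TRACE_ID")


def is_agent(env: Mapping[str, str] = os.environ) -> bool:
    """Guess whether the caller is an agent (assumes any non-empty marker counts)."""
    if env.get("SOOTSIM_AGENT") == "1":
        return True  # explicit override
    return any(env.get(name) for name in AGENT_MARKERS)


def settle_budget_ms(env: Mapping[str, str] = os.environ) -> int:
    """Pick the settle budget: 1200 ms for agents, 350 ms for humans."""
    return 1200 if is_agent(env) else 350
```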
The accessibility DOM mirror
Canvas nodes are mirrored into a hidden DOM tree as real semantic elements —
<button>, <a>, <input>, <textarea>, <h2>, <img> — with full ARIA
(role, label, state, value, hint), data-testid, and text content.
Any MCP-aware browser tool (Claude Code, Cursor, Codex, Chrome MCP) treats the sim like a website — clicks the same elements, reads the same labels, types into the same inputs.
Two modes:
- Shallow: flat DOM, one element per hit-testable node. Default.
- Deep: nested interactive proxies. Scrollable <div>, real <input>/<textarea>, <button> that forwards to canvas.
Throttle defaults to 100 ms; calling window.__sootsimA11y.active() drops it to 30 ms
for high-frequency agent loops.
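To make the shallow mode concrete, here is a small sketch of how a list of canvas nodes could be rendered as a flat run of semantic elements. The node dictionary shape (`tag`, `role`, `label`, `testid`, `text`) is invented for this example and is not sootsim's real schema; only a subset of the ARIA fields is shown.

```python
from html import escape

VOID_TAGS = {"img", "input"}  # HTML elements that take no closing tag


def mirror_node(node: dict) -> str:
    """Render one hit-testable node (hypothetical shape) as a semantic HTML element."""
    tag = node["tag"]
    attrs = {
        "role": node.get("role"),
        "aria-label": node.get("label"),
        "data-testid": node.get("testid"),
    }
    rendered = "".join(
        f' {name}="{escape(value, quote=True)}"'
        for name, value in attrs.items()
        if value
    )
    if tag in VOID_TAGS:
        return f"<{tag}{rendered}>"
    return f"<{tag}{rendered}>{escape(node.get('text', ''))}</{tag}>"


def mirror_shallow(nodes: list) -> str:
    """Shallow mode: one element per node, emitted flat with no nesting."""
    return "\n".join(mirror_node(n) for n in nodes)
```

A browser tool that queries by `data-testid` or accessible name would then find the same node an agent reaches with a test ID on the CLI.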
describe, find, do — the daily loop
Companion reads:
- sootsim tree: compact tree
- sootsim find --testid loginButton: locate a node by test ID
- sootsim get errors 5: recent console errors
- sootsim get requests 5: recent network calls
- sootsim debug state: shell state, keyboard state, scroll node, hit-test, gesture state
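A script can drive the describe, find, do loop by shelling out to the CLI. The sketch below uses only commands that appear on this page; it assumes that `sootsim describe` runs without extra arguments, and it makes no assumptions about the output format, so it just lets each command print. The helper names are hypothetical.

```python
import subprocess


def sootsim_argv(*args: str) -> list:
    """Build an argv list for the sootsim CLI."""
    return ["sootsim", *args]


def tap_by_test_id(test_id: str = "loginButton", execute: bool = False) -> list:
    """Describe, find, do. Returns the commands; runs them only when execute=True."""
    steps = [
        sootsim_argv("describe"),                   # read the current screen
        sootsim_argv("find", "--testid", test_id),  # locate the node by test ID
        sootsim_argv("do", "tap-id", test_id),      # tap it
    ]
    if execute:
        for step in steps:
            # check=True stops the loop at the first failing command
            subprocess.run(step, check=True)
    return steps
```

No sleeps are needed between the steps, since each write already waits for the layout to settle before returning.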
Env vars
| Variable | Effect |
|---|---|
| SOOTSIM_AGENT=1 | Force agent mode (longer auto-settle, verbose JSON where applicable) |
| SOOTSIM_SESSION_ID=<id> | Stable session key so same-session sockets coexist |
| SOOTSIM_UPLOAD_ORIGIN=<url> | Override the upload host for sootsim upload and flow --preview |
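When a tool spawns the CLI itself, these variables can be set on the child process environment. The helper below is a minimal sketch; `agent_env` and the session ID value are made up for the example.

```python
import os
from typing import Mapping, Optional


def agent_env(
    session_id: str,
    upload_origin: Optional[str] = None,
    base: Optional[Mapping[str, str]] = None,
) -> dict:
    """Copy an environment and add the sootsim variables from the table above."""
    env = dict(os.environ if base is None else base)
    env["SOOTSIM_AGENT"] = "1"              # force agent mode
    env["SOOTSIM_SESSION_ID"] = session_id  # shared key for parallel CLI calls
    if upload_origin is not None:
        env["SOOTSIM_UPLOAD_ORIGIN"] = upload_origin
    return env


# Usage (needs the CLI installed), with a hypothetical session ID:
#   import subprocess
#   subprocess.run(["sootsim", "tree"], env=agent_env("ci-run-42"), check=True)
```

Passing the same `session_id` to every child process is what lets parallel calls from one agent share a lease.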