See the webthe way humans do.
A perception layer that gives your agent human-level web understanding. DOM, accessibility, vision, and time — fused into a single scene graph. Ships as an MCP server.
Agents can read HTML.
They can’t see the page.
Modern web apps hide meaning in CSS, layout, animation, and canvas rendering. An agent with only DOM access is blind to what a human instantly understands — which button is primary, what state a form is in, whether the spinner means loading or frozen.
This is why agents that work great in demos fall apart on real websites.
Without UIPE
dom onlyWith UIPE
dom + a11y + vision + timethese trees are the actual output shape. not a mockup.
A unified scene graph
your agent can actually reason about.
Human-level understanding.
Not just HTML. The page the way a human sees it — visual hierarchy, state, motion, meaning. Every node carries role, affordance, salience, and confirmed text.
MCP-native.
One line in your agent's config. Works with Claude Code, Cursor, Zed — anything that speaks MCP. No adapters, no wrappers, no bespoke plumbing.
Temporal awareness.
Knows what just happened. Did the click work? Did the page load? Did that spinner stop? Frame-by-frame diffs turn guesswork into ground truth.
exposed as 12 mcp tools · drop-in for claude code, cursor, zed.
Four signals,
fused into one scene graph.
UIPE captures what a browser renders and what a user experiences — then merges them. The result is a single representation your agent can reason about in one pass.
Structural markup — tags, attributes, text nodes, containment.
Semantic roles and states derived from ARIA and heuristics.
Pixel truth — a screenshot parsed by a local vision model.
Frame-to-frame deltas that reveal what just changed.
no single signal is enough. the fusion is the product.
Pay what a coffee costs.
Get a month of agent evaluations.
Transparent per-session pricing. No seats, no minimums, no retainer. Spin up an agent, use what you need, stop when you're done.
Hack, explore, prototype. Same engine, quota-metered.
- 10 Scan sessions / day
- 2 Deep sessions / day
- GitHub account, 30 days old
- Community support
Structural + a11y + fast vision. The everyday workhorse.
- DOM + a11y + OmniParser
- Typical agent eval loop
- Sub-second for most pages
- Pay-as-you-go via x402
Scan plus frontier vision and temporal analysis.
- Everything in Scan
- Frontier vision pass
- Frame-by-frame temporal diff
- Pay-as-you-go via x402
paid tiers settled via x402 · usdc on base
Free tier requires a GitHub account that’s at least 30 days old. Soft limits; we’ll contact you before anything hard stops.
Drop it into your agent.
One config block, no SDK.
UIPE ships as a remote MCP server. If your agent speaks the Model Context Protocol, it speaks UIPE.
- Claude Code
- Cursor
- Zed
- Windsurf
{
"mcpServers": {
"uipe": {
"url": "https://mcpaasta.uipe.dev/mcp",
"transport": "sse"
}
}
}coming soon · join the waitlist for early access