A runtime for agent-authored apps

You’re chasing a flaky deploy. You ask your agent: “make me a live status page — polls our /deploys endpoint every ten seconds, shows the last fifty, highlights failures in red. And while you’re at it, a webhook our CI can POST to that pages oncall when a deploy fails.” The agent writes both. Nice little dashboard rendered as an inline artifact, clean webhook handler in the next message. “Great,” you say, “give me the URLs so I can leave the page open on my other monitor and point CI at the webhook.” The agent does not have URLs. The dashboard lives inside the conversation. The webhook handler is a function that nothing can call.

What you needed was for both to exist on the web — the page openable in any browser, the webhook reachable by your CI, both running without a human babysitting. To get there you’d pick a host, deploy the page, deploy the handler behind a public URL, configure CORS, ship secrets, set up monitoring. The agent built the things in five minutes. Shipping them is the rest of the afternoon.

The same gap shows up everywhere a real thing needs to exist outside the chat:

These are all addressable HTTP endpoints. The agent can write the code. It can’t host it, can’t subscribe an external system to it, can’t hand you a URL. Scheduled-prompt features in agent hosts — Claude routines, similar things elsewhere — solve the “fire something on a timer” half of the problem and only that half. None of them give you a thing on the web.

I built a small daemon called cue to be that missing primitive.

What’s wrong with the alternatives

What’s missing is a thin primitive that an agent can author into directly: “here is a piece of code; run it when this thing happens; give me back a URL.” That’s what cue is.

What cue does

cue is a single daemon (cue serve) plus an MCP server. Any agent that speaks MCP — Claude Code, Cursor, VS Code Copilot, a chatbot backend wired to the MCP SDK — gets two new tool families:

That’s almost the whole surface area. Around it there are namespaces (first-class isolation boundaries with active | paused | archived lifecycle), secrets (resolved at run time, never logged), artifacts (the agent can host a static HTML/JS file alongside the action that uses it), and a per-invocation trace stored on disk. No UI layer. No catalog of pre-built integrations. No GUI for humans to click through. The whole thing is roughly one filesystem-backed store + an HTTP listener + a cron loop.
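A minimal sketch of two of those pieces, the namespace lifecycle and run-time secret resolution, may make them more concrete. Everything here is illustrative: the class and function names are invented, and the rule that only an active namespace fires triggers is an assumption about what paused and archived mean, not something taken from cue's code.

```python
from dataclasses import dataclass, field
from enum import Enum


class NamespaceState(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    ARCHIVED = "archived"


@dataclass
class Namespace:
    name: str
    state: NamespaceState = NamespaceState.ACTIVE
    # Secret values live with the namespace; actions refer to them by name only.
    secrets: dict[str, str] = field(default_factory=dict)

    def can_fire(self) -> bool:
        # Assumption: only an active namespace lets its triggers fire.
        return self.state is NamespaceState.ACTIVE


def resolve_secrets(ns: Namespace, wanted: list[str]) -> dict[str, str]:
    """Look up secret values at invocation time, just before a run starts."""
    missing = [name for name in wanted if name not in ns.secrets]
    if missing:
        raise KeyError(f"unknown secrets: {missing}")
    return {name: ns.secrets[name] for name in wanted}


def redact(text: str, secrets: dict[str, str]) -> str:
    """Mask any secret value found in output before that output is stored."""
    for name, value in secrets.items():
        if value:
            text = text.replace(value, f"[secret:{name}]")
    return text
```

Resolving by name at call time means the stored action definition never contains the secret value, and redacting on the way into the trace is one simple way to keep it out of logs.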

[Figure: the agent authors a persistent cue action via MCP. Triggers: cron schedule, webhook POST, direct invoke URL, another action or agent. On fire, cue invokes a fresh sandbox (unitask, a per-call unikernel). The run record and output go to the caller, a browser, or another action.]
The agent authors an action over MCP. Triggers (cron, webhook, direct URL, or another action) fire it. Each invocation runs in a fresh unitask unikernel under the action's declared policy. The trace persists; the VM does not.
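The flow described here can be modeled in a few lines. This is a toy in-memory model, not cue's implementation: `Runtime`, `Action`, and `RunRecord` are hypothetical names, the action's code is a plain Python callable standing in for a sandboxed run, and the trace is a list rather than files on disk.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunRecord:
    run_id: int
    trigger: str
    output: str


@dataclass
class Action:
    name: str
    # Stand-in for the action's code; in cue this would run inside unitask.
    code: Callable[[dict], str]
    triggers: set[str] = field(default_factory=set)


class Runtime:
    def __init__(self) -> None:
        self.actions: dict[str, Action] = {}   # persistent: authored once
        self.trace: list[RunRecord] = []       # persistent: one record per run
        self._ids = itertools.count(1)

    def author(self, action: Action) -> None:
        self.actions[action.name] = action

    def fire(self, name: str, trigger: str, payload: dict) -> RunRecord:
        action = self.actions[name]
        if trigger not in action.triggers:
            raise PermissionError(f"{name} has no {trigger} trigger")
        # Each call gets its own copy of the payload; only the record outlives it.
        output = action.code(dict(payload))
        record = RunRecord(next(self._ids), trigger, output)
        self.trace.append(record)
        return record
```

The two long-lived things are the action definition and the trace. Everything between `fire` being called and the record being appended is per-invocation.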

The point of the minimalism is that cue isn’t trying to be a workflow product. It’s the runtime layer underneath whatever app the agent decides to build for you.

The ephemeral / persistent split

cue depends on unitask — the disposable-unikernel runtime — as a binary on PATH. Every action invocation shells out to unitask run under that action’s declared policy. There’s no library coupling between the two projects; they communicate by CLI.
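Coupling by CLI rather than by library is an ordinary subprocess pattern. The sketch below shows the general shape in Python. The arguments that `unitask run` actually takes are not listed here, so the example runs a stand-in child process (the Python interpreter) instead of guessing at them. `run_cli` and `CliResult` are hypothetical names.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CliResult:
    exit_code: int
    stdout: str
    stderr: str


def run_cli(argv: list[str], stdin_text: str = "", timeout_s: float = 30.0) -> CliResult:
    """Run one child process per invocation and capture everything it emits."""
    proc = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired on overrun
        check=False,        # a non-zero exit is data for the trace, not an exception
    )
    return CliResult(proc.returncode, proc.stdout, proc.stderr)


# Stand-in child process. A real caller would pass the sandbox CLI's argv here.
result = run_cli(
    [sys.executable, "-c", "print(input().upper())"],
    stdin_text="deploy failed\n",
)
```

The only contract between the two programs is argv, stdin, stdout, stderr, and the exit code, which is what lets either side be replaced or upgraded independently.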

The split is the architecture:

Keeping them separate has a couple of useful effects. unitask can stay narrow and reusable — anyone can use it for ad-hoc sandboxed execution, agent platform or not. cue gets to focus on what’s unique to it: the persistence and triggering layer. And the security boundary is clean: cue holds metadata and orchestrates calls, but the actual untrusted code only ever runs inside a fresh VM that unitask brought up for that one invocation. Compromising the daemon doesn’t compromise prior runs (their VMs are gone), and compromising one invocation doesn’t reach the daemon (it can only do what its policy permits).

BYO UI

cue deliberately has no UI layer. That sounds like an omission until you notice that the agent already has one — whatever surface it’s talking to the user on.

The “app” the agent built is the action plus whatever rendering surface fits the conversation. cue doesn’t try to standardize that surface — every standardization attempt would be wrong for half the cases. If an agent wants a real HTML dashboard, it can also use cue’s artifacts API to host static HTML/JS that polls the action’s URL, on the same origin. The agent ships both halves; cue serves them.
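One way to read "the action plus whatever rendering surface" is that the action returns plain data and leaves presentation to whoever calls it. As a sketch, here is what the logic for the deploy status example might look like. The `Deploy` record shape and the `"failed"` status value are assumptions for illustration; the real /deploys payload is not shown anywhere in this post.

```python
from typing import TypedDict


class Deploy(TypedDict):
    id: str
    status: str        # hypothetical values: "success" or "failed"
    finished_at: str   # assumed: UTC ISO 8601 in one fixed format, so it sorts as text


def status_rows(deploys: list[Deploy], limit: int = 50) -> list[dict]:
    """Return the newest `limit` deploys, newest first, each with a failure flag."""
    newest_first = sorted(deploys, key=lambda d: d["finished_at"], reverse=True)
    return [
        {"id": d["id"], "finished_at": d["finished_at"], "failed": d["status"] == "failed"}
        for d in newest_first[:limit]
    ]
```

A chat surface can render these rows as a table, and a hosted HTML artifact can fetch the same rows and color the failed ones red. The action does not need to know which.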

What this looks like in practice

The interaction collapses to one prompt:

You: Build me that deploy status page from earlier: polls our /deploys endpoint every ten seconds, shows the last fifty, highlights failures in red. And the webhook our CI can POST to that pages oncall when a deploy fails.

The agent picks the right tools, writes the action and the dashboard, calls into cue however many times it needs to, and hands back something like:

Dashboard:  https://cue.your-team.example.com/u/deploys/index.html
Webhook:    https://cue.your-team.example.com/w/trg_01KPY3GZQX...
Bearer:     cue_tk_8f3c9a...

You open the dashboard URL on your other monitor. You paste the webhook URL into your CI config. Both are real things on the web. Every invocation runs in a fresh unitask unikernel under the action’s declared policy; nothing executes on your host; cue inspect shows the full trace of any past run.
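On the CI side, pointing a job at the webhook could be as small as the sketch below. It assumes the webhook accepts a JSON body and that the bearer token goes in a standard `Authorization: Bearer` header. The payload fields (`deploy_id`, `status`) are invented for illustration, and the URL and token are passed in rather than written out.

```python
import json
import urllib.request


def build_failure_post(webhook_url: str, token: str, deploy_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST a CI job would make when a deploy fails."""
    # Hypothetical payload shape; the action decides what it actually expects.
    body = json.dumps({"deploy_id": deploy_id, "status": "failed"}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def send(req: urllib.request.Request, timeout_s: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.status
```

Building the request separately from sending it keeps the CI step easy to test without a network call.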

The point isn’t that there are four MCP tools and the agent calls them in a particular order. The point is that the agent treats cue the same way it treats read_file or run_command — just another tool — and the result is a durable, addressable, sandboxed mini-app you didn’t have to deploy.

What this is for

The most obvious use case is the one above — a developer or a power user gets a personal mini-app out of a conversation. That’s worth having.

The more interesting use case, and the one that drove me to build this, is agent-extended SaaS. A vertical product gives its customers an agent that can describe a feature in natural language and ship it as a scoped extension. The agent authors a cue action against the product’s API; cue runs it sandboxed; the customer’s “feature request” becomes a webhook URL or scheduled job inside their account. Each customer’s extensions are isolated by namespace and policy. No long-tail roadmap pressure on the product team; the platform grows from below.

For that to work you need a runtime that can be authored into safely by agents working on behalf of users you don’t fully trust. That’s a tall order. cue’s specific bet on it is: actions are sandboxed at the VM level via unitask; policy is declarative and project-pinned via .cue.toml; namespaces are the isolation primitive; every invocation is traceable; nothing ever runs on the host. The bet might not be exactly right. The shape — agent-author + sandboxed-per-call + addressable — is the part I’m confident is durable.
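To show what "declarative policy" buys you, here is a deliberately tiny sketch of a policy as data plus a check against it. The field names (`allowed_hosts`, `max_seconds`) are invented for illustration and are not the `.cue.toml` schema. In cue, enforcement happens at the VM boundary through unitask, not in a Python function like this one.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Policy:
    # Hypothetical fields, NOT the .cue.toml schema.
    allowed_hosts: frozenset[str]
    max_seconds: int


def permits_request(policy: Policy, url: str) -> bool:
    """Allow an outbound call only if it is HTTPS and its host is listed."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in policy.allowed_hosts
```

Because the policy is plain data, it can be reviewed, diffed, and pinned per project independently of whatever code the agent writes.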

The pair

cue is the second half of a two-tool stack. The first half — sandboxing one piece of code in a disposable unikernel — is unitask. cue is what happens when you stop thinking of code execution as a one-shot operation and start treating it as a persistent thing the agent can build.

If the agent stack is going to grow a layer where “go build me a tool that does X, then keeps doing it” stops being a request that bounces back to the human, this is roughly what that layer looks like.

Code is at github.com/jnormore/cue.