A runtime for agent-authored apps
You’re chasing a flaky deploy. You ask your agent: “make me a live status page — polls our /deploys endpoint every ten seconds, shows the last fifty, highlights failures in red. And while you’re at it, a webhook our CI can POST to that pages oncall when a deploy fails.” The agent writes both. Nice little dashboard rendered as an inline artifact, clean webhook handler in the next message. “Great,” you say, “give me the URLs so I can leave the page open on my other monitor and point CI at the webhook.” The agent does not have URLs. The dashboard lives inside the conversation. The webhook handler is a function that nothing can call.
What you needed was for both to exist on the web — the page openable in any browser, the webhook reachable by your CI, both running without a human babysitting. To get there you’d pick a host, deploy the page, deploy the handler behind a public URL, configure CORS, ship secrets, set up monitoring. The agent built the things in five minutes. Shipping them is the rest of the afternoon.
The same gap shows up everywhere a real thing needs to exist outside the chat:
- A webhook your Stripe or GitHub or Zendesk integration POSTs to.
- A Slack slash-command endpoint that answers from your docs.
- A live dashboard you leave open in a browser tab.
- A URL you hand a teammate so they can run a one-off check without you.
- A tool another agent calls into.
These are all addressable HTTP endpoints. The agent can write the code. It can’t host it, can’t subscribe an external system to it, can’t hand you a URL. Scheduled-prompt features in agent hosts — Claude routines, similar things elsewhere — solve the “fire something on a timer” half of the problem and only that half. None of them give you a thing on the web.
I built a small daemon called cue to be that missing primitive.
What’s wrong with the alternatives
- Roll your own server. That gives up the durability win: the agent becomes half of a serverless deploy pipeline you’d rather not have at all.
- Zapier / n8n / Make / Pipedream. GUI-authored for humans. They don’t expose agent-native authoring primitives, and they aren’t sandboxed per-call. Per-seat pricing, remote execution, vendor lock-in.
- Cloud functions (Lambda, Workers, Vercel). The deploy pipeline, IAM, build step, and runtime constraints add up to a surface the agent can’t drive without a lot of plumbing — and you end up writing the plumbing.
- Add persistence to the agent host. Claude.ai, Cursor, and Codex don’t expose cron or webhook primitives, and you can’t add them. The host owns the loop.
What’s missing is a thin primitive that an agent can author into directly: “here is a piece of code; run it when this thing happens; give me back a URL.” That’s what cue is.
What cue does
cue is a single daemon (cue serve) plus an MCP server. Any agent that speaks MCP — Claude Code, Cursor, VS Code Copilot, a chatbot backend wired to the MCP SDK — gets two new tool families:
- Actions — named snippets of code with a policy. Each invocation runs in a fresh sandbox.
- Triggers — cron schedules and webhook endpoints. When a trigger fires, the corresponding action runs.
That’s almost the whole surface area. Around it there are namespaces (first-class isolation boundaries with active | paused | archived lifecycle), secrets (resolved at run time, never logged), artifacts (the agent can host a static HTML/JS file alongside the action that uses it), and a per-invocation trace stored on disk. No UI layer. No catalog of pre-built integrations. No GUI for humans to click through. The whole thing is roughly one filesystem-backed store + an HTTP listener + a cron loop.
The point of the minimalism is that cue isn’t trying to be a workflow product. It’s the runtime layer underneath whatever app the agent decides to build for you.
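To make the surface area concrete, here is a minimal sketch of how actions, triggers, and namespaces could relate to each other. None of this is cue's actual schema: the class names, field names, and the `action_for` helper are hypothetical, and the rule that only an active namespace fires its triggers is my assumption about what the paused and archived states mean.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class NamespaceState(Enum):
    # Lifecycle states named in the text; the semantics below are assumed.
    ACTIVE = "active"
    PAUSED = "paused"
    ARCHIVED = "archived"


@dataclass
class Action:
    name: str
    namespace: str
    code: str
    policy: Dict[str, object] = field(default_factory=dict)  # hypothetical shape


@dataclass
class Trigger:
    trigger_id: str
    kind: str                        # "cron" or "webhook"
    action_name: str
    schedule: Optional[str] = None   # only meaningful for cron triggers


@dataclass
class Store:
    namespaces: Dict[str, NamespaceState] = field(default_factory=dict)
    actions: Dict[str, Action] = field(default_factory=dict)
    triggers: Dict[str, Trigger] = field(default_factory=dict)

    def action_for(self, trigger_id: str) -> Optional[Action]:
        """Return the action a trigger should run, or None if it must not fire."""
        trigger = self.triggers.get(trigger_id)
        if trigger is None:
            return None
        action = self.actions.get(trigger.action_name)
        if action is None:
            return None
        # Assumption: paused and archived namespaces never fire.
        if self.namespaces.get(action.namespace) is not NamespaceState.ACTIVE:
            return None
        return action


store = Store()
store.namespaces["deploys"] = NamespaceState.ACTIVE
store.actions["page-oncall"] = Action("page-oncall", "deploys", "print('paging')")
store.triggers["trg_demo"] = Trigger("trg_demo", "webhook", "page-oncall")
```

Everything a trigger needs in order to fire is a lookup. The only decision the persistent side makes is whether the run is allowed to happen.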
The ephemeral / persistent split
cue depends on unitask — the disposable-unikernel runtime — as a binary on PATH. Every action invocation shells out to unitask run under that action’s declared policy. There’s no library coupling between the two projects; they communicate by CLI.
The split is the architecture:
- unitask is ephemeral one-shot execution. Code in, fresh unikernel runs, output back, VM dies. No daemon, no state.
- cue is the persistent host on top. Stores named actions, runs the cron loop, exposes webhook endpoints, hands out invoke URLs.
Keeping them separate has a few useful effects. unitask can stay narrow and reusable: anyone can use it for ad-hoc sandboxed execution, agent platform or not. cue gets to focus on what’s unique to it, the persistence and triggering layer. And the security boundary is clean: cue holds metadata and orchestrates calls, but the actual untrusted code only ever runs inside a fresh VM that unitask brought up for that one invocation. Compromising the daemon doesn’t compromise prior runs (their VMs are gone), and compromising one invocation doesn’t reach the daemon (it can only do what its policy permits).
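The split can be sketched in a few lines. This is an illustration of the pattern, not cue's code: `PersistentHost` and `stand_in_sandbox` are hypothetical names, and the stand-in launches a plain Python child process, which is not a unikernel and gives none of its isolation. The real daemon shells out to `unitask run`; its arguments are deliberately not reproduced here, which is why the command builder is a parameter.

```python
import json
import subprocess
import sys
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Builds the argv for one isolated run. Hypothetical seam: in the real daemon
# this is where the `unitask run` command line would be assembled.
ArgvBuilder = Callable[[str], List[str]]


def stand_in_sandbox(code: str) -> List[str]:
    """Stand-in only: a fresh Python process, NOT a sandbox or a unikernel."""
    return [sys.executable, "-c", code]


@dataclass
class PersistentHost:
    """The long-lived half: remembers actions and traces, never runs code itself."""
    build_argv: ArgvBuilder
    actions: Dict[str, str] = field(default_factory=dict)   # name -> source
    traces: List[dict] = field(default_factory=list)

    def create_action(self, name: str, code: str) -> None:
        self.actions[name] = code

    def invoke(self, name: str) -> dict:
        # Every invocation is a brand-new child process that exits when done.
        proc = subprocess.run(
            self.build_argv(self.actions[name]),
            capture_output=True,
            text=True,
            timeout=30,
        )
        trace = {
            "action": name,
            "exit_code": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }
        self.traces.append(trace)
        return trace


host = PersistentHost(build_argv=stand_in_sandbox)
host.create_action("hello", "import json; print(json.dumps({'ok': True}))")
result = host.invoke("hello")
```

The host's only coupling to the executor is the command line it builds, which matches the "communicate by CLI" arrangement described above.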
BYO UI
cue deliberately has no UI layer. That sounds like an omission until you notice that the agent already has one — whatever surface it’s talking to the user on.
- A Claude.ai artifact polls the action’s invoke URL and renders the result.
- A Claude Code session prints the latest run in the terminal.
- A Slack-based agent posts the trigger output as a message.
- The user opens the invoke URL in a browser and gets raw JSON.
- A vertical-SaaS product the agent extends embeds the URL in its existing dashboard.
The “app” the agent built is the action plus whatever rendering surface fits the conversation. cue doesn’t try to standardize that surface — every standardization attempt would be wrong for half the cases. If an agent wants a real HTML dashboard, it can also use cue’s artifacts API to host static HTML/JS that polls the action’s URL, on the same origin. The agent ships both halves; cue serves them.
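As one example of a rendering surface, here is a rough sketch of the terminal case: fetch the invoke URL, keep the last fifty deploys, flag the failures. The response shape (a `deploys` list of objects with `id` and `status`) is an assumption made for illustration, and `render_deploys`, `fetch_text`, and `poll_once` are hypothetical helpers, not part of cue.

```python
import json
import urllib.request
from typing import Callable, List


def render_deploys(raw_json: str, limit: int = 50) -> List[str]:
    """Turn an action's JSON output into display lines, newest entries last."""
    deploys = json.loads(raw_json)["deploys"]   # assumed response shape
    lines = []
    for deploy in deploys[-limit:]:
        marker = "FAIL" if deploy["status"] == "failed" else "ok"
        lines.append(f"[{marker:>4}] {deploy['id']}")
    return lines


def fetch_text(url: str) -> str:
    """Plain HTTP GET; any auth the invoke URL needs is left out of this sketch."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def poll_once(fetch: Callable[[], str]) -> List[str]:
    """One polling step; the caller decides how often to repeat it."""
    return render_deploys(fetch())


sample = json.dumps({"deploys": [
    {"id": "d1", "status": "succeeded"},
    {"id": "d2", "status": "failed"},
]})
lines = poll_once(lambda: sample)
```

A real surface would call `poll_once(lambda: fetch_text(invoke_url))` in a loop with a ten-second sleep. The renderer is about a dozen lines and is specific to this one conversation, so standardizing it would buy very little.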
What this looks like in practice
The interaction collapses to one prompt:
You: Build me that deploy status page from earlier: it polls our /api/deploys endpoint every ten seconds, shows the last fifty, and highlights failures. And the webhook our CI can POST to that pages oncall when a deploy fails.
The agent picks the right tools, writes the action and the dashboard, calls into cue however many times it needs to, and hands back something like:
Dashboard: https://cue.your-team.example.com/u/deploys/index.html
Webhook: https://cue.your-team.example.com/w/trg_01KPY3GZQX...
Bearer: cue_tk_8f3c9a...
You open the dashboard URL on your other monitor. You paste the webhook URL into your CI config. Both are real things on the web. Every invocation runs in a fresh unitask unikernel under the action’s declared policy; nothing executes on your host; cue inspect shows the full trace of any past run.
The point isn’t that there are four MCP tools and the agent calls them in a particular order. The point is that the agent treats cue the same way it treats read_file or run_command — just another tool — and the result is a durable, addressable, sandboxed mini-app you didn’t have to deploy.
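On the CI side, pointing a job at the webhook could look roughly like the sketch below. Two things here are assumptions: that the bearer token travels in a standard `Authorization: Bearer` header, and the JSON payload fields, which are invented for illustration. The URL and token are placeholders, not real values.

```python
import json
import urllib.request


def build_deploy_failed_request(
    webhook_url: str, token: str, deploy_id: str
) -> urllib.request.Request:
    """Build (but do not send) the POST a CI job would make on a failed deploy."""
    # Hypothetical payload; the action decides what it actually expects.
    body = json.dumps({"event": "deploy_failed", "deploy_id": deploy_id}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        method="POST",
        headers={
            # Assumption: the token is presented as a standard bearer header.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_deploy_failed_request(
    "https://cue.example.invalid/w/trg_PLACEHOLDER",   # placeholder URL
    "cue_tk_PLACEHOLDER",                              # placeholder token
    "d-123",
)
# Sending it would be: urllib.request.urlopen(req, timeout=10)
```

In a real pipeline the token would come from the CI system's secret store rather than a literal in the job definition.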
What this is for
The most obvious use case is the one above — a developer or a power user gets a personal mini-app out of a conversation. That’s worth having.
The more interesting use case, and the one that drove me to build this, is agent-extended SaaS. A vertical product gives its customers an agent that can describe a feature in natural language and ship it as a scoped extension. The agent authors a cue action against the product’s API; cue runs it sandboxed; the customer’s “feature request” becomes a webhook URL or scheduled job inside their account. Each customer’s extensions are isolated by namespace and policy. No long-tail roadmap pressure on the product team; the platform grows from below.
For that to work you need a runtime that can be authored into safely by agents working on behalf of users you don’t fully trust. That’s a tall order. cue’s specific bet on it is: actions are sandboxed at the VM level via unitask; policy is declarative and project-pinned via .cue.toml; namespaces are the isolation primitive; every invocation is traceable; nothing ever runs on the host. The bet might not be exactly right. The shape (agent-authored, sandboxed per call, addressable) is the part I’m confident is durable.
The pair
cue is the second half of a two-tool stack. The first half — sandboxing one piece of code in a disposable unikernel — is unitask. cue is what happens when you stop thinking of code execution as a one-shot operation and start treating it as a persistent thing the agent can build.
If the agent stack is going to grow a layer where “go build me a tool that does X, then keeps doing it” stops being a request that bounces back to the human, this is roughly what that layer looks like.
Code is at github.com/jnormore/cue.