# Building an Agent Daemon

Source title: Building an Agent Daemon
Published: 2026-05-08
Updated: 2026-05-20
Author: zayn

## Summary

This post explains how to turn a CLI agent such as Claude Code into a product-grade daemon, then connect it to a shared context plane. A CLI agent is normally designed for one human at one terminal. A product runtime needs agents that can receive messages, participate in multi-agent workflows, use shared cloud resources, keep conversation context, and map each new product event to the right local runtime process.

The architecture is a split boundary:

- A cloud server owns users, permissions, message routing, coordination, and shared resources.
- A local daemon owns the agent process, local files, shell access, tools, and runtime protocol translation.
- The two sides communicate over WebSocket. The server sends product messages or events. The daemon accepts those WebSocket messages, hands each one to the matching local agent process, and returns the output stream.

The daemon does not need to understand product semantics. It manages process lifecycle and runtime I/O.

## Public Resources

- Human page: https://blog.openviking.ai/post/agent-runtime/
- Agent-readable page: https://blog.openviking.ai/post/agent-runtime/llm.txt
- Runnable examples: https://github.com/ZaynJarvis/agent-runtime
- Zouk delivery routing details: https://github.com/ZaynJarvis/zouk/blob/main/docs/agent-delivery-routing.md
- Zouk idle delivery and wake policy: https://github.com/ZaynJarvis/zouk/blob/main/docs/agent-lifecycle.md#idle-delivery-and-wake-policy
- OpenViking MCP integration: https://github.com/volcengine/OpenViking/blob/main/docs/en/guides/06-mcp-integration.md
- Claude Code memory plugin: https://github.com/volcengine/OpenViking/blob/main/examples/claude-code-memory-plugin/README.md

## I: Spawn And Talk To Claude

Claude Code supports a programmatic JSON-over-stdio mode:

```bash
claude --output-format stream-json --input-format stream-json --model sonnet
```

Input is newline-delimited JSON sent to stdin. A minimal user message looks like:

```json
{"type":"user","message":{"role":"user","content":[{"type":"text","text":"Say hi back in one sentence."}]}}
```
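
As a minimal sketch of the input side, the helper below wraps plain text in the message shape shown above and serializes it as one NDJSON line. The names `ClaudeUserMessage` and `encodeUserMessage` are hypothetical and only illustrate the framing.

```typescript
// Shape of one stdin message, copied from the example above.
interface ClaudeUserMessage {
  type: "user";
  message: {
    role: "user";
    content: Array<{ type: "text"; text: string }>;
  };
}

// Hypothetical helper: wrap plain text and serialize it as one NDJSON line.
function encodeUserMessage(text: string): string {
  const msg: ClaudeUserMessage = {
    type: "user",
    message: { role: "user", content: [{ type: "text", text }] },
  };
  // JSON.stringify escapes embedded newlines, so the payload stays on one
  // line; the trailing "\n" is the record delimiter.
  return JSON.stringify(msg) + "\n";
}
```

A daemon would write the returned string to the child process's stdin as-is.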

Output is a stream of newline-delimited JSON events on stdout. A representative shortened stream:

```json
{"type":"system","subtype":"init","session_id":"sess_demo_01","model":"claude-sonnet","tools":[]}
{"type":"assistant","message":{"role":"assistant","content":[{"type":"text","text":"Hi! I am ready to help."}]}}
{"type":"result","subtype":"success","is_error":false,"stop_reason":"end_turn","session_id":"sess_demo_01"}
```

Three event types matter for the first demo:

- `system.init` (type `system`, subtype `init`) gives the session ID.
- `assistant` carries text blocks and tool calls.
- `result` signals that the turn is complete.
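
One way to consume these three events is a small reducer that folds each stdout line into the state of the current turn. This is a sketch under assumptions: `TurnState`, `emptyTurn`, and `applyLine` are hypothetical names, and only the fields visible in the sample stream are modeled.

```typescript
// Minimal view of the three events listed above.
type ClaudeStreamEvent =
  | { type: "system"; subtype: string; session_id?: string }
  | { type: "assistant"; message: { content: Array<{ type: string; text?: string }> } }
  | { type: "result"; subtype: string; is_error: boolean; session_id?: string };

interface TurnState {
  sessionId: string | null;
  text: string;
  done: boolean;
  isError: boolean;
}

const emptyTurn: TurnState = { sessionId: null, text: "", done: false, isError: false };

// Returns null for blank lines, non-JSON lines, and non-object JSON values.
function tryParse(line: string): ClaudeStreamEvent | null {
  try {
    const value = JSON.parse(line);
    return value !== null && typeof value === "object" ? value : null;
  } catch (_err) {
    return null;
  }
}

// Fold one stdout line into the turn state without mutating the input.
function applyLine(state: TurnState, line: string): TurnState {
  const ev = tryParse(line.trim());
  if (ev === null) return state;
  switch (ev.type) {
    case "system":
      return ev.subtype === "init" && ev.session_id
        ? { ...state, sessionId: ev.session_id }
        : state;
    case "assistant": {
      let text = state.text;
      for (const block of ev.message.content) {
        // Tool calls and other block types are skipped in this sketch.
        if (block.type === "text" && typeof block.text === "string") text += block.text;
      }
      return { ...state, text };
    }
    case "result":
      return { ...state, done: true, isError: ev.is_error };
    default:
      return state; // event types not covered here are ignored
  }
}
```

Feeding the three sample lines above through `applyLine` yields the session ID `sess_demo_01`, the assistant text, and `done: true`.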

Save the session ID from `system.init` so that it survives the process exiting. A later process can then resume the conversation with `--resume <session_id>`.
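
A sketch of how a daemon might decide between a fresh start and a resume. The flags are the ones used in this post; the in-memory `savedSessions` store and the function names are hypothetical, and a real daemon would persist the IDs to disk.

```typescript
// Build the argv for a Claude Code child process. `--resume` is added only
// when a session ID from an earlier process is known.
function buildClaudeArgs(sessionId: string | null, model: string = "sonnet"): string[] {
  const args = [
    "--output-format", "stream-json",
    "--input-format", "stream-json",
    "--model", model,
  ];
  if (sessionId !== null) args.push("--resume", sessionId);
  return args;
}

// Hypothetical store: agent ID -> last seen session ID.
const savedSessions: Record<string, string> = {};

function rememberSession(agentId: string, sessionId: string): void {
  savedSessions[agentId] = sessionId;
}

function argsForAgent(agentId: string): string[] {
  const known = Object.prototype.hasOwnProperty.call(savedSessions, agentId);
  return buildClaudeArgs(known ? savedSessions[agentId] : null);
}
```

Calling `rememberSession` as soon as `system.init` arrives means the ID is already saved if the process later crashes.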

## II: Split Into Server And Daemon

The server side is product-facing. It manages users, channels, permissions, cloud resources, and routing. It does not need to know Claude's stdin/stdout protocol.

The daemon side is local-runtime-facing. It starts the CLI agent, keeps track of the child process, receives WebSocket messages from the server, forwards them to the corresponding agent process, parses runtime output, and sends events back to the server.

This boundary avoids putting local execution directly into the web app while still letting the agent use shared cloud resources.
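
The daemon side of this boundary can be sketched as a router that knows agent IDs and process handles but nothing about users or channels. The envelope shapes (`DeliverEnvelope`, `OutputEnvelope`) and the `DaemonRouter` class are assumptions made for illustration, not the wire format of any real product.

```typescript
// Assumed server -> daemon envelope.
interface DeliverEnvelope { agentId: string; text: string; }
// Assumed daemon -> server envelope: one raw runtime output line.
interface OutputEnvelope { agentId: string; line: string; }

// Whatever can hand text to a running agent; in a real daemon this would
// wrap the child process's stdin.
interface AgentHandle { deliver(text: string): void; }

class DaemonRouter {
  private agents: Record<string, AgentHandle> = {};

  constructor(private sendToServer: (out: OutputEnvelope) => void) {}

  register(agentId: string, handle: AgentHandle): void {
    this.agents[agentId] = handle;
  }

  // Called for each WebSocket message from the server.
  // Returns false when no local process matches the agent ID.
  onServerMessage(raw: string): boolean {
    const msg = JSON.parse(raw) as DeliverEnvelope;
    if (!Object.prototype.hasOwnProperty.call(this.agents, msg.agentId)) return false;
    this.agents[msg.agentId].deliver(msg.text);
    return true;
  }

  // Called for each stdout line from an agent process.
  onAgentOutput(agentId: string, line: string): void {
    this.sendToServer({ agentId, line });
  }
}
```

The router never inspects message content, which matches the point above that the daemon does not need to understand product semantics.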

## III: Add Multi-Agent And MCP Tools

Raw Claude `stream-json` events are runtime-specific. A multi-agent product should normalize them into one shared event shape. The outer fields, such as `type`, `agentId`, and `entries`, tell the product how to route and display the event. Runtime-specific details stay inside the event body. This lets the product handle Claude, Codex, Cursor, and other runtimes through one interface.

Agents also need tools. The daemon can inject MCP tools at spawn time by writing a runtime config that points at a local MCP server. The post demonstrates two Claude agents playing Gomoku through MCP tools:

- `view_board()` reads the current board from a game server.
- `place_stone(x, y)` sends a move to the game server.
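
A sketch of the config the daemon could write before spawning each player. It assumes the `mcpServers` JSON layout that Claude Code reads for MCP configuration and assumes the daemon passes the file path with the `--mcp-config` flag. The server script path and the `GOMOKU_*` environment variable names are hypothetical.

```typescript
// One stdio-launched MCP server entry.
interface McpStdioServer {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpConfigFile {
  mcpServers: Record<string, McpStdioServer>;
}

// Build a per-agent config so each player's tools act on its own color.
function buildGomokuMcpConfig(agentId: string, gameServerUrl: string): McpConfigFile {
  return {
    mcpServers: {
      gomoku: {
        command: "node",
        args: ["./gomoku-mcp-server.js"], // hypothetical local MCP server script
        env: {
          GOMOKU_URL: gameServerUrl, // hypothetical: where view_board/place_stone call
          GOMOKU_PLAYER: agentId,    // hypothetical: which side this agent plays
        },
      },
    },
  };
}
```

The daemon would serialize the result with `JSON.stringify`, write it to a per-agent file, and add that path to the spawn arguments.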

The daemon normalizes Claude events into the shared event shape, for example:

```json
{ "type": "agent:event", "agent": "black", "kind": "tool_use", "payload": { "name": "place_stone", "input": { "x": 7, "y": 7 } } }
```
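
A minimal sketch of a normalizer that produces the shape above from raw Claude lines. The `tool_use` kind comes from the example; the `text` and `turn_end` kinds, their payloads, and the function name are assumptions added so the sketch covers a whole turn.

```typescript
interface AgentEventEnvelope {
  type: "agent:event";
  agent: string;
  kind: "text" | "tool_use" | "turn_end";
  payload: unknown;
}

// Content blocks inside an `assistant` message, reduced to what is used here.
type ClaudeContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown };

interface RawClaudeLine {
  type?: string;
  message?: { content?: ClaudeContentBlock[] };
  is_error?: boolean;
}

// One raw line can yield zero, one, or several normalized events.
function normalizeClaudeLine(agent: string, line: string): AgentEventEnvelope[] {
  const raw = JSON.parse(line) as RawClaudeLine;
  if (raw.type === "assistant") {
    const blocks = (raw.message && raw.message.content) || [];
    const out: AgentEventEnvelope[] = [];
    for (const block of blocks) {
      if (block.type === "text") {
        out.push({ type: "agent:event", agent, kind: "text", payload: { text: block.text } });
      } else if (block.type === "tool_use") {
        out.push({
          type: "agent:event",
          agent,
          kind: "tool_use",
          payload: { name: block.name, input: block.input },
        });
      }
    }
    return out;
  }
  if (raw.type === "result") {
    const payload = { isError: raw.is_error === true };
    return [{ type: "agent:event", agent, kind: "turn_end", payload }];
  }
  return []; // system.init and other events produce nothing in this sketch
}
```

A driver for another runtime would emit the same envelope from its own event stream, so the product never branches on the runtime.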

## IV: Deliver Messages To Agents

A chat product needs a path from "new user message" to "the right agent process sees it." The server turns user messages into `agent:deliver` events. The daemon receives each delivery and decides how to hand it to the local process.

The demo uses four delivery paths:

- busy process: inject the message into the active turn through stdin;
- live idle process: wake the existing process through stdin;
- cached idle process: start the runtime again and pass the saved session ID so context continues;
- no available process/config: queue the message for later.
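
The four paths can be read as one decision over the agent's local state. The sketch below assumes the daemon tracks that state explicitly; the state and action names are hypothetical labels for the branches listed above.

```typescript
// What the daemon knows about one agent when a delivery arrives.
type AgentProcessState =
  | { status: "busy" }                            // process alive, turn in progress
  | { status: "idle-live" }                       // process alive, waiting for input
  | { status: "idle-cached"; sessionId: string }  // process gone, session ID saved
  | { status: "unavailable" };                    // no process and no usable config

type DeliveryAction =
  | { action: "inject-stdin" }
  | { action: "wake-stdin" }
  | { action: "respawn"; resumeSessionId: string }
  | { action: "queue" };

function chooseDelivery(state: AgentProcessState): DeliveryAction {
  switch (state.status) {
    case "busy":
      return { action: "inject-stdin" };
    case "idle-live":
      return { action: "wake-stdin" };
    case "idle-cached":
      // Restart the runtime and continue the saved conversation.
      return { action: "respawn", resumeSessionId: state.sessionId };
    case "unavailable":
      return { action: "queue" };
  }
}
```

Keeping the decision separate from the side effects makes each branch easy to test without spawning a process.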

Session resume is only one implementation detail in the cached-idle branch. The product-level concept is agent message delivery. Zouk's production docs split this into two layers: `agent-delivery-routing.md` decides which agents receive `agent:deliver`; `agent-lifecycle.md` decides how live-idle and cached-idle agents are woken.

From the chat-platform view, human-to-agent and agent-to-agent conversations are ordinary platform messages. The server selects relevant messages and pushes them over WebSocket as `agent:deliver` to the right agent daemon. When the agent needs to reply, read history, or check pending messages, it calls MCP tools such as `send_message`, `read_history`, and `check_messages` back into the platform API.

## V: Externalize Memory

After agents can receive messages, the next question is where durable context lives. Local files such as `MEMORY.md` are useful as a fast recovery index for one agent process, but they are not enough for a team platform. A platform needs a shared context plane: searchable, scoped, auditable, and usable across machines, restarts, and multiple agents.

OpenViking is that context plane. The point is not to upload the whole chat transcript forever. Useful writeback is distilled context: facts, user preferences, decisions, handoff notes, tool results, and resource references that should survive the current CLI process.

There are two integration surfaces:

- Agent/runtime layer: run an OpenViking memory client or context client, either as a plugin or through the OpenViking MCP endpoint. It retrieves relevant memory and resources at startup or on prompt submit, writes back distilled updates during the run, and commits a durable handoff before compaction, reset, or sleep.
- Platform/server layer: handle provisioning and governance. The platform binds workspace/account/user/agent identity, issues scoped credentials or bearer tokens, exposes the OpenViking endpoint to daemons/runtimes, and owns permission checks, key rotation, audit, resource ownership, and cross-agent sharing boundaries.
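
To make the agent-side lifecycle concrete, here is a toy in-memory stand-in for a shared context plane. It is not OpenViking's API: every name below (`InMemoryContextPlane`, `recallForPrompt`, `commitHandoff`) is hypothetical, and retrieval is a naive substring match. It only illustrates workspace scoping, retrieval before a prompt, and a handoff commit before sleep.

```typescript
interface ContextScope { workspace: string; agent: string; }

interface ContextEntry {
  scope: ContextScope;
  kind: "fact" | "preference" | "decision" | "handoff";
  text: string;
}

// Hypothetical stand-in for a remote context service.
class InMemoryContextPlane {
  private entries: ContextEntry[] = [];

  write(entry: ContextEntry): void {
    this.entries.push(entry);
  }

  // Naive retrieval: same workspace, case-insensitive substring match.
  search(workspace: string, query: string): ContextEntry[] {
    const q = query.toLowerCase();
    return this.entries.filter(
      (e) => e.scope.workspace === workspace && e.text.toLowerCase().indexOf(q) !== -1
    );
  }
}

// At startup or prompt submit: fetch context to inject into the prompt.
function recallForPrompt(plane: InMemoryContextPlane, scope: ContextScope, query: string): string {
  return plane
    .search(scope.workspace, query)
    .map((e) => "[" + e.kind + " from " + e.scope.agent + "] " + e.text)
    .join("\n");
}

// Before compaction, reset, or sleep: store a distilled handoff note.
function commitHandoff(plane: InMemoryContextPlane, scope: ContextScope, summary: string): void {
  plane.write({ scope, kind: "handoff", text: summary });
}
```

In this sketch a second agent in the same workspace can recall the first agent's handoff, while an agent in another workspace sees nothing. A real platform would enforce that boundary on the server with scoped credentials.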

Do not put OpenViking API keys in browser code. The browser can show context inspection UI, but credentials should stay on the server, daemon, or runtime side.

The UI work is not just visualization. A useful platform lets humans inspect, search, debug, and manage context: where a memory came from, why it was injected, why a query missed, which agent owns it, and whether sensitive content should be deleted, exported, or re-scoped.

A concrete Claude Code version already exists as an OpenViking memory plugin. A runtime-agnostic path is the OpenViking MCP endpoint.

## Runtime Drivers

The daemon architecture is runtime-agnostic. A runtime driver knows how to:

- spawn a specific CLI;
- parse the runtime's event stream;
- encode stdin messages;
- build system prompts and config;
- decide how busy agents receive new messages.

Example interface:

```ts
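import type { ChildProcess } from "node:child_process";

// SpawnContext, ParsedEvent, and AgentConfig are product-defined types (not shown).
interface Driver {
  id: string;
  spawn(ctx: SpawnContext): { process: ChildProcess }; // start the CLI
  parseLine(line: string): ParsedEvent[]; // one output line -> zero or more events
  encodeStdinMessage(text: string, sessionId: string | null): string | null;
  buildSystemPrompt(config: AgentConfig, agentId: string): string;
  busyDeliveryMode: "notification" | "direct" | "none"; // see "Busy delivery modes"
}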
```

Runtime protocol families:

| Protocol | Runtimes | Behavior |
| --- | --- | --- |
| `stream-json` | Claude, Cursor | NDJSON over stdio. Events include `system.init`, `assistant`, and `result`. |
| ACP / JSON-RPC | Codex, Hermes, OpenCode, Coco | JSON-RPC 2.0 over stdio. Methods include `session/new`, `session/prompt`, and `session/update`. |
| Custom JSON events | Copilot, Gemini | Runtime-specific event streams, often one-shot per turn. |
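
For the ACP / JSON-RPC family, a driver has to frame requests and tell responses apart from notifications. The sketch below follows the JSON-RPC 2.0 rules (a request has `method` and `id`, a notification has `method` without `id`, a response has `id` with `result` or `error`) and assumes newline-delimited framing over stdio. The class and function names are hypothetical.

```typescript
interface JsonRpcRequestFrame {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: unknown;
}

// Encodes requests with increasing numeric IDs so responses can be matched.
class JsonRpcWriter {
  private nextId = 1;

  request(method: string, params: unknown): string {
    const frame: JsonRpcRequestFrame = { jsonrpc: "2.0", id: this.nextId++, method, params };
    return JSON.stringify(frame) + "\n";
  }
}

type JsonRpcKind = "request" | "notification" | "response" | "invalid";

// Classify one incoming line by the JSON-RPC 2.0 structural rules.
function classifyJsonRpcLine(line: string): JsonRpcKind {
  let msg: any;
  try {
    msg = JSON.parse(line);
  } catch (_err) {
    return "invalid";
  }
  if (msg === null || typeof msg !== "object" || msg.jsonrpc !== "2.0") return "invalid";
  const hasId = msg.id !== undefined;
  if (typeof msg.method === "string") return hasId ? "request" : "notification";
  if (hasId && ("result" in msg || "error" in msg)) return "response";
  return "invalid";
}
```

With this split, a `session/update` line from the runtime classifies as a notification and maps to streamed events, while the response to `session/prompt` marks the end of the turn. The `params` each method expects are defined by the runtime and are not modeled here.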

Busy delivery modes:

- `direct`: pipe the message into the active turn through stdin.
- `notification`: store the message and expose it through a tool such as `check_messages`.
- `none`: queue the message until the current turn ends, then restart with the queued message.
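
The three modes differ only in where a message waits while the agent is busy. The simulation below uses plain arrays in place of a real stdin pipe, tool backend, and restart queue; `BusyAgentSim` and both function names are hypothetical.

```typescript
type BusyDeliveryMode = "notification" | "direct" | "none";

interface BusyAgentSim {
  mode: BusyDeliveryMode;
  stdinLines: string[];     // stands in for writes to the active turn's stdin
  inbox: string[];          // read by a check_messages-style tool
  afterTurnQueue: string[]; // replayed once the current turn ends
}

// Route a message that arrives while the agent is mid-turn.
function deliverWhileBusy(agent: BusyAgentSim, text: string): void {
  switch (agent.mode) {
    case "direct":
      agent.stdinLines.push(text);
      break;
    case "notification":
      agent.inbox.push(text);
      break;
    case "none":
      agent.afterTurnQueue.push(text);
      break;
  }
}

// What a check_messages-style tool could return: drain and clear the inbox.
function checkMessages(agent: BusyAgentSim): string[] {
  return agent.inbox.splice(0, agent.inbox.length);
}
```

Draining the inbox on read means the agent sees each pending message once, however many times it calls the tool.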

## Build Order

1. Own the process: spawn the CLI agent as a child process, talk JSON over stdin/stdout, and save session IDs.
2. Split server and daemon: the server handles users, routing, and shared cloud resources; the daemon handles local processes.
3. Normalize and inject tools: translate runtime events into a shared product event shape and inject MCP tools.
4. Deliver agent messages: route each new message into the right running, idle, or restarted agent process.
5. Externalize memory: connect agents to OpenViking as a shared context plane with scoped credentials, retrieval, writeback, audit, and human management UI.