A CLI agent like Claude Code is designed for one human at one terminal. To make it a building block for a product — where agents receive messages while idle, participate in multi-agent workflows, survive restarts, and share durable context — you need a daemon plus a memory plane. This post builds the daemon in four runnable steps, then shows how OpenViking can provide the shared context layer.
I: Spawn and Talk to Claude
Claude Code supports a programmatic mode: pipe JSON in, get JSON out. This is the entire foundation.
```bash
claude --output-format stream-json --input-format stream-json --model sonnet
```

Example: stdin JSON and stdout stream

Write one NDJSON line to stdin. Claude then emits multiple JSON events on stdout; this sample is trimmed, so real events may include more fields.

stdin:

```json
{"type":"user","message":{"role":"user","content":[{"type":"text","text":"Say hi back in one sentence."}]}}
```

stdout:

```json
{"type":"system","subtype":"init","session_id":"sess_demo_01","model":"claude-sonnet","tools":[]}
{"type":"assistant","message":{"role":"assistant","content":[{"type":"text","text":"Hi! I am ready to help."}]}}
{"type":"result","subtype":"success","is_error":false,"stop_reason":"end_turn","session_id":"sess_demo_01"}
```

In Node.js, you spawn this as a child process and talk to it over stdin/stdout using NDJSON (newline-delimited JSON):
```javascript
import { spawn } from "child_process";

const proc = spawn("claude", [
  "--output-format", "stream-json",
  "--input-format", "stream-json",
  "--model", "sonnet",
], { stdio: ["pipe", "pipe", "pipe"] });

// Send a message
function send(text) {
  proc.stdin.write(JSON.stringify({
    type: "user",
    message: { role: "user", content: [{ type: "text", text }] },
  }) + "\n");
}

// Read the response stream
let buf = "";
proc.stdout.on("data", (chunk) => {
  buf += chunk.toString();
  const lines = buf.split("\n");
  buf = lines.pop(); // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (!line.trim()) continue;
    const ev = JSON.parse(line);
    if (ev.type === "system" && ev.subtype === "init") {
      console.log("session:", ev.session_id);
    }
    if (ev.type === "assistant") {
      for (const block of ev.message?.content || []) {
        if (block.type === "text") process.stdout.write(block.text);
        if (block.type === "tool_use") console.log("[tool]", block.name, block.input);
      }
    }
    if (ev.type === "result") {
      console.log("\n[done]", ev.stop_reason);
    }
  }
});

send("Hello. Say hi back in one sentence.");
```

Three event types matter: system.init gives you the session ID, assistant carries text and tool calls, result signals the turn is done. That's it. Everything else builds on this.
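The line buffering in the reader is the part that is easiest to get wrong, because a `data` chunk can end in the middle of a JSON line. Here is a minimal sketch of that logic as a standalone helper, assuming the stream is NDJSON text. The name `createNdjsonParser` and the `parse_error` event are hypothetical, not part of Claude Code or the demo repo.

```javascript
// Hypothetical helper: turns arbitrary stdout chunks into parsed NDJSON events.
function createNdjsonParser(onEvent) {
  let buf = "";
  return function feed(chunk) {
    buf += chunk.toString();
    const lines = buf.split("\n");
    buf = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (!line.trim()) continue; // skip blank lines
      let ev;
      try {
        ev = JSON.parse(line);
      } catch {
        // A non-JSON line should not crash the daemon; surface it instead.
        ev = { type: "parse_error", raw: line };
      }
      onEvent(ev);
    }
  };
}

// Usage: the second event arrives split across two chunks.
const seen = [];
const feed = createNdjsonParser((ev) => seen.push(ev));
feed('{"type":"system","subtype":"init","session_id":"sess_demo_01"}\n{"type":"res');
feed('ult","subtype":"success"}\n');
console.log(seen.map((ev) => ev.type).join(",")); // system,result
```

If the output may contain multi-byte characters, calling `proc.stdout.setEncoding("utf8")` before attaching the handler should keep a character from being split across chunks.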
You can also resume a previous conversation. When the process exits, save the session ID. Next time, pass it back:
```javascript
// Phase 1: create session, teach it a secret
const proc1 = spawn("claude", [...baseArgs], { stdio: ["pipe", "pipe", "pipe"] });
send(proc1, "Remember: my secret code is PINEAPPLE-42.");
// ... wait for result, extract session_id, gracefully stop

// Phase 2: resume in a new process (no session_id in JSON needed)
const proc2 = spawn("claude", [...baseArgs, "--resume", sessionId], {
  stdio: ["pipe", "pipe", "pipe"],
});
send(proc2, "What is my secret code?");
// → "PINEAPPLE-42": the old process exited, but the session continued
```

II: Split Into Server and Daemon
The product owns users, permissions, message routing, and shared cloud resources. The agent needs local files, shell access, and tools. To let the agent use those cloud resources without putting local execution inside the web app, split the system: a cloud server coordinates, and a local daemon owns the agent process. The two sides talk over WebSocket.
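```javascript
// NO-AGENT SIDE: accepts a WebSocket connection from the daemon,
// forwards user input, and logs what comes back.
import { WebSocketServer } from "ws";
import readline from "readline";

const wss = new WebSocketServer({ port: 9876 });
let serverSocket = null; // the currently connected daemon, if any

wss.on("connection", (ws) => {
  serverSocket = ws;
  console.log("Daemon connected. Type a prompt.");
  ws.on("message", (data) => {
    const ev = JSON.parse(data.toString());
    if (ev.type === "assistant") {
      for (const block of ev.message?.content || []) {
        if (block.type === "text") process.stdout.write(block.text);
        if (block.type === "tool_use") console.log("[tool]", block.name);
      }
    }
    if (ev.type === "result") {
      console.log("\n[done]");
    }
  });
  ws.on("close", () => { serverSocket = null; });
});

// Read from stdin, send to daemon
const rl = readline.createInterface({ input: process.stdin });
rl.on("line", (text) => {
  if (serverSocket) serverSocket.send(text);
});
```

```javascript
// WITH-AGENT SIDE: connects to the server, spawns Claude locally,
// pipes prompts down and events up.
import { spawn } from "child_process";
import WebSocket from "ws";

// Same flags as in step I
const CLAUDE_ARGS = [
  "--output-format", "stream-json",
  "--input-format", "stream-json",
  "--model", "sonnet",
];

const ws = new WebSocket("ws://localhost:9876");

ws.on("open", () => {
  const proc = spawn("claude", CLAUDE_ARGS, {
    stdio: ["pipe", "pipe", "pipe"],
  });

  // Claude stdout → WebSocket (events flow UP)
  let buf = "";
  proc.stdout.on("data", (chunk) => {
    // Same NDJSON line buffering as in step I
    buf += chunk.toString();
    const lines = buf.split("\n");
    buf = lines.pop();
    for (const line of lines) {
      if (!line.trim()) continue;
      const ev = JSON.parse(line);
      ws.send(JSON.stringify(ev));
    }
  });

  // WebSocket → Claude stdin (prompts flow DOWN)
  ws.on("message", (data) => {
    const msg = {
      type: "user",
      message: { role: "user", content: [{ type: "text", text: data.toString() }] },
    };
    proc.stdin.write(JSON.stringify(msg) + "\n");
  });
});
```

Run it in two terminals: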
```bash
# Terminal 1: start the server
node 1a_ws_server.js

# Terminal 2: start the daemon
node 1b_ws_daemon.js
```

The server knows nothing about Claude. It sends product messages over WebSocket and receives runtime events back. The daemon does not need product semantics; it accepts WS messages, hands them to the matching agent process, and returns the output stream. This boundary is the entire architecture.
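Once a daemon hosts more than one agent, handing a message to "the matching agent process" implies a lookup by agent ID. Here is a minimal sketch of that lookup, assuming each WebSocket frame is JSON with hypothetical `agentId` and `text` fields (the two-file demo sends plain text instead). `createAgentRouter` is an illustrative name, not something from the demo repo.

```javascript
// Hypothetical registry: agentId -> an object with a stdin-like write() method.
function createAgentRouter() {
  const agents = new Map();
  return {
    register(agentId, stdin) { agents.set(agentId, stdin); },
    unregister(agentId) { agents.delete(agentId); },
    // Returns true if the frame was written to an agent's stdin.
    route(frame) {
      const { agentId, text } = JSON.parse(frame);
      const stdin = agents.get(agentId);
      if (!stdin) return false; // unknown agent: the caller decides what to do
      const msg = {
        type: "user",
        message: { role: "user", content: [{ type: "text", text }] },
      };
      stdin.write(JSON.stringify(msg) + "\n");
      return true;
    },
  };
}

// Usage with a fake stdin that records writes instead of a real child process.
const written = [];
const router = createAgentRouter();
router.register("alice", { write: (line) => written.push(line) });
const delivered = router.route(JSON.stringify({ agentId: "alice", text: "hello" }));
const missed = router.route(JSON.stringify({ agentId: "bob", text: "hello" }));
console.log(delivered, missed, written.length); // true false 1
```

In a real daemon the registered value would be `proc.stdin` for each spawned agent.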
III: Add Multi-Agent and MCP Tools
Raw stream-json events are Claude-specific. If you want to host multiple agents (possibly different runtimes), normalize them into one event shape: outer fields such as type, agentId, and entries tell the product how to route and display the event; the runtime-specific details stay inside. And if agents need to interact with the world, they need tools — injected by the daemon at spawn time via MCP.
Here's a full example: two Claude agents play Gomoku against each other. A game server manages the board. Each agent gets MCP tools to view the board and place stones.
The MCP tool server
```javascript
// MCP stdio server: gives each Claude agent two tools:
//   view_board()      → GET /board from game server
//   place_stone(x, y) → POST /move to game server

// Both values are set by the daemon through the MCP config's env block.
const GAME_SERVER = process.env.GOMOKU_SERVER;
const COLOR = process.env.GOMOKU_COLOR;

const tools = [
  {
    name: "view_board",
    description: "See the current board, whose turn, and last move.",
    inputSchema: { type: "object", properties: {}, required: [] },
  },
  {
    name: "place_stone",
    description: "Place your stone at (x, y). Server validates the move.",
    inputSchema: {
      type: "object",
      properties: {
        x: { type: "integer", description: "Column 0..14" },
        y: { type: "integer", description: "Row 0..14" },
      },
      required: ["x", "y"],
    },
  },
];

async function runTool(name, args) {
  if (name === "view_board") {
    const res = await fetch(`${GAME_SERVER}/board`);
    const d = await res.json();
    return `Turn: ${d.turn} Moves: ${d.moveCount}\n\n${d.board}`;
  }
  if (name === "place_stone") {
    const res = await fetch(`${GAME_SERVER}/move`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ color: COLOR, x: args.x, y: args.y }),
    });
    const d = await res.json();
    if (!res.ok) throw new Error(d.error || "move rejected");
    return `Placed ${COLOR} at (${args.x}, ${args.y}). Next: ${d.turn}.`;
  }
  throw new Error(`unknown tool: ${name}`);
}
```

The daemon injects tools at spawn time
```javascript
// Write a temporary MCP config that points at the tool server
const mcpConfig = {
  mcpServers: {
    gomoku: {
      command: "node",
      args: ["3c_gomoku_mcp.js"],
      env: { GOMOKU_COLOR: COLOR, GOMOKU_SERVER: "http://localhost:9879" },
    },
  },
};
fs.writeFileSync(tmpMcpPath, JSON.stringify(mcpConfig));

// Spawn Claude with rules and tools
const proc = spawn("claude", [
  "--output-format", "stream-json",
  "--input-format", "stream-json",
  "--model", "sonnet",
  "--mcp-config", tmpMcpPath,
  "--append-system-prompt", GOMOKU_RULES,
], { stdio: ["pipe", "pipe", "pipe"] });

// When the game server says "your_turn", nudge Claude
ws.on("message", (data) => {
  const msg = JSON.parse(data.toString());
  if (msg.type === "your_turn") {
    writeUser("Make your move.");
  }
});
```

The daemon also normalizes Claude's stream-json into that shared event shape, so the game server doesn't care whether it's Claude, Codex, or something else:
```javascript
function normalizeAndEmit(agentId, ev) {
  if (ev.type === "system" && ev.subtype === "init") {
    ws.send(JSON.stringify({
      type: "agent:session",
      agentId,
      sessionId: ev.session_id,
    }));
    return;
  }
  if (ev.type === "assistant") {
    const entries = [];
    for (const block of ev.message?.content || []) {
      if (block.type === "text")
        entries.push({ kind: "text", text: block.text });
      if (block.type === "tool_use")
        entries.push({ kind: "tool_start", toolName: block.name, toolInput: block.input });
    }
    if (entries.length) {
      ws.send(JSON.stringify({
        type: "agent:activity",
        agentId,
        entries,
      }));
    }
    return;
  }
  if (ev.type === "result") {
    ws.send(JSON.stringify({
      type: "agent:status",
      agentId,
      status: "idle",
    }));
  }
}
```

Run the gomoku demo in three terminals:
```bash
# Terminal 1: game server (board + TUI)
node 3a_gomoku_server.js

# Terminal 2: black agent
node 3b_gomoku_daemon.js black

# Terminal 3: white agent
node 3b_gomoku_daemon.js white
```

Two agents play a full game. The game server manages turns. Each agent only sees two tools. Neither agent knows the other exists; they just see a board and play.
Multi-agent is scheduling, not telepathy. Agents don't talk to each other. They share state (a board, a thread, a task queue) and a scheduler decides whose turn it is.
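That scheduling idea fits in a few lines. Here is a minimal sketch of a turn scheduler over shared state, assuming a hypothetical `notify(agentId, message)` callback that stands in for sending the `your_turn` frame over WebSocket. `createTurnScheduler` is an illustrative name and does not mirror the demo's game server.

```javascript
// Hypothetical round-robin scheduler: agents never message each other;
// they only change shared state, and the scheduler picks who acts next.
function createTurnScheduler(agentIds, notify) {
  let index = 0;
  const state = { moves: [] }; // shared state, e.g. the board's move list
  return {
    state,
    start() {
      notify(agentIds[index], { type: "your_turn" });
    },
    // Called when the agent whose turn it is has acted (e.g. placed a stone).
    submit(agentId, move) {
      if (agentId !== agentIds[index]) {
        throw new Error(`not ${agentId}'s turn`);
      }
      state.moves.push({ agentId, move });
      index = (index + 1) % agentIds.length;
      notify(agentIds[index], { type: "your_turn" });
    },
  };
}

// Usage: record notifications instead of sending them over a socket.
const log = [];
const scheduler = createTurnScheduler(["black", "white"], (id, msg) => {
  log.push(`${id}:${msg.type}`);
});
scheduler.start();
scheduler.submit("black", { x: 7, y: 7 });
scheduler.submit("white", { x: 7, y: 8 });
console.log(log.join(" -> "));
// black:your_turn -> white:your_turn -> black:your_turn
```

The same shape should carry over to a thread or a task queue: swap the move list for the shared resource and the round-robin rule for whatever policy picks the next agent.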
IV: Deliver Messages to Agents
The gomoku demo is turn-based: the server asks an agent to move. A chat product is different. Users can send Alice a message at any time, so the platform needs a delivery path from "new chat message" to "the right agent process sees it".
This is the chat platform demo. An agent named Alice joins a team chat. The server turns user messages into agent:deliver events. The daemon receives those events and decides how to hand each message to the agent process.
The production Zouk design splits this into two layers: delivery routing decides which agents should receive an agent:deliver frame, and lifecycle/wake policy decides how to wake the selected agent process. Delivery routing doc and idle wake policy have the full contract.
Message delivery paths
```javascript
function deliverMessage(agentId, message) {
  // 1. Process is busy: inject the message into the active turn
  if (proc && !isIdle) {
    writeUser(`New message: ${formatMessage(message)}`);
    return;
  }

  // 2. Process is alive but idle: wake it through stdin
  if (proc && isIdle) {
    isIdle = false;
    writeUser(`New message: ${formatMessage(message)}`);
    return;
  }

  // 3. Process has exited: start it again and resume context
  if (!proc && idleCache) {
    const cachedConfig = idleCache.config;
    const cachedSessionId = idleCache.sessionId;
    idleCache = null;
    startAgent(agentId, {
      ...cachedConfig,
      sessionId: cachedSessionId,
    }, message);
    return;
  }

  // 4. No process/config yet: queue for later
  inbox.push(message);
}
```

Only the third path needs session resume. Resume is not the product feature; it is one implementation detail that lets the daemon stop idle CLI processes while still keeping the next message in the same conversation.
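The four branches can also be written as a pure decision function, which makes each path testable without spawning a process. This is a sketch only: `chooseDeliveryPath` and its return labels are hypothetical names, and the inputs mirror the `proc`, `isIdle`, and `idleCache` variables used above.

```javascript
// Hypothetical pure version of the four delivery branches: no side effects,
// it only names the path the daemon should take.
function chooseDeliveryPath({ hasProcess, isIdle, hasIdleCache }) {
  if (hasProcess && !isIdle) return "inject"; // 1. busy: add to the active turn
  if (hasProcess && isIdle) return "wake";    // 2. alive but idle: wake via stdin
  if (hasIdleCache) return "resume";          // 3. exited: restart and resume
  return "queue";                             // 4. nothing to deliver to yet
}

console.log(chooseDeliveryPath({ hasProcess: true, isIdle: false, hasIdleCache: false })); // inject
console.log(chooseDeliveryPath({ hasProcess: true, isIdle: true, hasIdleCache: false }));  // wake
console.log(chooseDeliveryPath({ hasProcess: false, isIdle: false, hasIdleCache: true })); // resume
console.log(chooseDeliveryPath({ hasProcess: false, isIdle: false, hasIdleCache: false })); // queue
```

The daemon would then switch on the returned label and perform the matching side effect.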
To make that third path work, the daemon caches the session ID when the process exits cleanly:
```javascript
proc.on("exit", (code) => {
  proc = null;
  if (code === 0) {
    // Clean exit → keep enough state to deliver the next message
    idleCache = { config: agentConfig, sessionId };
    console.log(`Agent idle; cached session ${sessionId}`);

    // If messages arrived while exiting, restart immediately
    if (inbox.length > 0) {
      const nextMsg = inbox.shift();
      idleCache = null;
      startAgent(agentId, { ...agentConfig, sessionId }, nextMsg);
    }
  }
});
```

The recursive tool pattern
The chat bridge MCP server is the most interesting part. Unlike the gomoku tools (which interact with an external game), these tools call back to the platform that spawned the agent:
```javascript
// The platform spawns the agent.
// The agent gets tools.
// The tools call back to the platform.
// The agent doesn't know this loop exists.

const tools = [
  {
    name: "send_message",
    description: "Send a message to a channel (#general, #random).",
    inputSchema: {
      type: "object",
      properties: {
        target: { type: "string" },
        content: { type: "string" },
      },
      required: ["target", "content"],
    },
  },
  {
    name: "read_history",
    description: "Read recent messages from a channel.",
    inputSchema: {
      type: "object",
      properties: {
        channel: { type: "string" },
      },
      required: ["channel"],
    },
  },
  {
    name: "check_messages",
    description: "Check for new undelivered messages.",
    inputSchema: { type: "object", properties: {}, required: [] },
  },
];

async function runTool(name, args) {
  if (name === "send_message") {
    const res = await fetch(
      `${SERVER_URL}/internal/agent/${AGENT_ID}/send`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(args),
      },
    );
    // Report a failed send to the agent instead of claiming success
    if (!res.ok) throw new Error(`send failed with status ${res.status}`);
    return `Message sent to ${args.target}.`;
  }
  // ... read_history, check_messages similarly
}
```

```bash
# Terminal 1: chat server (multi-channel TUI)
node 4a_chat_server.js

# Terminal 2: agent daemon
node 4b_chat_daemon.js

# Then type in the server terminal. The agent responds, goes idle,
# and receives the next message through the delivery path above.
```

V: Externalize Memory
Once agents can receive messages, the next question is where durable context lives. A local file such as MEMORY.md is useful as a fast recovery index for one agent process, but it is not enough for a team platform. A platform needs a shared context plane: searchable, scoped, auditable, and usable across machines, restarts, and multiple agents.
OpenViking plays that role. The point is not to upload the whole chat transcript forever. The useful writeback is distilled context: facts, user preferences, decisions, handoff notes, tool results, and resource references that should survive the current CLI process.
Two integration surfaces
- Agent / runtime layer — run an OpenViking memory client or context client, either as a plugin or through the OpenViking MCP endpoint. It retrieves relevant memory/resources at startup or prompt submit, writes back distilled updates during the run, and commits a durable handoff before compaction, reset, or sleep.
- Platform / server layer — do provisioning and governance. The platform binds workspace/account/user/agent identity, issues scoped credentials or bearer tokens, exposes the OpenViking endpoint to daemons/runtimes, and owns permission checks, key rotation, audit, resource ownership, and cross-agent sharing boundaries.
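To make the platform side of that split concrete, here is a minimal sketch of the kind of scope check the platform layer might own. It is not the OpenViking API: the record shape, the scope strings, and the `canRead` function are all hypothetical, chosen only to illustrate scoped access to distilled context.

```javascript
// Hypothetical distilled-memory record (not OpenViking's schema).
const record = {
  id: "mem_001",
  kind: "decision", // e.g. fact, preference, decision, handoff
  text: "Team agreed to ship the daemon before the memory plane.",
  scope: "workspace:acme/agent:alice",
  source: { channel: "#general" },
};

// Hypothetical rule: a token may read a record if one of its scopes equals
// the record's scope or is a parent of it on a "/" segment boundary.
function canRead(token, rec) {
  return token.scopes.some(
    (s) => rec.scope === s || rec.scope.startsWith(s + "/"),
  );
}

const workspaceToken = { scopes: ["workspace:acme"] };
const otherToken = { scopes: ["workspace:other"] };
const prefixToken = { scopes: ["workspace:ac"] }; // shares characters, not a segment

console.log(canRead(workspaceToken, record)); // true
console.log(canRead(otherToken, record));     // false
console.log(canRead(prefixToken, record));    // false
```

Matching on segment boundaries instead of raw string prefixes is what keeps `workspace:ac` from reading `workspace:acme` records.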
The UI work is not just “visualization.” A useful platform lets humans inspect, search, debug, and manage context: where a memory came from, why it was injected, why a query missed, which agent owns it, and whether sensitive content should be deleted, exported, or re-scoped.
A concrete Claude Code version already exists as an OpenViking memory plugin, and a runtime-agnostic path is the OpenViking MCP endpoint. Claude Code memory plugin and OpenViking MCP guide are the reference paths.
The Build Order
- Own the process — spawn Claude as a child, talk JSON over stdin/stdout. Save the session ID.
- Split server and daemon — server handles users and routing. Daemon handles the local process. WebSocket in between.
- Normalize and inject tools — translate runtime events into a shared product event shape. Give agents MCP tools to interact with the product.
- Deliver agent messages — route each new message into the right running, idle, or restarted agent process.
- Externalize memory — connect agents to OpenViking as a shared context plane with scoped credentials, retrieval, writeback, audit, and human management UI.
Once these pieces are in place, "multi-agent" stops being magic. It's a scheduler waking named processes that can work, use tools, and remember.
Beyond Claude: Runtime Drivers
Every example above uses Claude Code, but the daemon architecture is runtime-agnostic. The trick is a thin abstraction called a runtime driver — each driver knows how to spawn a specific CLI, parse its event stream, and encode messages into its stdin protocol:
```typescript
interface Driver {
  id: string;
  spawn(ctx: SpawnContext): { process: ChildProcess };
  parseLine(line: string): ParsedEvent[];
  encodeStdinMessage(text: string, sessionId: string | null): string | null;
  buildSystemPrompt(config: AgentConfig, agentId: string): string;
  busyDeliveryMode: "notification" | "direct" | "none";
}
```

In practice, agent CLIs fall into three protocol families:
| Protocol | Runtimes | How it works |
|---|---|---|
| stream-json | Claude, Cursor | NDJSON over stdio. Events: system.init, assistant (text/tool_use/thinking), result. |
| ACP (JSON-RPC) | Codex, Hermes, OpenCode, Coco | JSON-RPC 2.0 over stdio. Methods: session/new, session/prompt, session/update. MCP servers passed in session/new. |
| Custom JSON events | Copilot, Gemini | JSON event streams with runtime-specific shapes. One-shot per turn — no stdin delivery. |
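As an illustration of the first row, here is a minimal sketch of two driver methods for the stream-json family, written in plain JavaScript. The `kind` values in the returned events are an assumed `ParsedEvent` shape, and `streamJsonDriver` is a hypothetical name; only the input event fields come from the stream-json samples earlier in this post.

```javascript
// Hypothetical partial driver for the stream-json family (parse + encode only).
const streamJsonDriver = {
  id: "claude",
  busyDeliveryMode: "notification",

  // One NDJSON line in, zero or more normalized events out.
  parseLine(line) {
    let ev;
    try {
      ev = JSON.parse(line);
    } catch {
      return []; // ignore lines that are not JSON
    }
    if (ev.type === "system" && ev.subtype === "init") {
      return [{ kind: "session", sessionId: ev.session_id }];
    }
    if (ev.type === "assistant") {
      const out = [];
      for (const block of ev.message?.content || []) {
        if (block.type === "text") out.push({ kind: "text", text: block.text });
        if (block.type === "tool_use") {
          out.push({ kind: "tool_start", toolName: block.name, toolInput: block.input });
        }
      }
      return out;
    }
    if (ev.type === "result") {
      return [{ kind: "turn_end", stopReason: ev.stop_reason }];
    }
    return [];
  },

  // The session ID is unused here: resume happens via --resume at spawn time.
  encodeStdinMessage(text, _sessionId) {
    const msg = {
      type: "user",
      message: { role: "user", content: [{ type: "text", text }] },
    };
    return JSON.stringify(msg) + "\n";
  },
};

const events = streamJsonDriver.parseLine(
  '{"type":"assistant","message":{"content":[{"type":"text","text":"hi"}]}}',
);
console.log(events[0].kind, events[0].text); // text hi
```

An ACP driver would implement the same two methods with JSON-RPC framing instead, so the daemon's read loop stays unchanged.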
The delivery mode matters for product behavior. When a message arrives while the agent is busy:
- direct (Codex, Hermes): pipe into the active turn via stdin. Agent sees it mid-work.
- notification (Claude): store and surface via check_messages tool. Agent polls when ready.
- none (Copilot, Gemini): queue until the turn ends, then restart with the queued message.
The daemon absorbs all of this. The server sends agent:deliver regardless of runtime — the driver decides how to get the message to the agent.
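A minimal sketch of how a daemon might branch on that mode when a message arrives mid-turn. The `agent` object with `stdin`, `pending`, and `queue` fields is a hypothetical shape for illustration, as is the function name `deliverWhileBusy`; only the three mode names come from the driver interface above.

```javascript
// Hypothetical dispatcher for a message that arrives while the agent is busy.
function deliverWhileBusy(driver, agent, text) {
  switch (driver.busyDeliveryMode) {
    case "direct":
      // Pipe straight into the active turn.
      agent.stdin.write(driver.encodeStdinMessage(text, agent.sessionId));
      return "written";
    case "notification":
      // Store it; the agent picks it up through its check_messages tool.
      agent.pending.push(text);
      return "stored";
    case "none":
      // Hold it until the turn ends, then restart with the queued message.
      agent.queue.push(text);
      return "queued";
    default:
      throw new Error(`unknown busyDeliveryMode: ${driver.busyDeliveryMode}`);
  }
}

// Usage with fake drivers and an agent that records what happened.
const writes = [];
const agent = {
  sessionId: "sess_demo_01",
  stdin: { write: (s) => writes.push(s) },
  pending: [],
  queue: [],
};
const directDriver = { busyDeliveryMode: "direct", encodeStdinMessage: (t) => t + "\n" };
const notifyDriver = { busyDeliveryMode: "notification" };
const oneShotDriver = { busyDeliveryMode: "none" };

console.log(deliverWhileBusy(directDriver, agent, "a"));  // written
console.log(deliverWhileBusy(notifyDriver, agent, "b"));  // stored
console.log(deliverWhileBusy(oneShotDriver, agent, "c")); // queued
```

Because the branch lives in the daemon, the server-side `agent:deliver` frame can stay identical for every runtime.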
All Examples
Every code snippet in this post is from a runnable demo. The full source is at github.com/ZaynJarvis/agent-runtime:
- 0_run_claude.js: basic spawn + NDJSON communication
- 1a_ws_server.js + 1b_ws_daemon.js: WebSocket server/daemon split
- 2a/2b: event-shape normalization (zouk protocol)
- 3a/3b/3c: two-agent gomoku game with MCP tools
- 4a/4b/4c: chat platform with agent message delivery
- test_resume.js: session resume proof

