
OpenViking for Claude Code & Codex: Give Your Coding Agent Persistent Memory

One command gives Claude Code and Codex persistent, cross-session memory powered by OpenViking. Experience automatic knowledge accumulation, semantic recall, and multi-platform sharing.

Every time you open a new terminal window, your Coding Agent develops amnesia. OpenViking fixes this by giving Claude Code and Codex persistent, cross-session, and cross-device memory—completely automatically. One command to install, zero changes to your workflow.

The Problem

  1. Memory Silos: Context and history remain trapped in silos. Switching between machines or agents means you are always starting from scratch.
  2. No Experience Reuse: Current agent "memory" is merely the active context window patched with a static CLAUDE.md. It fails to accumulate real-world experience across different tasks or repositories.
  3. Constant Resets: Start a new thread, and your architectural decisions, hard-won debugging insights, and coding preferences vanish. You're back to square one.

You're forced into a frustrating loop: either meticulously maintain dense environment documents by hand, or sound like a broken record briefing the AI in every single session.

OpenViking completely solves this.


Quick Start

Whether you run OpenViking locally or use Volcengine Cloud, installation takes just one command:

Claude Code Plugin

```bash
bash <(curl -fsSL https://raw.githubusercontent.com/volcengine/OpenViking/main/examples/claude-code-memory-plugin/setup-helper/install.sh)
```

Codex Plugin

```bash
bash <(curl -fsSL https://raw.githubusercontent.com/volcengine/OpenViking/main/examples/codex-memory-plugin/setup-helper/install.sh)
```

The interactive installer guides you through connecting to a local server (no auth required) or a remote instance (via API key). It automatically configures ~/.openviking/ovcli.conf, clones the plugin source, and registers all necessary hooks and MCP services.


What the Plugin Gives Your Agent

Once integrated, your Coding Agent gains two core superpowers: invisible conversation hooks and proactively callable MCP tools.

Conversation Hooks: Invisible Read/Write

Hooks trigger automatically at critical points in the conversation lifecycle. They require zero explicit tool calls, making them completely transparent to both you and the model.

  1. SessionStart (session begins or resumes): Automatically injects your developer profile, an index of available memories, and a summary of the previous session's Working Memory.
  2. UserPromptSubmit (user sends a message): Performs semantic search against OpenViking based on your prompt, injecting the most relevant memories as context patches.
  3. Stop (model completes a turn): Stages the completed conversation turn into the current OpenViking Session.
  4. PreCompact (context about to be compressed): Force-commits the conversation history to prevent any detail loss before the model truncates the context window.
  5. SessionEnd (session terminates): Commits all pending records and triggers the asynchronous background memory distillation process.
  6. SubagentStart/Stop (sub-agent spawns or exits): Provisions isolated sessions for sub-agents to prevent memory namespace pollution.

"You just focus on coding. The plugin remembers everything worth remembering and seamlessly feeds it back to the model exactly when it's needed." (Design principle)

You won't feel any performance hit. The write path is asynchronous by default—hooks like Stop and SessionEnd instantly return approve to keep the chat flowing smoothly, while a detached background worker handles the actual HTTP requests. Zero perceived latency.
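The non-blocking write path can be sketched roughly as follows. This is a minimal Python illustration, not the plugin's actual code: the real plugin runs on Node.js and uses a detached worker process, whereas this sketch uses a worker thread. `BackgroundWriter`, `on_stop`, and the `{"decision": "approve"}` response shape are hypothetical names chosen for the example.

```python
import queue
import threading


class BackgroundWriter:
    """Hands payloads to a worker thread so the caller never waits on I/O."""

    def __init__(self, send):
        self._send = send                  # callable doing the slow work, e.g. an HTTP POST
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            payload = self._queue.get()
            try:
                self._send(payload)
            except Exception:
                pass                       # a failed write must never break the chat
            finally:
                self._queue.task_done()

    def submit(self, payload):
        self._queue.put(payload)           # returns immediately

    def drain(self):
        self._queue.join()                 # block until all queued writes finish


def on_stop(turn, writer):
    """Hypothetical Stop-hook handler: queue the turn, approve right away."""
    writer.submit(turn)
    return {"decision": "approve"}
```

The hook's return value never depends on the network call, which is why the chat keeps flowing even when the backend is slow or unreachable.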

MCP Tools: Active Memory Management

Beyond automatic hooks, the plugin exposes standard MCP tools that the agent can actively query when it lacks context:

| Tool | Purpose |
| --- | --- |
| search | Semantic search across history, resources, and skills |
| read | Fetch the complete contents of a viking:// URI |
| list | Browse the memory directory structure (supports recursion) |
| remember | Proactively lock current context into long-term memory |
| add_resource | Import external files or URLs as knowledge sources (with auto-refresh) |
| grep / glob | Regex text search / pattern-based file matching |
| forget | Clean up redundant or outdated memories |
| health | Check backend service health status |

These tools upgrade your agent from a passive listener to an active investigator. If the agent needs to recall an architectural spec from last week, it can proactively trigger search to retrieve the exact details on its own.
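For a sense of what such a call looks like on the wire, here is a small sketch that builds an MCP `tools/call` request (MCP uses JSON-RPC 2.0). The tool name `search` comes from the table above; the argument names `query` and `limit` are assumptions for illustration, since the tool's real input schema is not shown here.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request IDs must be unique per connection


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are hypothetical; check the server's tool schema.
request = tool_call("search", {"query": "architecture spec from last week", "limit": 5})
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this for the agent; the sketch only shows the shape of the message.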


How Memory Accumulates and Distills

This is where OpenViking truly diverges from standard RAG setups. Conversations aren't just blindly embedded; they undergo rigorous, multi-stage memory lifecycle management.

Conversation Storage and Archival

Every terminal window maps to an OpenViking Session. In-progress dialog lives at viking://session/{id}/messages.jsonl. Once a session concludes or hits a token threshold (default 8,000), the archival mechanism kicks in.

[Figure: Session storage structure, showing in-progress conversations and archived history]

Archival is more than just moving files around. It operates in two critical phases:

Phase 1: Message Archival — Old messages are shifted to the archive directory. The active Session retains only a sliding window of recent turns, preventing infinite context bloat.
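A rough sketch of Phase 1, assuming archival triggers once the active session reaches the token threshold. The 8,000 default comes from the text above; `keep_recent` (the window size) and the 4-characters-per-token estimate are assumptions made for this example, not documented OpenViking values.

```python
def estimate_tokens(message):
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(message["content"]) // 4)


def archive_if_needed(active, archive, token_threshold=8000, keep_recent=4):
    """Move older messages to `archive` once `active` reaches the threshold.

    Returns True if archival happened. `keep_recent` is an assumed window size.
    """
    total = sum(estimate_tokens(m) for m in active)
    if total < token_threshold or len(active) <= keep_recent:
        return False
    cutoff = len(active) - keep_recent
    archive.extend(active[:cutoff])   # shift old messages to the archive
    del active[:cutoff]               # the active session keeps only recent turns
    return True
```

Both lists are mutated in place, so the active session shrinks to the sliding window while the archive grows in chronological order.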

Phase 2: Memory Distillation (async) — This is where the magic happens. OpenViking spins up an asynchronous LLM task in the background to execute deep processing:

  1. Generate Working Memory: Condense the archived chat into a structured 7-part summary (title, current state, goals, key decisions, relevant files, error fixes, open issues).
  2. Extract 8 types of structured memories across two dimensions, user and agent:
     • User / Profile: Identity, tech background, and core skills, continuously merged into a single source of truth.
     • User / Preferences: Code styling, tooling choices, and workflow habits.
     • User / Entities: Specific projects, people, and abstract concepts discussed.
     • User / Events: Key architectural decisions, milestones, and changelogs.
     • Agent / Cases: Classic problem–solution pairs.
     • Agent / Patterns: Reusable logic workflows and strategies.
     • Agent / Tools: Tool usage insights: success rates, timings, and parameter configs.
     • Agent / Skills: Memory traces of successfully executed workflows.
  3. Perform Knowledge Fusion: Existing memories aren't blindly overwritten; new insights are merged into them. For instance, your profile.md grows sharper and more accurate with every session.

[Figure: Memory directory structure, with organized memories stored under viking://user/{name}/memories/]
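One way to picture the distillation output is as plain data structures. The sketch below models the 7-part Working Memory and the 8 memory types. The class names, field names, enum values, and the section headings other than "Session Title", "Current State", and "Open Issues" are illustrative assumptions, not OpenViking's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryType(Enum):
    # User dimension
    PROFILE = "user/profile"
    PREFERENCES = "user/preferences"
    ENTITIES = "user/entities"
    EVENTS = "user/events"
    # Agent dimension
    CASES = "agent/cases"
    PATTERNS = "agent/patterns"
    TOOLS = "agent/tools"
    SKILLS = "agent/skills"


@dataclass
class WorkingMemory:
    title: str
    current_state: str
    goals: list[str] = field(default_factory=list)
    key_decisions: list[str] = field(default_factory=list)
    relevant_files: list[str] = field(default_factory=list)
    error_fixes: list[str] = field(default_factory=list)
    open_issues: list[str] = field(default_factory=list)

    def to_markdown(self):
        """Render the summary as Markdown sections, skipping empty lists."""
        lines = ["# Working Memory", "## Session Title", self.title,
                 "## Current State", self.current_state]
        sections = [
            ("Goals", self.goals),
            ("Key Decisions", self.key_decisions),
            ("Relevant Files", self.relevant_files),
            ("Error Fixes", self.error_fixes),
            ("Open Issues", self.open_issues),
        ]
        for heading, items in sections:
            if items:
                lines.append(f"## {heading}")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```

Keeping the summary structured rather than free-form is what lets a later session inject only the sections it needs.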

Memory Temperature Decay

Human memory naturally fades over time, and OpenViking mimics this behavior. A memory's "Hotness" is dictated by both its access frequency and time decay:

Hotness Decay Formula
hotness = sigmoid(log1p(access_count)) × exp(-decay_rate × age_days)
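Default half-life: 7 days, so the time factor halves every week. A memory left untouched for 30 days keeps only about 5% of that factor, which suggests age_days is counted from the last access. Frequently used memories therefore stay hot mainly because each access refreshes their age; the frequency factor sigmoid(log1p(access_count)) itself only ranges from 0.5 to 1.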
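The formula is easy to try out directly. The sketch below implements it as written, assuming the 7-day half-life applies to the exponential term, so that decay_rate = ln(2) / 7. The function and constant names are chosen for this example only.

```python
import math

HALF_LIFE_DAYS = 7.0
DECAY_RATE = math.log(2) / HALF_LIFE_DAYS   # about 0.099 per day


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hotness(access_count, age_days, decay_rate=DECAY_RATE):
    """hotness = sigmoid(log1p(access_count)) * exp(-decay_rate * age_days)"""
    return sigmoid(math.log1p(access_count)) * math.exp(-decay_rate * age_days)


for n, age in [(0, 0), (0, 7), (20, 7), (20, 30)]:
    print(f"access_count={n:>2} age_days={age:>2} hotness={hotness(n, age):.3f}")
```

Note that sigmoid(log1p(n)) simplifies to (n + 1) / (n + 2), so the frequency factor starts at 0.5 for a never-accessed memory and saturates toward 1. Time decay is the dominant term once a memory stops being touched.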

Real-World Experience: An AI That Actually "Gets" You

Session Start: The Auto-Injected Profile

Whenever you launch or resume a chat, the plugin silently constructs and injects a bespoke context payload right from the start:

Injected context:

```xml
<openviking-context source="resume">
<user-profile uri="viking://user/zeus/memories/profile.md">
# Zayn (@zaynjarvis)
- Role: Senior Engineer
- Repos: volcengine/OpenViking, ZaynJarvis/*, ...
- Focus: OpenViking optimization, swarm model analysis, Zouk UI
- 2026-05-19: Completed atlas-fs toolcall toggle optimization...
</user-profile>
<available-memories>
  viking://user/zeus/memories/preferences/
    - @zaynjarvis/code_development_preference.md
    - ... (13 preference files)
  viking://user/zeus/memories/entities/
    - software_project/openviking.md
    - ... (20 entity files)
</available-memories>
<session-archive>
  <archive-overview>
# Working Memory
## Session Title
Zeus Agent Post-Restructuring Task Catch-Up & Atlas Fix
## Current State
Zeus completed atlas-fs toolcall toggle fix (PR #6)...
## Open Issues
- Zouk UI task clarification pending
- Open issues: #23, #42, #50
  </archive-overview>
</session-archive>
</openviking-context>
```

The days of typing "Hi, last time we were working on XXX..." are officially over.

Mid-Coding: Precision Recall

With every prompt you submit, the plugin executes a millisecond-level semantic search under the hood. Ask a passing question like "How does OpenViking handle MCP OAuth?" and it instantly fetches relevant history:

Recalled memories:

```text
- [memory 60%] # MCP-Key2OAuth
  - https://github.com/t0saki/MCP-Key2OAuth...
- [memory 55%] # OpenViking MCP
  - find, search, read, list...
- [memory 54%] # OpenViking
  - MCP Starlette Route...
```

The model uses these recalled memories to synthesize a precise answer, saving you from digging through project docs manually. The relevance score (e.g., 60%) tells the model how closely each memory matches the prompt, so it can weigh weaker matches accordingly.

Cross-Session Enlightenment

This is where you get that "this AI actually gets me" feeling. During a recent architecture review, OpenViking proactively recalled a side-by-side comparison of three design patterns we had discussed weeks prior, in a completely unrelated terminal window:

[Figure: Cross-session recall of an architecture comparison. OpenViking automatically surfaced a design-phase architecture comparison from weeks ago]

"Having your past technical wisdom automatically surface exactly when you need it"—this is an experience that manually updating a CLAUDE.md file can never replicate.

Breaking Down Walls: Cross-Platform Memory Sharing

Because of its client-server architecture, your memory assets reside securely on the OpenViking server, rather than being fragmented across dozens of local project folders.

Run both the Claude Code and Codex plugins, and they share a single, unified brain. A tricky bug you resolved in Claude Code instantly becomes knowledge available to Codex.

It gets better. Because OpenViking exposes standard MCP endpoints, *any* MCP-compatible client can tap into it. Connect a standard desktop Claude Chat to OpenViking, and you can instantly generate a comprehensive weekly dev report pulling from all your recent terminal activities:

[Figure: Claude Chat auto-generates a weekly report by pulling from OpenViking memories]

You can even upload the finalized report back into OpenViking, forming a perfect memory loop:

[Figure: Report content uploaded back to OpenViking as long-term memory]

Claude Code vs Codex Plugin Differences

Though built on the same core philosophy, the plugins have slight implementation differences to accommodate their respective host environments:

| Feature | Claude Code | Codex |
| --- | --- | --- |
| Hook Count | 7 (full SessionEnd + sub-agent lifecycle) | 4 (no explicit exit callback) |
| Session Start | Inject profile + historical Working Memory | Heuristics: automatically commit previous idle sessions |
| Session End | Accurate SessionEnd trigger | Next-start check + 30min idle fallback |
| MCP Transport | Standard HTTP (direct /mcp endpoint) | Streamable HTTP (with env token support) |
| Sub-agents | Fully isolated sessions | Not yet supported |
| Runtime | Node.js | Node.js 22+ |
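Because Codex has no explicit exit callback, session end has to be inferred. The sketch below shows one plausible reading of the "next-start check + 30min idle fallback" row: on startup, commit every previously open session that has been idle for at least 30 minutes. The function names and the shape of `open_sessions` are assumptions for illustration, not the plugin's actual logic.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(minutes=30)


def should_commit(last_activity, now, idle_limit=IDLE_LIMIT):
    """True once a session has been idle for at least `idle_limit`."""
    return now - last_activity >= idle_limit


def sessions_to_commit(open_sessions, now):
    """On the next start, pick every previously open session that has gone idle.

    `open_sessions` maps a session ID to its last-activity timestamp.
    """
    return sorted(sid for sid, last in open_sessions.items() if should_commit(last, now))
```

A session that was active five minutes ago is left alone, since the user may simply have another terminal open.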

Advanced Tuning and Security

The plugin offers extensive configuration options via environment variables or your ovcli.conf file:

~/.openviking/ovcli.conf

```bash
# Memory recall tuning
OPENVIKING_RECALL_LIMIT=6           # Max 6 memories per injection
OPENVIKING_SCORE_THRESHOLD=0.35     # Filter out items below 35% relevance
OPENVIKING_RECALL_TOKEN_BUDGET=2000 # Cap injected token count to protect context

# Capture strategy
OPENVIKING_CAPTURE_MODE=semantic    # semantic (default continuous) or keyword (triggered)
OPENVIKING_CAPTURE_ASSISTANT_TURNS=true  # Include AI replies in memory extraction

# Debug & Overrides
OPENVIKING_DEBUG=true               # Output logs: ~/.openviking/logs/cc-hooks.log
OPENVIKING_BYPASS_SESSION=true      # Disable hooks entirely for highly sensitive sessions
```
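To see how the three recall settings interact, here is a minimal sketch of a selection step: drop candidates below the score threshold, rank the rest, then stop at the count limit while staying within the token budget. The defaults mirror the values above, but the function names, the candidate dict shape, and the skip-if-too-large behavior are assumptions rather than the plugin's documented algorithm.

```python
def select_memories(candidates, limit=6, score_threshold=0.35, token_budget=2000):
    """Pick memories to inject: filter by score, rank, then cap by count and tokens.

    Each candidate is a dict with "score" (0..1), "tokens" (int), and "text".
    """
    ranked = sorted(
        (c for c in candidates if c["score"] >= score_threshold),
        key=lambda c: c["score"],
        reverse=True,
    )
    selected, used = [], 0
    for memory in ranked:
        if len(selected) == limit:
            break
        if used + memory["tokens"] > token_budget:
            continue      # skip what does not fit; a smaller memory may still fit
        selected.append(memory)
        used += memory["tokens"]
    return selected


def render(memories):
    """Format selected memories in the '[memory NN%]' style shown earlier."""
    return "\n".join(f"- [memory {round(m['score'] * 100)}%] {m['text']}" for m in memories)
```

Raising the threshold trades recall for precision, while the token budget is the hard guard that keeps injected context from crowding out the actual conversation.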

Security by Design

  • Credentials never touch the disk: API keys are dynamically injected into process environment variables via a shell wrapper. They are never written to .mcp.json, remaining entirely invisible to npm scripts and crash dumps.
  • Self-pollution prevention: Before sending conversational records for LLM distillation, the plugin meticulously strips out <openviking-context> tags, ensuring that "answers generated using memory" aren't recursively stored as new knowledge.
  • Sub-agent namespace isolation: Divergent thoughts from sub-agents are strictly isolated. Each receives a unique session ID like cc-<session>__agent-<id> to prevent them from polluting the main conversational timeline.
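The last two safeguards are simple enough to sketch. The example below assumes that "stripping the tags" means removing each whole `<openviking-context>` block, content included, and builds sub-agent session IDs following the `cc-<session>__agent-<id>` pattern quoted above. The function names are hypothetical.

```python
import re

# Matches an entire injected block, including everything between the tags.
_CONTEXT_BLOCK = re.compile(
    r"<openviking-context\b[^>]*>.*?</openviking-context>\s*",
    re.DOTALL,
)


def strip_injected_context(text):
    """Remove injected <openviking-context> blocks before a turn is stored."""
    return _CONTEXT_BLOCK.sub("", text)


def subagent_session_id(session_id, agent_id):
    """Build an isolated session ID following the cc-<session>__agent-<id> pattern."""
    return f"cc-{session_id}__agent-{agent_id}"
```

The non-greedy match matters here: with several injected blocks in one transcript, each is removed separately and the text between them survives.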

Why Not Just Use MEMORY.md?

OpenViking is designed as a powerful complement and upgrade, not a strict replacement for native memory systems.

| Dimension | Native MEMORY.md / AGENTS.md | OpenViking Plugin |
| --- | --- | --- |
| Storage Format | Flat Markdown text | Vector DB + relational graphs + structured objects |
| Retrieval Engine | Entire file dumped blindly into context window | Precision semantic similarity + strict token budgets |
| Operational Scope | Trapped inside a single project directory | Cross-project, cross-session, cross-client connectivity |
| Capacity Limit | ~200 lines max (bounded by context) | Virtually unlimited (backed by server storage) |
| Knowledge Entry | Requires manual developer curation | LLM-driven automated, implicit extraction |

Keep using MEMORY.md for static, project-wide coding conventions. But for dynamic, organically growing knowledge—like "how did I bypass that weird auth bug last Tuesday?" or "what's my preferred way to structure React hooks?"—let OpenViking's automated memory engine handle the heavy lifting.


Conclusion

By integrating OpenViking, your Coding Agent evolves from a stateless, forgetful utility into an intelligent pair programmer that learns your habits and scales with your expertise:

  • Auto-Accumulate: Every debugging session and strategic decision becomes a permanent digital asset. Zero manual maintenance.
  • Smart Recall: Historical context surfaces naturally exactly when needed. No pre-prompting required.
  • Cross-Platform: Break out of context silos. Accumulate once, and share across Claude Code, Codex, Claude Chat, and any MCP client.
  • Continuous Evolution: Memories constantly merge, distill, and gracefully decay, ensuring your AI always operates with maximum information density.

If you're exhausted from endlessly re-onboarding your AI assistant every time you open a new terminal, it's time to install OpenViking.