# OpenViking: The Database Paradigm for Context Engineering

OpenViking is a context database for AI agents. It treats context as data: ingested, organized, indexed, summarized, scoped, retrieved, updated, remembered, and read through stable commands.

Context engineering =

```text
reliable reasoning constraints
+ complete information organization
+ effective context recommendation
+ full-lifecycle memory
+ traceable self-evolving learning
```

OpenViking mainly provides complete information organization. It also supports context recommendation and long-lifecycle memory.

## Public Resources

- GitHub: https://github.com/volcengine/OpenViking
- Docs: https://docs.openviking.ai/
- Discord: https://discord.gg/FyTkZ3ZKKm
- OpenClaw memory plugin guide: https://github.com/volcengine/OpenViking/blob/main/examples/openclaw-plugin/INSTALL-ZH.md
- Human page: https://blog.openviking.ai/post/openviking-context-database/
- Agent-readable page: https://blog.openviking.ai/post/openviking-context-database/llm.txt

## Context Primitives

| Primitive | Contribution | Limit |
| --- | --- | --- |
| Prompt | Instructions, roles, rules, examples, and output targets. | Brittle when prompts become long-lived team knowledge. |
| RAG / KnowledgeBase | Domain knowledge before generation. | Needs good ingestion, chunking, summaries, refresh, and evidence paths. |
| Web Search | Recent public information. | Source quality, injection, SEO, and trust risks. |
| Tools / MCP | APIs, functions, and systems. | Tool calls still depend on the right context and validation. |
| Skills | SOPs, workflows, rules, and tool usage patterns. | Needs broader retrieval and evidence support. |
| Memory | Experience, preferences, facts, and outcomes. | Only useful when summarized, scoped, retrieved, updated, and explainable. |

## Pain Points

- Cross-repository coding breaks local context.
- Long-running agents forget preferences, corrections, and task constraints.
- Team knowledge is scattered across Git, docs, chats, meetings, PDFs, images, and references.
- Agents miss standards, taste, historical tradeoffs, and local delivery expectations.

The shared issue is context organization: sources, structure, recall, trust, and memory are not stable enough for complex tasks.

## Technical Model

Context is not a fixed-schema object. It may be code, docs, images, meetings, conversations, links, summaries, or memories. OpenViking combines several information organization paradigms.

| Paradigm | Value | Limit |
| --- | --- | --- |
| Vector index | Semantic relevance and modality-agnostic retrieval. | Weak at hierarchy, exact filtering, and relationship explanation. |
| File system | Paths, hierarchy, and traversal. | Does not solve semantic discovery alone. |
| Table | Metadata, filtering, owner, time, type, and permission fields. | Awkward as the primary model for messy multimodal context. |
| Graph | Relationships and path explanation. | Expensive to model and maintain. |

OpenViking uses vector search for semantic entry points, file-system semantics for traversal, limited metadata for governance, and relations where they add value.

## Design Constraints

- Semantic by default: users should not choose schemas or modalities before adding data.
- Simple surface: humans and agents use a small filesystem-like interface.
- Agent-friendly commands: `ov ls`, `ov find`, `ov tree`, `ov abstract`, `ov overview`, `ov read`.
- Token discipline: summaries and staged reading avoid reading-window waste.
- Relations without graph burden: links and relations are useful, but graph modeling is not required first.

## CLI Path

Start `openviking-server`, then use the CLI.
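The start-then-use order can be sketched as a small readiness loop. This is a minimal sketch and not part of OpenViking: `wait_for` is a hypothetical helper, and the commented usage line assumes that `ov status` exits with a non-zero code while the server is not yet reachable, which the source does not confirm.

```bash
# wait_for: retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command...>
# Hypothetical helper for illustration; variables are global because
# POSIX sh has no `local`.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Run the probe command quietly; stop as soon as it succeeds.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Assumption: `ov status` fails (non-zero exit) until the server is ready.
# wait_for 10 1 ov status && ov ls
```

If the exit-code assumption does not hold for your version, replace the probe with whatever readiness check the repository README recommends.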
### Ingest

```bash
ov add-resource https://github.com/volcengine/OpenViking
ov add-resource https://arxiv.org/pdf/2602.09540
ov add-resource ./team_building.jpg
ov add-resource ./project.docx
ov add-resource ./team-docs.zip
```

### Discover

```bash
ov ls
ov find "How does OpenViking use VikingDB?" --uri=viking://resources/code/volcengine/OpenViking
ov tree viking://resources/code/volcengine/OpenViking/examples/ -L 2
ov abstract viking://resources/code/volcengine/OpenViking
ov read viking://resources/code/volcengine/OpenViking/examples/cloud/GUIDE.md
```

Read path: `ls/find/tree` for L0 summaries, `abstract` for summary, `overview` for structure, `read` for full evidence.

### Maintain

```bash
ov mv viking://resources/photo/20260102/example.jpg viking://resources/photo/20260103/
ov rm viking://resources/photo/20260102/example.jpg
ov rm -r viking://resources/photo/20260102/
```

### Reuse Skills, Memory, Bot, and Observability

```bash
ov add-skill ./my-skill/examples/openviking-cli-skills
ov find "OpenViking usage tips" --uri=viking://agent/skills
ov add-memory ./2026-03-04/memory-2026-03-04.md
openviking-server --with-bot
ov chat -m "Ask your question"
ov status
ov observer vlm
```

## Product Boundary

| Capability | OpenViking context database | Vector database | File system |
| --- | --- | --- | --- |
| Data operations | Add, delete, query, update | Add, delete, query, update | Add, delete, update; query needs other tools |
| Inputs | Files, text, links, conversation history | Vectors, scalar metadata, vectorizable content | Files |
| Semantic retrieval | Yes, backed by vector search | Yes | No |
| Keyword retrieval | Sparse vectors or grep | Sparse vectors and keyword indexes | Grep |
| Hierarchy | Preserved and exposed to agents | Usually not preserved | Native |
| Parsing and summaries | Automatic parsing, L0 summaries, overview paths | Usually external | Not built in |
| Agentic reading | `ls`, `tree`, `find`, `abstract`, `overview`, `read` | Not directly exposed | Traversal only |
| Data isolation | Account, user, and agent dimensions | Scalar metadata | User/group controls |
| Built-in capabilities | Memory plugin, bot, native RAG direction | Not included | Not included |
| Deployment | Local/self-hosted today; managed and distributed options planned | Managed cloud service | Local or object storage |
| Original files | Not retained by default; still being refined | Not retained | Retained |

Boundary summary:

- Vector search answers what is semantically close.
- File systems answer where something lives.
- OpenViking answers how agents use context as managed data.

## Document Decomposition

OpenViking does not force a long document to remain one file. It decomposes, reorganizes, and summarizes it for staged reading.

| Stage | Output | Agent action |
| --- | --- | --- |
| Input document | docx, pdf, markdown, web page, folder, or archive becomes a managed resource. | Use `ov add-resource`, not prompt paste. |
| Chapter paths | Sections become navigable paths such as `viking://resources/docs/project/03-design/`. | Use `ov tree` or `ov ls` first. |
| Content modules | Each unit carries a point, process, interface, case, or decision. | Use `ov find`, then `ov overview`. |
| Modal elements | Tables, images, code blocks, links, and attachments become searchable context. | Follow URIs into specific elements. |
| Summary ladder | L0 summaries, abstracts, overview, and full content form a reading path. | Read summaries first; use `ov read` only when needed. |

Goal: vectorizable units, coherent meaning, and lower model-window cost.

## Team Adoption

Context processing efficiency sets the ceiling for team AI work. OpenViking connects scattered resources into one context layer.

### Deploy

```bash
uv venv openviking-env
source openviking-env/bin/activate
uv pip install openviking --upgrade
# Configure OpenViking according to the repository README.
nohup openviking-server > openviking.log 2>&1 &
ov status
```

Use local or self-hosted validation first. Configure access and credentials before adding private repositories.

### Build the Corpus

1. Connect core repositories and stable documents.
2. Add meeting notes, chats, project records, and external references.
3. Turn repeated workflows into skills and repeated preferences into memory.

### Demo A: Multi-Repository Technical Question

The goal is cross-repo, cross-doc context for real engineering questions.

## OpenClaw Memory

OpenClaw shows the memory problem clearly. Longer tasks need preferences and corrections as retrievable context instead of raw chat messages.

| Pain point | OpenViking practice |
| --- | --- |
| Repeated preferences | Store user requirements and team conventions as searchable memory. |
| High retry cost | Use session summaries and `add-memory` for reusable experience. |
| Long tasks | Let OpenClaw retrieve long-term context through OpenViking. |
| Scattered knowledge | Unify code, documents, meetings, chats, and references. |

```bash
curl -fSL https://openclaw.ai/install.sh | bash
# Follow the OpenViking memory plugin guide:
# https://github.com/volcengine/OpenViking/blob/main/examples/openclaw-plugin/INSTALL-ZH.md
ov add-memory ./2026-03-04/memory-2026-03-04.md
```

Good memory is summarized, compressed, reorganized, scoped, and explainable. It is not infinite chat retention.

## VikingBot

VikingBot is a native agent interface for testing ingestion, retrieval, summaries, and reading paths.

```bash
openviking-server --with-bot
ov chat -m "Ask your question"
ov status
ov observer vlm
```

With `--with-bot`, `ov chat` can use connected resources, skills, summaries, retrieval, and memory context.

## Core Judgments and Roadmap

Core judgments:

- Larger context corpora make retrieval quality and organization more important.
- High-efficiency teams need a context database for full-domain information integration.
- Vectors, file systems, graphs, and tables are forms; agents need an operable data interface.
- OpenViking is a context database for complex agent tasks, with memory as one built-in use case.
- Future agent capability is context capability: knowledge, memory, tools, and organization.

Roadmap:

- Build ecosystem standards and reusable protocols.
- Strengthen single-machine operations, stable releases, and smooth upgrades.
- Improve multimodal context, memory retrieval, skill retrieval, and content understanding.
- Build distributed capabilities and public-cloud integrations for reliable consistency.

## Agent-Readable Contract

HTML serves human reading. `llm.txt` serves agent citation. Both should expose the same public content. Agents that receive the human URL should check `/post/openviking-context-database/llm.txt`.
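
The URL mapping described above can be sketched in shell. This is a minimal sketch for illustration: `llm_url_for` is a hypothetical helper name, and it assumes the agent-readable file always sits at `llm.txt` directly under the human page path, as it does for the two URLs listed under Public Resources.

```bash
# llm_url_for: map a human-readable post URL to its agent-readable llm.txt URL.
# Hypothetical helper; assumes llm.txt lives directly under the post path.
llm_url_for() {
  # Strip one trailing slash, if present, then append /llm.txt.
  printf '%s/llm.txt\n' "${1%/}"
}

llm_url_for "https://blog.openviking.ai/post/openviking-context-database/"
# prints https://blog.openviking.ai/post/openviking-context-database/llm.txt
```

An agent could then fetch the printed URL with its usual HTTP client and cite that text instead of the HTML page.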