OpenViking is trying to make context feel less like prompt stuffing and more like database infrastructure: addressable, indexed, isolated, observable, and safe enough for agents to use repeatedly.
On one benchmark, text-to-SQL accuracy moved from 0% with no schema context, to 10% with RAG-style tricks, to roughly 35% when the prompt directly supplied the actual tables and joins.
Agents need the right data substrate. They need to know where information lives, how far a search should expand, which memory belongs to whom, and whether a write is safe. OpenViking frames that substrate as a context database.
The Shape Of The System
The core is deliberately polyglot. Python owns the server because model calls, parsing, multimodal processing, and AI dependencies still live there. Rust owns distribution-sensitive surfaces such as the CLI and RAGFS. C++ carries the embedded vector database lineage from VikingDB.
A database-shaped stack for agent context
That split is not just an implementation detail. It keeps each layer honest about its contract: agents speak in commands and URIs, the server enforces identity and jobs, AGFS/RAGFS gives context a traversable shape, and VikingDB plus file storage decide what can be retrieved or persisted.
Directory Semantics Are The Addressing Layer
A large amount of useful context is already organized as a tree: code, calendars, wikis, books, shelves, service trees. VikingDB turned that observation into a path-aware vector index. OpenViking then uses `viking://` URIs to give agents a database namespace that feels familiar without pretending to be the local filesystem.
The important detail is that `path` is not stored as ordinary text. In VikingDB it is a `TYPE_PATH` index, so a query can choose a tree scope directly instead of scanning path strings as scalar metadata. Agents rarely ask for “any semantically similar thing anywhere.” They ask inside a project, a user memory space, a document subtree, or a service boundary.
| Capability | Why scalar filtering is not enough |
|---|---|
| Depth-aware retrieval | A directory query must mean current node, one level, or the entire subtree without rewriting every path predicate. |
| Multiple roots | Context lives under users, resources, memories, and tools; each root must remain a scope boundary. |
| Real-time updates | New files, moves, and deletes must be visible to retrieval without rebuilding the whole tree. |
| Per-level caches | Agents often need overview first, detail later; cache boundaries should match the tree. |
Directory depth selector
The depth rule in the filter DSL:

```json
{ "op": "must", "field": "path", "conds": ["/user/shengmaojia/memories"], "para": "-d=1" }
```

- `d=-1` means global retrieval under the current directory.
- `d=0` matches the current node itself.
- `d=x` searches downward by `x` levels.
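As a sketch of those depth semantics (the function name and plain-string matching are illustrative, not OpenViking's `TYPE_PATH` index), the rule can be expressed as:

```python
def matches_depth(root: str, candidate: str, d: int) -> bool:
    """Illustrative depth-scoped path match: d=-1 takes the whole subtree,
    d=0 the node itself, d=x at most x levels below the scope root."""
    root_parts = root.strip("/").split("/")
    cand_parts = candidate.strip("/").split("/")
    if cand_parts[: len(root_parts)] != root_parts:
        return False  # candidate lies outside the scope root entirely
    levels_below = len(cand_parts) - len(root_parts)
    if d == -1:
        return True          # global retrieval under the current directory
    return levels_below <= d  # d=0 matches the node itself, d=x goes x levels down

# One level below the memories directory
matches_depth("/user/shengmaojia/memories", "/user/shengmaojia/memories/today.md", 1)
```

A real path index evaluates this scope selection inside the vector query instead of post-filtering strings, which is the point of `TYPE_PATH` over scalar metadata.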
Progressive Disclosure For Context
| Level | What it stores | Why agents need it |
|---|---|---|
| L0 | Short summary | Fast orientation before spending tokens. |
| L1 | Structure and fields | Enough shape to plan a query or traversal. |
| L2 | Detailed source content | Only loaded when precision requires it. |
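A minimal sketch of that escalation policy (the field names and the `need` intents are hypothetical stand-ins for the real read APIs):

```python
def load_context(resource: dict, need: str) -> str:
    """Return the cheapest disclosure level that satisfies the caller's need."""
    if need == "orient":   # fast orientation before spending tokens
        return resource["l0"]
    if need == "plan":     # enough shape to plan a query or traversal
        return resource["l1"]
    return resource["l2"]  # full detail, only when precision requires it

doc = {
    "l0": "Grafana dashboard screenshot",
    "l1": "panels: cpu, memory, qps",
    "l2": "<full image bytes>",
}
load_context(doc, "plan")  # returns the L1 structure, not the full image
```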
From Uploaded Files To Context Objects
A single image upload shows why OpenViking is not just a filesystem facade. When `ov add-resource ./docs/images/grafana-demo-dashboard.png` runs, OpenViking creates a directory for that resource, stores the original image as L2, and generates L0 and L1 summaries so an agent can decide whether to inspect the full object. All three levels are embedded with multimodal models and written to the vector database.
```shell
ov add-resource ./docs/images/grafana-demo-dashboard.png
# creates a resource URI similar to:
# viking://resources/images/20260509/upload_321e98a827a0461f8721c683d726cbec_png
```

| Input | Stored shape | Agent value |
|---|---|---|
| grafana-demo-dashboard.png | L0 summary, L1 structure, L2 image | Searchable before the full image is loaded. |
| Code repository file | Original relative path preserved | Agents can navigate like code while retrieval stays semantic. |
| Wiki or document subtree | `viking://` URI hierarchy | Search can stay inside the intended knowledge scope. |
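Conceptually, each upload becomes one context object with three embedded views. The sketch below fakes the summarizer and embedder with a hash just to show the shape of the object, not the real multimodal pipeline:

```python
import hashlib

def embed(text: str) -> list[float]:
    # stand-in embedder: the real system calls a multimodal embedding model
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

def ingest(filename: str, content: str) -> dict:
    """Turn one uploaded file into an L0/L1/L2 context object, one vector per level."""
    levels = {
        "l0": f"summary of {filename}",    # short summary for orientation
        "l1": f"structure of {filename}",  # fields and shape for planning
        "l2": content,                     # original source content
    }
    return {
        "uri": f"viking://resources/{filename}",
        "levels": levels,
        "vectors": {k: embed(v) for k, v in levels.items()},
    }

obj = ingest("grafana-demo-dashboard.png", "<image bytes>")
```

Because all three levels get their own vectors, retrieval can hit the L0 summary long before any agent pays to load the L2 original.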
Distributed By Decoupling Storage
The open-source distribution starts as a single-machine service, but the architecture is pointed at managed deployment. The important move is to run OpenViking instances without data disks: vector storage, filesystem storage, logs, and telemetry are abstracted behind middleware interfaces.
The open-source build also avoids nonessential dependencies such as Redis and Kafka. Account information, temporary working directories, transactions, task records, and work queues are kept behind the same filesystem abstraction. That makes the local path easy to operate, while leaving a clear place to swap in managed storage later.
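One way to read "abstracted behind middleware interfaces" is a narrow storage protocol that a local backend and a managed backend both implement. The names below are illustrative, not OpenViking's actual interfaces:

```python
from typing import Protocol

class BlobStore(Protocol):
    """Minimal storage middleware: the instance itself holds no data disks."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...

class LocalStore:
    """Local-path backend: the easy-to-operate choice for the open-source build."""
    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}  # stands in for files on local disk
    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data
    def get(self, key: str) -> bytes:
        return self._blobs[key]

def save_task_record(store: BlobStore, task_id: str, payload: bytes) -> None:
    # task records, work queues, and temp dirs all route through the same
    # interface, so a managed object store can be swapped in later
    store.put(f"tasks/{task_id}", payload)

store = LocalStore()
save_task_record(store, "t1", b"ok")
```

The design choice is that nothing above this interface knows whether the backend is a local directory or remote object storage, which is what makes diskless instances possible.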
| Mode | What happens | Why it matters | Current caveat |
|---|---|---|---|
| Full read-write | Every instance accepts reads and writes. | Simpler scaling model and likely default direction. | Heavy writes can occupy CPU in a Python single-process server. |
| Read-write separation | Write and read clusters are separated. | Better isolation and availability boundaries. | Currently manual and not the recommended default. |
| Layer | Consistency expectation | OpenViking responsibility |
|---|---|---|
| VikingDB | Eventual consistency in managed vector storage. | Design retrieval and retries around visibility delay. |
| Embedded vector database | Strong consistency on a single machine. | Keep the local mode simple and predictable. |
| Distributed filesystem | Usually strong, still with ordering edge cases. | Protect writes with file and directory locks. |
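"Design retrieval and retries around visibility delay" can be sketched as a bounded retry that tolerates a fresh write not yet being visible in an eventually consistent vector store (the search backend here is a hypothetical stand-in):

```python
import time

def search_with_visibility_retry(search, query: str, attempts: int = 3, delay: float = 0.01):
    """Retry a retrieval that may miss a just-written resource; give up
    after `attempts` tries rather than blocking the agent loop forever."""
    for _ in range(attempts):
        hits = search(query)
        if hits:
            return hits
        time.sleep(delay)  # wait out the visibility window before retrying
    return []

# Toy backend that only becomes consistent on the third call
calls = {"n": 0}
def flaky_search(query: str):
    calls["n"] += 1
    return ["doc-1"] if calls["n"] >= 3 else []

search_with_visibility_retry(flaky_search, "grafana")  # returns ["doc-1"]
```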
Where correctness has to be explicit
| Layer | Consistency | Protection | Primary risk |
|---|---|---|---|
| Managed vector store | Eventually visible after write | Retry and visibility windows | Fresh resources may miss first retrieval. |
| File artifact store | Strong when local, provider-defined when remote | File lock | Concurrent overwrite or partial artifact exposure. |
| Directory namespace | Must preserve tree invariants | Directory lock | Move/delete can race with indexing or traversal. |
| Metadata and permissions | Read-your-policy is the target | Transaction boundary | Policy drift leaks or hides context. |
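The directory-lock row can be sketched as a namespace lock that a move and a traversal both acquire, so neither sees a half-moved subtree. This is a deliberately coarse single-lock toy; a real system also needs per-directory granularity, lock ordering, and crash recovery:

```python
import threading

NAMESPACE_LOCK = threading.Lock()  # coarse stand-in for per-directory locks

def move(tree: dict, src: str, dst: str) -> None:
    """Move an entry while holding the directory lock."""
    with NAMESPACE_LOCK:
        tree[dst] = tree.pop(src)  # atomic with respect to traversals

def traverse(tree: dict) -> list[str]:
    with NAMESPACE_LOCK:           # traversal sees a consistent snapshot
        return sorted(tree)

tree = {"/a/doc": "x"}
move(tree, "/a/doc", "/b/doc")
traverse(tree)  # → ["/b/doc"]
```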
Identity: Treat Agents As Database Users
The hardest multi-tenant question is not accounts. It is whether an agent is subordinate to a human user, owns data by itself, or should be treated as a peer. OpenViking went through all three designs and is converging on the peer model.
Local multi-tenancy starts with a root API key and explicit user registration. Hosted OpenViking hides the root key and exposes user capacity through service tiers instead. The product surface changes, but the invariant stays the same: every read and write must carry a real identity before it touches private context.
- Agent belongs to User: simple RBAC, but one service agent cannot naturally serve many visitors with separate memory.
- Agent can own data: more flexible, but the authorization graph becomes hard to explain and harder to secure.
- Human and agent are peers: the target model, where `user` is the only authenticated object besides root and may represent a human or an agent.
This is a privacy decision as much as a modeling decision. A customer-service agent may manage memories for visitors who are not registered OpenViking users. Forcing those visitors into the same `User` abstraction makes the authorization graph less true and less safe.
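The peer model can be sketched as a single `user` object whose kind is either human or agent, with private visibility enforced by namespace prefix. This is a toy check, not the real index-layer filter:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    """Peer model: the only authenticated object besides root.
    `kind` may be 'human' or 'agent' without changing authorization."""
    name: str
    kind: str

def can_read(user: User, uri: str) -> bool:
    # private scope: a user sees only its own viking://user namespace
    return uri.startswith(f"viking://user/{user.name}/")

bot = User("support-agent", kind="agent")
can_read(bot, "viking://user/support-agent/memories/visitor-42")  # True
can_read(bot, "viking://user/alice/memories/today")               # False
```

Because the agent is itself a `user`, it can hold memories for unregistered visitors inside its own namespace without forcing those visitors into the `User` abstraction.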
Identity flow decides what context can cross
- Admin authority: register users, rotate API keys, configure global policy.
- Human and agent as peer users: root is admin-only; every actor gets a scoped API key and a visible namespace.
- Private scope: index filters and read APIs enforce visible context.
- Secrets stay protected: Skills receive restored placeholders only when policy allows.
```shell
# server ov.conf: configure root_api_key before startup
# client ovcli.conf: configure the same root_api_key
ov admin register-user default <your_name>
# client ovcli.conf: use the returned api_key for normal access
```

Performance Is A Pipeline Problem
Once the storage model is distributed, capacity is mostly a deployment choice. Performance is harder because write requests touch parsing, splitting, VLM calls, embedding, summarization, memory extraction, IO movement, and locks.
- Vector database: use VikingDB DSL filters for shared pools; dedicate a vector database for large tenants.
- Filesystem: local FS is fast but fragile; S3/TOS scales but can slow the agent loop.
- Write pipeline: parsing, splitting, VLM calls, embeddings, summaries, and memory extraction dominate latency.
- Locks: directory/file locks protect conflicting writes; transaction semantics are still evolving.
| Layer | Lightweight mode | Heavy mode | Tradeoff |
|---|---|---|---|
| Vector database | Shared VikingDB pool with Account/User scalar filters. | Dedicated vector database per OpenViking instance. | Shared mode saves resources; dedicated mode removes the practical index ceiling. |
| Filesystem | Local FS, ByteNAS, or managed shared FS. | TOS/S3 or EFS-like remote storage. | Local is fast; object storage scales but slows agent loops. |
| Write pipeline | Queue model calls and embedding work. | Globally controlled parallel ingestion. | More throughput, but lock and ordering costs become visible. |
The bottleneck is a chain, not one database call
Model calls: VLM, embedding, summary, memory extraction.
Current optimization directions
- Queue and parallelize model calls with global concurrency control.
- Replace the Go AGFS server path with embedded calls and Rust where transfer cost matters.
- Parallelize tree operations such as `find` and `tree`.
- Reduce copies across receive, work, and visible directories during upload.
Privacy: Context Is Plaintext
A context database stores the material an agent uses to reason. That material is often sensitive by definition. OpenViking handles this with API-key identity, root isolation, user-scoped `viking://user` visibility, optional file encryption, and experimental Skill privacy configs.
| Control | Purpose |
|---|---|
| `dev` | Local development mode without authentication. |
| `api_key` | Required when the service listens beyond localhost. |
| `ov --sudo` | Root identity is explicit and limited to admin actions. |
| `viking://user` | Private data scope filtered at the index layer. |
| Privacy configs | Store Skill secrets in protected storage and restore placeholders at read time. |
Encryption is implemented, but it is not free. Different tenants or accounts can use different keys, which improves blast-radius control, but remote storage has to be decrypted before operations such as `grep`. For a context database, privacy controls affect latency and operator ergonomics, not only compliance posture.
```shell
openviking privacy categories
openviking privacy list skill
openviking privacy upsert skill byted-viking-search-knowledgebase \
  --values-json '{"api_key":"secret-2","base_url":"https://example.com"}'
openviking privacy activate skill byted-viking-search-knowledgebase 2
```

Evaluation Still Drives The Product
This article only sketches evaluation because the topic deserves its own write-up. The important point is that OpenViking measures across RAG, memory, long tasks, SWE tasks, and multi-agent human-agent collaboration.
| Benchmark area | Architecture question it answers |
|---|---|
| RAG | Does directory-scoped retrieval improve answer accuracy and reduce irrelevant context? |
| Memory | Can user and agent isolation preserve useful memory without leaking private state? |
| Long tasks | Do L0/L1/L2 summaries help agents plan across large context without exhausting tokens? |
| SWE | Do preserved repository paths and scoped retrieval help agents navigate codebases? |
| Multi-agent collaboration | Can peer identity and permission boundaries support shared work without collapsing ownership? |
Scenarios
- RAG benchmarks
- Memory benchmarks
- Long-task benchmarks
- SWE benchmarks
Dimensions
- Accuracy
- Task latency
- Token consumption
- Multimodal support
A product dashboard for context quality
RAG: can the system retrieve the right evidence before the model answers?
What To Remember
The critical architectural insight is that context is not a blob. It has paths, scopes, identities, consistency constraints, performance budgets, and privacy boundaries. OpenViking is useful because it lets agents consume those properties through an interface they can already navigate.
The architecture is still moving from concept to product construction. The open-source release has already produced enough usage, issues, and feedback to make capacity and performance the next hard priorities. The useful thing about the design is not that every consistency or latency question is closed; it is that OpenViking names the database properties context systems need to expose before agents can depend on them.
The source note closes by thanking more than 150 contributors and participants, over 1000 merged changes, and a community that has pushed the project past 23k stars. That matters because the remaining questions are not slideware questions; they are the questions that show up when real agents, data, and users start sharing the same context substrate.

