OpenViking: Inside the Context Database Architecture

OpenViking is trying to make context feel less like prompt stuffing and more like database infrastructure: addressable, indexed, isolated, observable, and safe enough for agents to use repeatedly.

On one benchmark, text-to-SQL accuracy moved from 0%, to 10% with RAG-style tricks, to roughly 35% when the prompt directly supplied the actual tables and joins.
— Mike Stonebraker, April 2026

Agents need the right data substrate. They need to know where information lives, how far a search should expand, which memory belongs to whom, and whether a write is safe. OpenViking frames that substrate as a context database.

The Shape Of The System

The core is deliberately polyglot. Python owns the server because model calls, parsing, multimodal processing, and AI dependencies still live there. Rust owns distribution-sensitive surfaces such as the CLI and RAGFS. C++ carries the embedded vector database lineage from VikingDB.

Architecture stack: a database-shaped stack for agent context. Read top-down for request flow, bottom-up for ownership.

Layer	Components	Role	Interface
01 Agent surface	CLI, SDK, MCP, Skills, VikingBot	Navigation commands and resource URIs	`viking://...`
02 OpenViking server	Identity, jobs, parsers, metadata, telemetry	Coordinates reads, writes, retries, and isolation	API + jobs
03 Context filesystem	AGFS/RAGFS, tree operations, summaries	Turns context into paths agents can traverse	`ls`/`find`/`read`
04 Storage substrate	VikingDB, embedded vectors, object/file storage	Durability, retrieval, filters, and artifacts	index + blob

That split is not just an implementation detail. It keeps each layer honest about its contract: agents speak in commands and URIs, the server enforces identity and jobs, AGFS/RAGFS gives context a traversable shape, and VikingDB plus file storage decide what can be retrieved or persisted.

Directory Semantics Are The Addressing Layer

A large amount of useful context is already organized as a tree: code, calendars, wikis, books, shelves, service trees. VikingDB turned that observation into a path-aware vector index. OpenViking then uses `viking://` URIs to give agents a database namespace that feels familiar without pretending to be the local filesystem.

The important detail is that `path` is not stored as ordinary text. In VikingDB it is a `TYPE_PATH` index, so a query can choose a tree scope directly instead of scanning path strings as scalar metadata. Agents rarely ask for “any semantically similar thing anywhere.” They ask inside a project, a user memory space, a document subtree, or a service boundary.

Capability	Why scalar filtering is not enough
Depth-aware retrieval	A directory query must mean current node, one level, or the entire subtree without rewriting every path predicate.
Multiple roots	Context lives under users, resources, memories, and tools; each root must remain a scope boundary.
Real-time updates	New files, moves, and deletes must be visible to retrieval without rebuilding the whole tree.
Per-level caches	Agents often need overview first, detail later; cache boundaries should match the tree.

Directory depth selector: example tree

viking://resources/openviking
viking://resources/openviking/docs
viking://resources/openviking/docs/design
viking://resources/openviking/telemetry
viking://resources/openviking/telemetry/grafana
viking://resources/openviking/images/20260509/upload_png

vikingdb-path-filter.json

    {
      "op": "must",
      "field": "path",
      "conds": ["/user/shengmaojia/memories"],
      "para": "-d=1"
    }

d=-1 means global retrieval under the current directory.
d=0 matches the current node itself.
d=x searches downward by `x` levels.
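
The depth rule can be sketched in a few lines of Python. This is an illustration of the semantics only, not VikingDB's implementation: the function name `in_scope` is hypothetical, the child paths in the usage note are made up, and it assumes that `d=x` includes the current node plus everything up to `x` levels below it.

```python
def in_scope(root: str, path: str, depth: int) -> bool:
    """Return True if `path` falls inside `root` at the given depth.

    depth = -1 : the whole subtree under root
    depth =  0 : the root node only
    depth =  x : root plus descendants at most x levels down (assumption)
    """
    root_parts = [p for p in root.split("/") if p]
    path_parts = [p for p in path.split("/") if p]

    # Compare whole segments, so "/a/memories2" never matches "/a/memories".
    if path_parts[: len(root_parts)] != root_parts:
        return False

    levels_below = len(path_parts) - len(root_parts)
    if depth == -1:
        return True
    return levels_below <= depth
```

With `root = "/user/shengmaojia/memories"`, a hypothetical child `root + "/a"` is in scope for `depth=1` but not for `depth=0`, and `root + "/a/b"` only matches once the depth reaches 2 or is set to -1. Comparing segments rather than string prefixes is the point: it is what a scalar `LIKE '/path%'` filter gets wrong.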

Progressive Disclosure For Context

Level	What it stores	Why agents need it
`L0`	Short summary	Fast orientation before spending tokens.
`L1`	Structure and fields	Enough shape to plan a query or traversal.
`L2`	Detailed source content	Only loaded when precision requires it.
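
One way to read the table is as a loading policy: start with L0 and go deeper only while the token budget allows. The sketch below assumes a hypothetical `ContextNode` record and a crude word-count tokenizer; neither comes from OpenViking itself.

```python
from dataclasses import dataclass


@dataclass
class ContextNode:
    uri: str
    l0: str  # short summary
    l1: str  # structure and fields
    l2: str  # detailed source content


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def disclose(node: ContextNode, budget: int) -> list[str]:
    """Return the levels, in L0 -> L1 -> L2 order, that fit within `budget`."""
    loaded = []
    for text in (node.l0, node.l1, node.l2):
        cost = estimate_tokens(text)
        if cost > budget:
            break  # stop at the first level that does not fit
        loaded.append(text)
        budget -= cost
    return loaded
```

A real agent would more likely choose the level based on what the task needs rather than on budget alone, but the ordering constraint is the same: cheaper levels are read before expensive ones.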

From Uploaded Files To Context Objects

A single image upload shows why OpenViking is not just a filesystem facade. When `ov add-resource ./docs/images/grafana-demo-dashboard.png` runs, OpenViking creates a directory for that resource, stores the original image as L2, and generates L0 and L1 summaries so an agent can decide whether to inspect the full object. All three levels are embedded with multimodal models and written to the vector database.

add-image-resource.sh

    ov add-resource ./docs/images/grafana-demo-dashboard.png

    # creates a resource URI similar to:
    # viking://resources/images/20260509/upload_321e98a827a0461f8721c683d726cbec_png

Input	Stored shape	Agent value
`grafana-demo-dashboard.png`	L0 summary, L1 structure, L2 image	Searchable before the full image is loaded.
Code repository file	Original relative path preserved	Agents can navigate like code while retrieval stays semantic.
Wiki or document subtree	`viking://` URI hierarchy	Search can stay inside the intended knowledge scope.

Distributed By Decoupling Storage

The open-source distribution starts as a single-machine service, but the architecture is pointed at managed deployment. The important move is to run OpenViking instances without data disks: vector storage, filesystem storage, logs, and telemetry are abstracted behind middleware interfaces.

The open-source build also avoids nonessential dependencies such as Redis and Kafka. Account information, temporary working directories, transactions, task records, and work queues are kept behind the same filesystem abstraction. That makes the local path easy to operate, while leaving a clear place to swap in managed storage later.

Mode	What happens	Why it matters	Current caveat
Full read-write	Every instance accepts reads and writes.	Simpler scaling model and likely default direction.	Heavy writes can occupy CPU in a Python single-process server.
Read-write separation	Write and read clusters are separated.	Better isolation and availability boundaries.	Currently manual and not the recommended default.

Layer	Consistency expectation	OpenViking responsibility
VikingDB	Eventual consistency in managed vector storage.	Design retrieval and retries around visibility delay.
Embedded vector database	Strong consistency on a single machine.	Keep the local mode simple and predictable.
Distributed filesystem	Usually strong, still with ordering edge cases.	Protect writes with file and directory locks.
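
Designing around visibility delay usually comes down to bounded retries. The following is a minimal sketch, assuming a caller-supplied `search` callable that returns URIs; the function name, parameters, and default delays are illustrative and not part of any OpenViking API.

```python
import time
from typing import Callable, Sequence


def search_until_visible(
    search: Callable[[], Sequence[str]],
    expected_uri: str,
    attempts: int = 5,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `search` until `expected_uri` shows up or attempts run out."""
    for attempt in range(attempts):
        if expected_uri in search():
            return True
        if attempt < attempts - 1:
            # Exponential backoff between attempts: 0.05s, 0.1s, 0.2s, ...
            sleep(base_delay * (2 ** attempt))
    return False
```

The `sleep` parameter is injectable so the retry schedule can be tested without waiting. The bound on `attempts` matters as much as the backoff: a fresh resource that never becomes visible should surface as a failure, not as an agent loop that hangs.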

Consistency and locks: where correctness has to be explicit

Layer	Consistency	Protection	Primary risk
Managed vector store	Eventually visible after write	Retry and visibility windows	Fresh resources may miss first retrieval.
File artifact store	Strong when local, provider-defined when remote	File lock	Concurrent overwrite or partial artifact exposure.
Directory namespace	Must preserve tree invariants	Directory lock	Move/delete can race with indexing or traversal.
Metadata and permissions	Read-your-policy is the target	Transaction boundary	Policy drift leaks or hides context.
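
The directory-lock row can be illustrated with an in-process sketch. `DirectoryLocks` and `move` are hypothetical helpers, the tree is a plain dictionary, and the locks only work inside one Python process; a distributed deployment would need a shared lock service, and a fuller version would also consider ancestor and descendant directories.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class DirectoryLocks:
    """One in-process lock per directory path (hypothetical helper)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: dict[str, threading.Lock] = defaultdict(threading.Lock)

    @contextmanager
    def hold(self, *paths: str):
        # Acquire in sorted order so two operations that touch the same
        # pair of directories cannot deadlock each other.
        ordered = sorted(set(paths))
        with self._guard:
            locks = [self._locks[p] for p in ordered]
        for lock in locks:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(locks):
                lock.release()


def move(tree: dict[str, set[str]], locks: DirectoryLocks,
         name: str, src: str, dst: str) -> None:
    """Move `name` from directory `src` to `dst` while holding both locks."""
    with locks.hold(src, dst):
        tree[src].remove(name)
        tree[dst].add(name)
```

An indexer or traversal that takes the same lock before reading a directory never observes the intermediate state in which the entry has left `src` but not yet arrived in `dst`.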

Identity: Treat Agents As Database Users

The hardest multi-tenant question is not accounts. It is whether an agent is subordinate to a human user, owns data by itself, or should be treated as a peer. OpenViking went through all three designs and is converging on the peer model.

Local multi-tenancy starts with a root API key and explicit user registration. Hosted OpenViking hides the root key and exposes user capacity through service tiers instead. The product surface changes, but the invariant stays the same: every read and write must carry a real identity before it touches private context.

Model	Assessment
Agent belongs to User	Simple RBAC, but one service agent cannot naturally serve many visitors with separate memory.
Agent can own data	More flexible, but the authorization graph becomes hard to explain and harder to secure.
Human and agent are peers	The target model: `user` is the only authenticated object besides root, and it may represent a human or an agent.

This is a privacy decision as much as a modeling decision. A customer-service agent may manage memories for visitors who are not registered OpenViking users. Forcing those visitors into the same `User` abstraction makes the authorization graph less true and less safe.
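
The peer model is small enough to sketch. Everything below is illustrative: the `Registry` class, the `viking://user/<name>` layout, and the rule that root cannot read private context are assumptions made for the example, not OpenViking's actual schema.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class User:
    name: str
    kind: Literal["human", "agent"]  # informational; authorization ignores it


class Registry:
    """Maps API keys to users; root is kept separate and admin-only."""

    def __init__(self, root_api_key: str) -> None:
        self._root_api_key = root_api_key
        self._users: dict[str, User] = {}

    def register(self, root_api_key: str, user: User, api_key: str) -> None:
        if root_api_key != self._root_api_key:
            raise PermissionError("only root may register users")
        self._users[api_key] = user

    def can_read(self, api_key: str, uri: str) -> bool:
        user = self._users.get(api_key)
        if user is None:
            return False  # root and unknown keys get no private-context access
        scope = f"viking://user/{user.name}"  # assumed namespace layout
        return uri == scope or uri.startswith(scope + "/")
```

The check never branches on `kind`: a human and an agent are authorized by the same rule, which is what keeps the authorization graph easy to explain. A production version would compare keys in constant time and store them hashed.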

Privacy boundary: identity flow decides what context can cross

Stage	Role	Rule
root	Admin authority	Root is admin-only.
user:agent	Human and agent as peer users	Every actor gets a scoped API key and a visible namespace (API key + namespace boundary).
viking://user	Private scope	Index filters and read APIs enforce visible context.
privacy config	Secrets stay protected	Skills receive restored placeholders only when policy allows.

local-multitenant.sh

    # server ov.conf: configure root_api_key before startup
    # client ovcli.conf: configure the same root_api_key
    ov admin register-user default <your_name>
    # client ovcli.conf: use the returned api_key for normal access

Performance Is A Pipeline Problem

Once the storage model is distributed, capacity is mostly a deployment choice. Performance is harder because write requests touch parsing, splitting, VLM calls, embedding, summarization, memory extraction, IO movement, and locks.

Area	Note
Vector database	Use VikingDB DSL filters for shared pools; dedicate a vector database for large tenants.
Filesystem	Local FS is fast but fragile; S3/TOS scales but can slow the agent loop.
Write pipeline	Parsing, splitting, VLM calls, embeddings, summaries, and memory extraction dominate latency.
Locks	Directory/file locks protect conflicting writes; transaction semantics are still evolving.

Layer	Lightweight mode	Heavy mode	Tradeoff
Vector database	Shared VikingDB pool with Account/User scalar filters.	Dedicated vector database per OpenViking instance.	Shared mode saves resources; dedicated mode removes the practical index ceiling.
Filesystem	Local FS, ByteNAS, or managed shared FS.	TOS/S3 or EFS-like remote storage.	Local is fast; object storage scales but slows agent loops.
Write pipeline	Queue model calls and embedding work.	Globally controlled parallel ingestion.	More throughput, but lock and ordering costs become visible.

Figure: write pipeline. The bottleneck is a chain, not one database call; bar length approximates relative latency pressure. Stage shown: model calls (VLM, embedding, summary, memory extraction).

Current optimization directions

Queue and parallelize model calls with global concurrency control.
Replace the Go AGFS server path with embedded calls and Rust where transfer cost matters.
Parallelize tree operations such as `find` and `tree`.
Reduce copies across receive, work, and visible directories during upload.
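
The first item, queued model calls under a concurrency cap, maps naturally onto a semaphore. This is a minimal sketch using `asyncio`; `run_limited` is a hypothetical name, and the jobs stand in for whatever VLM, embedding, or summary calls the pipeline makes.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def run_limited(
    jobs: Iterable[Callable[[], Awaitable[str]]],
    max_in_flight: int,
) -> list[str]:
    """Run all jobs concurrently, never more than `max_in_flight` at once."""
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        async with gate:
            return await job()

    # gather returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Here the cap applies to one batch. For a limit that is global to the server, the same semaphore would be created once and shared by every ingestion request, so that a single large upload cannot exhaust the model-call quota for everyone else.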

Privacy: Context Is Plaintext

A context database stores the material an agent uses to reason. That material is often sensitive by definition. OpenViking handles this with API-key identity, root isolation, user-scoped `viking://user` visibility, optional file encryption, and experimental Skill privacy configs.

Control	Purpose
`dev`	Local development mode without authentication.
`api_key`	Required when the service listens beyond localhost.
`ov --sudo`	Root identity is explicit and limited to admin actions.
`viking://user`	Private data scope filtered at the index layer.
Privacy configs	Store Skill secrets in protected storage and restore placeholders at read time.
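
The last row, restoring placeholders at read time, can be pictured as a policy-gated substitution. The `{{name}}` placeholder syntax, the `restore` function, and the boolean `allowed` flag are all assumptions made for illustration; the source does not describe the actual format.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")  # hypothetical {{name}} syntax


def restore(text: str, secrets: dict[str, str], allowed: bool) -> str:
    """Fill {{name}} placeholders from `secrets` only when `allowed` is True."""
    if not allowed:
        return text  # caller sees placeholders, never the secret values

    def fill(match: re.Match) -> str:
        # Unknown names are left as placeholders rather than guessed.
        return secrets.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(fill, text)
```

Because the stored Skill text only ever contains placeholders, an index or summary built from it cannot leak the secret; the real value exists only in the response to a read that policy has approved.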

Encryption is implemented, but it is not free. Different tenants or accounts can use different keys, which improves blast-radius control, but remote storage has to be decrypted before operations such as `grep`. For a context database, privacy controls affect latency and operator ergonomics, not only compliance posture.

privacy-config.sh

    openviking privacy categories
    openviking privacy list skill
    openviking privacy upsert skill byted-viking-search-knowledgebase \
      --values-json '{"api_key":"secret-2","base_url":"https://example.com"}'
    openviking privacy activate skill byted-viking-search-knowledgebase 2

Evaluation Still Drives The Product

The article only sketches evaluation because it deserves its own write-up. The important point is that OpenViking measures across RAG, memory, long tasks, SWE tasks, and multi-agent and human-agent collaboration.

Benchmark area	Architecture question it answers
RAG	Does directory-scoped retrieval improve answer accuracy and reduce irrelevant context?
Memory	Can user and agent isolation preserve useful memory without leaking private state?
Long tasks	Do L0/L1/L2 summaries help agents plan across large context without exhausting tokens?
SWE	Do preserved repository paths and scoped retrieval help agents navigate codebases?
Multi-agent collaboration	Can peer identity and permission boundaries support shared work without collapsing ownership?

Scenarios

RAG benchmarks
Memory benchmarks
Long-task benchmarks
SWE benchmarks

Dimensions

Accuracy
Task latency
Token consumption
Multimodal support

Evaluation: a product dashboard for context quality

The numbers are illustrative scoring dimensions.

RAG: Can the system retrieve the right evidence before the model answers?

Dimension	Score
Accuracy	82
Task latency	68
Token economy	74
Multimodal support	71

What To Remember

The critical architectural insight is that context is not a blob. It has paths, scopes, identities, consistency constraints, performance budgets, and privacy boundaries. OpenViking is useful because it lets agents consume those properties through an interface they can already navigate.

The architecture is still moving from concept to product construction. The open-source release has already produced enough usage, issues, and feedback to make capacity and performance the next hard priorities. The useful thing about the design is not that every consistency or latency question is closed; it is that OpenViking names the database properties context systems need to expose before agents can depend on them.

The source note closes by thanking more than 150 contributors and participants, and by pointing to over 1000 merged changes and a community that has pushed the project past 23k stars. That matters because the remaining questions are not slideware questions; they are the questions that show up when real agents, data, and users start sharing the same context substrate.