
OpenViking: Inside the Context Database Architecture

How OpenViking turns directory semantics, distributed storage, identity, performance, and privacy into a context database layer for AI agents.

OpenViking is trying to make context feel less like prompt stuffing and more like database infrastructure: addressable, indexed, isolated, observable, and safe enough for agents to use repeatedly.

On one benchmark, text-to-SQL accuracy moved from 0%, to 10% with RAG-style tricks, to roughly 35% when the prompt directly supplied the actual tables and joins.
— Mike Stonebraker, April 2026

Agents need the right data substrate. They need to know where information lives, how far a search should expand, which memory belongs to whom, and whether a write is safe. OpenViking frames that substrate as a context database.

The Shape Of The System

The core is deliberately polyglot. Python owns the server because model calls, parsing, multimodal processing, and AI dependencies still live there. Rust owns distribution-sensitive surfaces such as the CLI and RAGFS. C++ carries the embedded vector database lineage from VikingDB.

Arch stack

A database-shaped stack for agent context. Read top-down for request flow, bottom-up for ownership.

  1. Agent surface: CLI, SDK, MCP, Skills, VikingBot. Navigation commands and resource URIs (`viking://...`).
  2. OpenViking server: identity, jobs, parsers, metadata, telemetry. Coordinates reads, writes, retries, and isolation (API + jobs).
  3. Context filesystem: AGFS/RAGFS, tree operations, summaries. Turns context into paths agents can traverse (`ls`/`find`/`read`).
  4. Storage substrate: VikingDB, embedded vectors, object/file storage. Durability, retrieval, filters, and artifacts (index + blob).

That split is not just an implementation detail. It keeps each layer honest about its contract: agents speak in commands and URIs, the server enforces identity and jobs, AGFS/RAGFS gives context a traversable shape, and VikingDB plus file storage decide what can be retrieved or persisted.


Directory Semantics Are The Addressing Layer

A large amount of useful context is already organized as a tree: code, calendars, wikis, books, shelves, service trees. VikingDB turned that observation into a path-aware vector index. OpenViking then uses `viking://` URIs to give agents a database namespace that feels familiar without pretending to be the local filesystem.
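Because `viking://` URIs follow standard URL syntax, a client can split one into its root scope and tree path with the standard library alone. A minimal sketch; the helper name and the returned fields are illustrative, not an OpenViking API:

```python
from urllib.parse import urlsplit

def parse_viking_uri(uri: str) -> dict:
    """Split a viking:// URI into its root scope and tree path.

    The root names (users, resources, memories, tools) follow the scopes
    mentioned in this article; the helper itself is hypothetical.
    """
    parts = urlsplit(uri)
    if parts.scheme != "viking":
        raise ValueError(f"not a viking URI: {uri}")
    segments = [s for s in parts.path.split("/") if s]
    return {
        "root": parts.netloc,           # e.g. "resources"
        "path": "/" + "/".join([parts.netloc, *segments]),
        "depth": len(segments) + 1,     # node depth, counting the root as 1
    }

print(parse_viking_uri("viking://resources/openviking/docs/design"))
```

The point is that the namespace is cheap to manipulate on the client side; only the scoped retrieval underneath it needs the path-aware index.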

The important detail is that `path` is not stored as ordinary text. In VikingDB it is a `TYPE_PATH` index, so a query can choose a tree scope directly instead of scanning path strings as scalar metadata. Agents rarely ask for “any semantically similar thing anywhere.” They ask inside a project, a user memory space, a document subtree, or a service boundary.

| Capability | Why scalar filtering is not enough |
| --- | --- |
| Depth-aware retrieval | A directory query must mean current node, one level, or the entire subtree without rewriting every path predicate. |
| Multiple roots | Context lives under users, resources, memories, and tools; each root must remain a scope boundary. |
| Real-time updates | New files, moves, and deletes must be visible to retrieval without rebuilding the whole tree. |
| Per-level caches | Agents often need overview first, detail later; cache boundaries should match the tree. |

Directory depth selector

Example scopes in a single resource tree:

  • viking://resources/openviking
  • viking://resources/openviking/docs
  • viking://resources/openviking/docs/design
  • viking://resources/openviking/telemetry
  • viking://resources/openviking/telemetry/grafana
  • viking://resources/openviking/images/20260509/upload_png
vikingdb-path-filter.json

```json
{
  "op": "must",
  "field": "path",
  "conds": ["/user/shengmaojia/memories"],
  "para": "-d=1"
}
```
  • d=-1 means global retrieval under the current directory.
  • d=0 matches the current node itself.
  • d=x searches downward by `x` levels.
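Under these depth rules, a small client-side helper could assemble the same filter dict shown in `vikingdb-path-filter.json`. The function name is ours; the dict shape mirrors the JSON example above:

```python
def path_filter(scope: str, depth: int) -> dict:
    """Build a VikingDB-style path filter.

    depth semantics follow the rules above: -1 = entire subtree,
    0 = the node itself, x > 0 = x levels downward.
    """
    if depth < -1:
        raise ValueError("depth must be -1, 0, or a positive level count")
    return {
        "op": "must",
        "field": "path",
        "conds": [scope],
        "para": f"-d={depth}",
    }

print(path_filter("/user/shengmaojia/memories", 1))
```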

Progressive Disclosure For Context

| Level | What it stores | Why agents need it |
| --- | --- | --- |
| L0 | Short summary | Fast orientation before spending tokens. |
| L1 | Structure and fields | Enough shape to plan a query or traversal. |
| L2 | Detailed source content | Only loaded when precision requires it. |
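The three levels suggest a simple read strategy: start at L0 and escalate only when the cheaper level is insufficient. A minimal sketch, with `read_level` and `needs_detail` standing in for a real OpenViking read API and the agent's own judgment:

```python
from typing import Callable

def read_with_budget(uri: str,
                     read_level: Callable[[str, int], str],
                     needs_detail: Callable[[str], bool]) -> str:
    l0 = read_level(uri, 0)      # short summary: fast orientation
    if not needs_detail(l0):
        return l0
    l1 = read_level(uri, 1)      # structure and fields: plan a traversal
    if not needs_detail(l1):
        return l1
    return read_level(uri, 2)    # full source content, only when required

# Toy backend: each level just returns a longer string.
store = {0: "summary", 1: "summary+fields", 2: "full document body"}
result = read_with_budget("viking://resources/doc",
                          lambda uri, lvl: store[lvl],
                          lambda text: len(text) < 10)
print(result)  # escalates past L0 because the toy summary is too short
```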

From Uploaded Files To Context Objects

A single image upload shows why OpenViking is not just a filesystem facade. When `ov add-resource ./docs/images/grafana-demo-dashboard.png` runs, OpenViking creates a directory for that resource, stores the original image as L2, and generates L0 and L1 summaries so an agent can decide whether to inspect the full object. All three levels are embedded with multimodal models and written to the vector database.

add-image-resource.sh

```sh
ov add-resource ./docs/images/grafana-demo-dashboard.png
# creates a resource URI similar to:
# viking://resources/images/20260509/upload_321e98a827a0461f8721c683d726cbec_png
```
| Input | Stored shape | Agent value |
| --- | --- | --- |
| grafana-demo-dashboard.png | L0 summary, L1 structure, L2 image | Searchable before the full image is loaded. |
| Code repository file | Original relative path preserved | Agents can navigate like code while retrieval stays semantic. |
| Wiki or document subtree | `viking://` URI hierarchy | Search can stay inside the intended knowledge scope. |
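The upload flow can be sketched as one file fanning out into three embedded records under a shared URI. `summarize` and `embed` are placeholders for the multimodal models the server actually calls; the URI layout mirrors the example above:

```python
import hashlib

def ingest(data: bytes, summarize, embed) -> list[dict]:
    """Turn one uploaded image into L0/L1/L2 records for the vector DB."""
    day = "20260509"                       # date partition, as in the example URI
    stem = hashlib.md5(data).hexdigest()   # content-derived name (assumption)
    base = f"viking://resources/images/{day}/upload_{stem}_png"
    levels = {
        0: summarize(data, level=0),       # short summary
        1: summarize(data, level=1),       # structure and fields
        2: data,                           # original artifact
    }
    return [{"path": base, "level": lvl, "vector": embed(content)}
            for lvl, content in levels.items()]

records = ingest(b"...png bytes...",
                 summarize=lambda d, level: f"L{level} text",
                 embed=lambda content: [0.0, 1.0])
print([r["level"] for r in records])  # [0, 1, 2]
```

All three records share one path, so a depth-scoped query can find the resource at any disclosure level.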

Distributed By Decoupling Storage

The open-source distribution starts as a single-machine service, but the architecture is pointed at managed deployment. The important move is to run OpenViking instances without data disks: vector storage, filesystem storage, logs, and telemetry are abstracted behind middleware interfaces.

The open-source build also avoids nonessential dependencies such as Redis and Kafka. Account information, temporary working directories, transactions, task records, and work queues are kept behind the same filesystem abstraction. That makes the local path easy to operate, while leaving a clear place to swap in managed storage later.

| Mode | What happens | Why it matters | Current caveat |
| --- | --- | --- | --- |
| Full read-write | Every instance accepts reads and writes. | Simpler scaling model and likely default direction. | Heavy writes can occupy CPU in a Python single-process server. |
| Read-write separation | Write and read clusters are separated. | Better isolation and availability boundaries. | Currently manual and not the recommended default. |

| Layer | Consistency expectation | OpenViking responsibility |
| --- | --- | --- |
| VikingDB | Eventual consistency in managed vector storage. | Design retrieval and retries around visibility delay. |
| Embedded vector database | Strong consistency on a single machine. | Keep the local mode simple and predictable. |
| Distributed filesystem | Usually strong, still with ordering edge cases. | Protect writes with file and directory locks. |
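Designing retrieval around visibility delay usually means a read-after-write retry loop: a freshly written resource may not be visible to the first query. A sketch under assumed backoff parameters, not an OpenViking API:

```python
import time

def read_after_write(probe, attempts: int = 5, base_delay: float = 0.05):
    """Retry a read until an eventually consistent store makes it visible."""
    for attempt in range(attempts):
        result = probe()
        if result is not None:
            return result
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise TimeoutError("write not visible within retry budget")

# Toy store that becomes visible on the third probe.
calls = {"n": 0}
def probe():
    calls["n"] += 1
    return "doc" if calls["n"] >= 3 else None

result = read_after_write(probe, base_delay=0.001)
print(result)  # "doc"
```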
Consistency and locks

Where correctness has to be explicit

| Layer | Consistency | Protection | Primary risk |
| --- | --- | --- | --- |
| Managed vector store | Eventually visible after write | Retry and visibility windows | Fresh resources may miss first retrieval. |
| File artifact store | Strong when local, provider-defined when remote | File lock | Concurrent overwrite or partial artifact exposure. |
| Directory namespace | Must preserve tree invariants | Directory lock | Move/delete can race with indexing or traversal. |
| Metadata and permissions | Read-your-policy is the target | Transaction boundary | Policy drift leaks or hides context. |

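The directory-lock case can be illustrated with an in-process toy: a move and a traversal of the same subtree serialize on one lock, so a listing observes the tree before or after the move, never mid-move. The real system must lock across processes and instances; everything here is illustrative:

```python
import threading
from collections import defaultdict

class DirLocks:
    """One lock per directory path (toy, single-process)."""
    def __init__(self):
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()

    def lock_for(self, path: str) -> threading.Lock:
        with self._guard:              # defaultdict mutation needs its own guard
            return self._locks[path]

tree = {"/a/x.txt": "data"}
locks = DirLocks()

def move(src: str, dst: str):
    with locks.lock_for("/a"):         # writer holds the directory lock
        tree[dst] = tree.pop(src)

def listing(prefix: str) -> list[str]:
    with locks.lock_for("/a"):         # traversal never sees a half-applied move
        return sorted(p for p in tree if p.startswith(prefix))

move("/a/x.txt", "/a/y.txt")
print(listing("/a"))  # ['/a/y.txt']
```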

Identity: Treat Agents As Database Users

The hardest multi-tenant question is not accounts. It is whether an agent is subordinate to a human user, owns data by itself, or should be treated as a peer. OpenViking went through all three designs and is converging on the peer model.

Local multi-tenancy starts with a root API key and explicit user registration. Hosted OpenViking hides the root key and exposes user capacity through service tiers instead. The product surface changes, but the invariant stays the same: every read and write must carry a real identity before it touches private context.

V1: Agent belongs to User. Simple RBAC, but one service agent cannot naturally serve many visitors with separate memory.

V2: Agent can own data. More flexible, but the authorization graph becomes hard to explain and harder to secure.

V3: Human and agent are peers. The target model: `user` is the only authenticated object besides root, and it may represent a human or an agent.

This is a privacy decision as much as a modeling decision. A customer-service agent may manage memories for visitors who are not registered OpenViking users. Forcing those visitors into the same `User` abstraction makes the authorization graph less true and less safe.
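The peer model can be sketched as a single `User` type whose `kind` does not affect authorization: a human and an agent get the same scoped key and the same namespace rule. Field and function names are illustrative, not the OpenViking schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    name: str
    kind: str        # "human" or "agent" -- same authorization object either way
    api_key: str

def namespace(user: User) -> str:
    # Every peer gets its own visible scope, regardless of kind.
    return f"viking://user/{user.name}"

def can_read(actor: User, path: str) -> bool:
    # A scoped key sees only its own namespace (root/admin checks omitted).
    return path.startswith(namespace(actor))

alice = User("alice", "human", "key-a")
bot = User("support-bot", "agent", "key-b")
print(can_read(bot, "viking://user/support-bot/memories/visitor-42"))  # True
print(can_read(bot, namespace(alice)))                                 # False
```

Note that the customer-service case falls out naturally: visitor memories live under the agent's namespace without forcing each visitor to become a registered `User`.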

Privacy boundary

Identity flow decides what context can cross:

  • root: admin authority. Register users, rotate API keys, configure global policy.
  • user:agent: human and agent as peer users. Root is admin-only; every actor gets a scoped API key and a visible namespace (the API key plus the namespace boundary).
  • viking://user: private scope. Index filters and read APIs enforce visible context.
  • privacy config: secrets stay protected. Skills receive restored placeholders only when policy allows.

local-multitenant.sh

```sh
# server ov.conf: configure root_api_key before startup
# client ovcli.conf: configure the same root_api_key
ov admin register-user default <your_name>
# client ovcli.conf: use the returned api_key for normal access
```

Performance Is A Pipeline Problem

Once the storage model is distributed, capacity is mostly a deployment choice. Performance is harder because write requests touch parsing, splitting, VLM calls, embedding, summarization, memory extraction, IO movement, and locks.

  • Vector database: use VikingDB DSL filters for shared pools; dedicate a vector database for large tenants.
  • Filesystem: local FS is fast but fragile; S3/TOS scales but can slow the agent loop.
  • Write pipeline: parsing, splitting, VLM calls, embeddings, summaries, and memory extraction dominate latency.
  • Locks: directory/file locks protect conflicting writes; transaction semantics are still evolving.

| Layer | Lightweight mode | Heavy mode | Tradeoff |
| --- | --- | --- | --- |
| Vector database | Shared VikingDB pool with Account/User scalar filters. | Dedicated vector database per OpenViking instance. | Shared mode saves resources; dedicated mode removes the practical index ceiling. |
| Filesystem | Local FS, ByteNAS, or managed shared FS. | TOS/S3 or EFS-like remote storage. | Local is fast; object storage scales but slows agent loops. |
| Write pipeline | Queue model calls and embedding work. | Globally controlled parallel ingestion. | More throughput, but lock and ordering costs become visible. |
Write pipeline

The bottleneck is a chain, not one database call: model calls (VLM, embedding, summary, and memory extraction) dominate the latency profile.

Current optimization directions

  1. Queue and parallelize model calls with global concurrency control.
  2. Replace the Go AGFS server path with embedded calls and Rust where transfer cost matters.
  3. Parallelize tree operations such as `find` and `tree`.
  4. Reduce copies across receive, work, and visible directories during upload.
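The first direction can be sketched with a global semaphore capping in-flight model calls, so embedding and VLM work cannot starve the rest of the server. `asyncio` stands in for whatever queue the Python server actually uses:

```python
import asyncio

async def call_model(item: str, limit: asyncio.Semaphore) -> str:
    async with limit:               # global cap shared by all ingestion pipelines
        await asyncio.sleep(0.001)  # stands in for an embedding/VLM call
        return f"embedded:{item}"

async def ingest_all(items: list[str], max_inflight: int = 4) -> list[str]:
    limit = asyncio.Semaphore(max_inflight)
    # gather preserves input order while the semaphore bounds concurrency
    return await asyncio.gather(*(call_model(i, limit) for i in items))

results = asyncio.run(ingest_all([f"chunk-{n}" for n in range(8)]))
print(results[0], len(results))
```

The same cap can sit in front of summarization and memory extraction, which is what makes the control "global" rather than per-pipeline.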

Privacy: Context Is Plaintext

A context database stores the material an agent uses to reason. That material is often sensitive by definition. OpenViking handles this with API-key identity, root isolation, user-scoped `viking://user` visibility, optional file encryption, and experimental Skill privacy configs.

| Control | Purpose |
| --- | --- |
| `dev` | Local development mode without authentication. |
| `api_key` | Required when the service listens beyond localhost. |
| `ov --sudo` | Root identity is explicit and limited to admin actions. |
| `viking://user` | Private data scope filtered at the index layer. |
| Privacy configs | Store Skill secrets in protected storage and restore placeholders at read time. |

Encryption is implemented, but it is not free. Different tenants or accounts can use different keys, which improves blast-radius control, but remote storage has to be decrypted before operations such as `grep`. For a context database, privacy controls affect latency and operator ergonomics, not only compliance posture.

privacy-config.sh

```sh
openviking privacy categories
openviking privacy list skill
openviking privacy upsert skill byted-viking-search-knowledgebase \
  --values-json '{"api_key":"secret-2","base_url":"https://example.com"}'
openviking privacy activate skill byted-viking-search-knowledgebase 2
```
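The behavior behind these commands can be sketched as placeholder restoration at read time: the real values live in protected storage and are substituted only when policy allows. The `{{api_key}}` placeholder syntax and the store layout are assumptions for illustration, not the documented format:

```python
import re

# Protected store (assumption: keyed by skill name).
PROTECTED = {"byted-viking-search-knowledgebase": {"api_key": "secret-2"}}

def restore(skill: str, text: str, policy_allows: bool) -> str:
    """Replace {{name}} placeholders with protected values when policy allows."""
    if not policy_allows:
        return text  # the caller sees only the placeholder
    secrets = PROTECTED.get(skill, {})
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: secrets.get(m.group(1), m.group(0)), text)

cfg = "Authorization: Bearer {{api_key}}"
print(restore("byted-viking-search-knowledgebase", cfg, policy_allows=True))
print(restore("byted-viking-search-knowledgebase", cfg, policy_allows=False))
```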

Evaluation Still Drives The Product

The article only sketches evaluation because it deserves its own write-up. The important point is that OpenViking measures across RAG, memory, long tasks, SWE tasks, and multi-agent and human-agent collaboration.

| Benchmark area | Architecture question it answers |
| --- | --- |
| RAG | Does directory-scoped retrieval improve answer accuracy and reduce irrelevant context? |
| Memory | Can user and agent isolation preserve useful memory without leaking private state? |
| Long tasks | Do L0/L1/L2 summaries help agents plan across large context without exhausting tokens? |
| SWE | Do preserved repository paths and scoped retrieval help agents navigate codebases? |
| Multi-agent collaboration | Can peer identity and permission boundaries support shared work without collapsing ownership? |

Scenarios

  • RAG benchmarks
  • Memory benchmarks
  • Long-task benchmarks
  • SWE benchmarks

Dimensions

  • Accuracy
  • Task latency
  • Token consumption
  • Multimodal support
Evaluation

A product dashboard for context quality. The numbers are illustrative scoring dimensions; the RAG panel, for example, asks whether the system can retrieve the right evidence before the model answers:

| Dimension | Illustrative score |
| --- | --- |
| Accuracy | 82 |
| Task latency | 68 |
| Token economy | 74 |
| Multimodal support | 71 |

What To Remember

The critical architectural insight is that context is not a blob. It has paths, scopes, identities, consistency constraints, performance budgets, and privacy boundaries. OpenViking is useful because it lets agents consume those properties through an interface they can already navigate.

The architecture is still moving from concept to product construction. The open-source release has already produced enough usage, issues, and feedback to make capacity and performance the next hard priorities. The useful thing about the design is not that every consistency or latency question is closed; it is that OpenViking names the database properties context systems need to expose before agents can depend on them.

The source note closes by thanking more than 150 contributors and participants, over 1000 merged changes, and a community that has pushed the project past 23k stars. That matters because the remaining questions are not slideware questions; they are the questions that show up when real agents, data, and users start sharing the same context substrate.