Session 2: What Conversation Data Is Accessible Locally? Where Are Transcripts Stored?¶
Date: 2026-03-21 17:03 PST
Focus: Mapping the full data landscape available to a Hippocampus plugin — what exists, where it lives, what format it's in, and what's indexable.
Key Finding: The Data Is Rich, Structured, and Already On Disk¶
Every conversation that passes through OpenClaw is persisted as JSONL transcripts on the local filesystem. This is a goldmine for a Hippocampus-style system.
Data Location¶
~/.openclaw/agents/<agentId>/sessions/
├── sessions.json # Session store (key → metadata map)
├── <sessionId>.jsonl # Per-session transcript (append-only)
├── <sessionId>-topic-<threadId>.jsonl # Telegram topic transcripts
├── *.jsonl.reset.<timestamp> # Archived transcripts from /new resets
└── *.jsonl.deleted.<timestamp> # Deleted/pruned transcripts
Current installation stats:
- 174 active JSONL transcript files
- 186 session keys in sessions.json
- ~36 MB total session storage
- Sessions span: main DM, Telegram topics, cron runs, slash commands, etc.
Transcript Structure (JSONL)¶
Each .jsonl file is structured as a tree with id + parentId relationships:
{"type":"session","version":3,"id":"<uuid>","timestamp":"...","cwd":"..."}
{"type":"model_change","id":"...","parentId":null,"provider":"anthropic","modelId":"claude-opus-4-6"}
{"type":"thinking_level_change","id":"...","parentId":"...","thinkingLevel":"medium"}
{"type":"custom","customType":"model-snapshot","data":{...},"id":"...","parentId":"..."}
{"type":"message","id":"...","parentId":"...","timestamp":"...","message":{"role":"user","content":[{"type":"text","text":"..."}]}}
{"type":"message","id":"...","parentId":"...","timestamp":"...","message":{"role":"assistant","content":[{"type":"text","text":"..."}]}}
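Because entries form a tree rather than a flat list, recovering the linear conversation means walking parentId links back from the current leaf. A minimal sketch, assuming the entries are already parsed JSONL objects in file order:

```javascript
// Sketch: recover the linear conversation path from the id/parentId tree.
// Walks from the last entry (the current leaf) back to the root, then reverses.
function linearize(entries) {
  const byId = new Map(entries.filter(e => e.id).map(e => [e.id, e]));
  const ordered = [];
  let cur = entries[entries.length - 1];
  while (cur) {
    ordered.unshift(cur);
    // The session header has no parent; a null/missing parentId ends the walk.
    cur = cur.parentId ? byId.get(cur.parentId) : null;
  }
  return ordered;
}
```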
Entry types relevant to Hippocampus:
| Type | Hippocampus Use | Index Priority |
|---|---|---|
| message (role: user) | Primary query material | ⭐ HIGH |
| message (role: assistant) | Context + answers | ⭐ HIGH |
| message (role: toolResult) | Factual data returned by tools | MEDIUM |
| custom_message | Extension-injected context | MEDIUM |
| compaction | Summaries of older conversation | ⭐ HIGH (dense info) |
| session header | Session metadata (cwd, timestamp) | LOW (metadata only) |
| model_change | Model routing history | LOW |
| custom (model-snapshot) | Not needed | SKIP |
| thinking_level_change | Not needed | SKIP |
What's In the Messages¶
User messages include rich metadata injected by OpenClaw:
{
"role": "user",
"content": [{
"type": "text",
"text": "Conversation info (untrusted metadata):\n```json\n{\"sender\": \"Jeffrey Flynn\", \"conversation_label\": \"Jeff & Jules\", \"topic_id\": \"13\"}\n```\n\nActual user message here"
}]
}
This means each message carries:
- Sender identity (name, ID)
- Conversation context (channel, group, topic)
- Timestamp
- The actual content after the metadata block
Hippocampus parsing strategy: Strip the metadata prefix to get clean content for semantic indexing, but preserve metadata for filtering (e.g., "what did Jeff say about X" vs "what did I say in the Ideas topic").
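The strip-and-preserve step can be sketched as follows; the prefix format is assumed from the sample above, and messages without a metadata block pass through unchanged:

```javascript
// Sketch: separate the injected metadata prefix from the real message body.
// The exact prefix format is an assumption based on the observed sample.
const FENCE = '```';
function splitUserMessage(text) {
  const prefix = `Conversation info (untrusted metadata):\n${FENCE}json\n`;
  if (!text.startsWith(prefix)) return { metadata: null, content: text };
  const end = text.indexOf(`\n${FENCE}\n\n`, prefix.length);
  if (end === -1) return { metadata: null, content: text };
  let metadata = null;
  try {
    metadata = JSON.parse(text.slice(prefix.length, end));
  } catch {
    // Malformed metadata JSON: keep null, still return clean content.
  }
  return { metadata, content: text.slice(end + FENCE.length + 3) };
}
```

The clean `content` goes to the embedder; the parsed `metadata` (sender, conversation_label, topic_id) is stored alongside the chunk for filtered retrieval.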
Session Store Metadata (sessions.json)¶
The session store maps sessionKey → SessionEntry with:
{
sessionId: string, // Links to transcript file
updatedAt: number, // Last activity timestamp
chatType: "direct" | "group" | "room",
provider: string, // Channel (telegram, discord, etc.)
subject?: string, // Group/topic name
displayName?: string, // Human-readable session label
inputTokens: number, // Cumulative token usage
outputTokens: number,
totalTokens: number,
contextTokens: number,
compactionCount: number, // How many times compaction ran
}
Hippocampus use: Session metadata enables filtering by channel, topic, recency, and conversation type. This is crucial for scoped retrieval ("what did we discuss in the Projects topic?").
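A minimal sketch of scoped session discovery, using the SessionEntry shape above (filtering is separated from file loading so it can run on any parsed store object):

```javascript
// Sketch: filter the sessionKey -> SessionEntry map from sessions.json
// by channel, chat type, and recency, newest first.
function filterSessions(store, { provider, chatType, sinceMs = 0 } = {}) {
  return Object.entries(store)
    .filter(([, e]) =>
      e.updatedAt >= sinceMs &&
      (!provider || e.provider === provider) &&
      (!chatType || e.chatType === chatType))
    .sort(([, a], [, b]) => b.updatedAt - a.updatedAt)
    .map(([key, e]) => ({ key, sessionId: e.sessionId, subject: e.subject }));
}
```

The returned `sessionId` values link directly to the transcript files to read or index.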
Existing Index Infrastructure¶
memory_search (Already Operational)¶
Current config:
{
memorySearch: {
provider: "ollama", // Local Nomic embeddings
fallback: "gemini", // Remote fallback
model: "nomic-embed-text"
}
}
This indexes:
- MEMORY.md
- memory/*.md
It does NOT index session transcripts by default.
Session Memory Search (Available but Not Enabled)¶
OpenClaw already has an experimental feature for indexing sessions:
agents: {
defaults: {
memorySearch: {
experimental: { sessionMemory: true },
sources: ["memory", "sessions"]
}
}
}
This is significant. OpenClaw already has the machinery to index JSONL transcripts into the same vector store used by memory_search. It's opt-in, debounced, and async.
QMD Backend (Not Enabled)¶
QMD offers a more sophisticated approach: BM25 + vectors + reranking. It can also index sessions:
memory: {
backend: "qmd",
qmd: {
sessions: {
enabled: true,
retentionDays: 30,
exportDir: "~/.openclaw/agents/<id>/qmd/sessions/"
}
}
}
QMD sanitizes transcripts (User/Assistant turns only) into a dedicated collection.
Data Access Paths for a Hippocampus Plugin¶
A Hippocampus plugin running in-process has three data access paths:
Path 1: Hook-Based Real-Time Stream (from Session 1)¶
message:preprocessed → Current turn content (fully enriched)
message:sent → Agent's response
before_prompt_build → Conversation history (messages array)
Pro: Real-time, no disk I/O, already parsed
Con: Only current session's messages, no cross-session history
Path 2: Direct JSONL File Reading¶
// Plugin reads transcript files directly. Note: fs does not expand "~",
// so the path must be built from os.homedir().
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

const transcriptPath = path.join(
  os.homedir(), '.openclaw', 'agents', agentId, 'sessions', `${sessionId}.jsonl`
);
const messages = fs.readFileSync(transcriptPath, 'utf-8')
  .split('\n')
  .filter(l => l.trim())
  .map(l => JSON.parse(l))
  .filter(e => e.type === 'message');
Pro: Full historical access, all sessions, all conversations
Con: Disk I/O, needs parsing, can be large (36 MB+ and growing)
Path 3: Piggyback on Existing Memory Infrastructure¶
If experimental.sessionMemory or QMD session indexing is enabled, the Hippocampus plugin could query the already-indexed session data via the same vector store.
Pro: No duplicate indexing, leverages existing infrastructure
Con: Dependent on the user enabling the feature, less control over indexing strategy
Cross-Session Data: The Key Differentiator¶
The critical gap that Hippocampus fills is cross-session retrieval. Here's what currently exists:
| System | Current Session | Past Sessions (same key) | Past Sessions (different key) |
|---|---|---|---|
| Model context window | ✅ | ❌ (compacted away) | ❌ |
| memory_search (files) | ✅ | ✅ (if written to memory) | ✅ (if written to memory) |
| memory_search (sessions) | ❌ (experimental) | ✅ (if enabled) | ✅ (if enabled) |
| QMD sessions | ❌ | ✅ (if enabled) | ✅ (if enabled) |
| Hippocampus | ✅ | ✅ | ✅ |
The key insight: memory_search only finds things the agent explicitly wrote down. Hippocampus indexes everything — including the vast majority of conversation content that never gets written to memory/*.md.
The Memory Gap Problem¶
Consider this flow:
1. Monday: Jeff and Jules discuss a vendor's API pricing ($0.02/1K tokens)
2. Tuesday: Session resets. That pricing detail was in conversation but never written to memory
3. Wednesday: Jeff asks "what was that vendor's pricing?"
4. Current state: Lost. memory_search finds nothing. The JSONL transcript has it, but nothing queries it.
Hippocampus closes this gap by indexing the transcripts and surfacing relevant context automatically.
Indexing Strategy for Hippocampus¶
Given the data landscape, here's the optimal indexing approach:
What to Index¶
- All message entries with role user or assistant — the conversational content
- compaction entries — these are LLM-generated summaries, extremely information-dense
- Strip metadata prefixes from user messages (conversation info blocks)
- Skip tool calls, tool results (usually too noisy), model changes, thinking levels
How to Chunk¶
JSONL transcripts are naturally chunked by turn. Options:
- Per-turn chunks — each message is one chunk. Simple, but short turns may lack context.
- Turn-pair chunks — user + assistant paired together. Better semantic coherence.
- Sliding window — N turns with overlap. Best retrieval but more storage.
Recommendation: Turn-pair chunks (user question + assistant response) as the primary unit. This gives semantic completeness — the question provides the query surface, the answer provides the information surface.
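The turn-pair strategy can be sketched as a single pass over parsed message entries; unpaired turns (e.g. a trailing user message with no reply yet) fall back to single-turn chunks:

```javascript
// Sketch: walk message entries in transcript order and pair each user turn
// with the assistant turn that immediately follows it.
function buildTurnPairs(entries) {
  const turns = entries.filter(e => e.type === 'message');
  const chunks = [];
  for (let i = 0; i < turns.length; i++) {
    const cur = turns[i];
    const next = turns[i + 1];
    if (cur.message.role === 'user' && next?.message.role === 'assistant') {
      chunks.push({ chunkType: 'turn-pair', user: cur, assistant: next });
      i++; // consume the paired assistant turn
    } else {
      chunks.push({ chunkType: 'single-turn', turn: cur });
    }
  }
  return chunks;
}
```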
Chunk Metadata¶
Each chunk should carry:
{
sessionKey: string, // Which conversation
sessionId: string, // Which transcript
timestamp: number, // When it happened
channelId: string, // telegram, discord, etc.
chatType: string, // direct, group, topic
topicId?: string, // Telegram topic ID
senderId?: string, // Who said it
chunkType: "turn-pair" | "compaction" | "single-turn",
}
This metadata enables filtered retrieval: "what did we discuss in the Ideas topic this week?"
Size Estimation¶
Current data:
- 174 transcripts, ~36 MB total
- Average transcript: ~207 KB
- Estimated ~50-100 turn-pairs per transcript
- Estimated total: ~10,000-17,000 chunks
With nomic-embed-text (768 dimensions, float32):
- Per chunk: 768 × 4 = 3,072 bytes
- 17,000 chunks × 3,072 bytes ≈ 52 MB for embeddings
- Plus chunk text storage: ~36 MB
- Total index size: ~88 MB — very manageable for local SQLite
Growth rate: ~1-5 MB/day of new transcripts. By the same text-to-embedding ratio as above (~36 MB of text → ~52 MB of embeddings, about 1.4×), that means roughly 1.5-7 MB/day of new index data. Not a concern for local storage.
Interaction with Existing memory_search¶
This is a critical design question. Options:
Option A: Replace memory_search¶
Bad idea. memory_search serves curated, high-signal memory. Hippocampus serves broad conversational recall. Different purposes.
Option B: Complement memory_search (Recommended)¶
Hippocampus runs alongside memory_search. The before_prompt_build hook injects Hippocampus results as prependContext, while memory_search continues to work via the agent's memory_search tool calls.
Key difference:
- memory_search = agent-initiated ("let me look this up")
- Hippocampus = system-initiated ("you might need this")
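A minimal sketch of the Option B wiring, assuming the hook and prependContext field named in Session 1's notes; hippocampusSearch is a hypothetical retrieval function, shown synchronously for brevity (a real one would await a vector store):

```javascript
// Sketch: inject auto-retrieved context on every prompt build.
// 'before_prompt_build' and ctx.prependContext follow Session 1's hook notes;
// hippocampusSearch is a hypothetical retrieval function (an assumption).
function registerHippocampus(api, hippocampusSearch) {
  api.on('before_prompt_build', (ctx) => {
    const lastUser = [...ctx.messages].reverse().find(m => m.role === 'user');
    if (!lastUser) return;
    const hits = hippocampusSearch(lastUser.content, { limit: 3 });
    if (!hits.length) return; // nothing relevant: inject nothing
    ctx.prependContext = [
      'Relevant past conversation (auto-retrieved):',
      ...hits.map(h => `- [${h.when}] ${h.snippet}`),
    ].join('\n');
  });
}
```

memory_search is untouched by this: the agent can still call its tool explicitly, while the hook supplies context the agent never thought to ask for.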
Option C: Unified Search¶
Merge both into one retrieval system. More complex, diminishing returns for v1.
Open Questions for Next Sessions¶
- Prototype: What's the simplest thing that proves turn-pair indexing + before_prompt_build injection works? (Session 3)
- Algorithm: How does the plugin decide what's relevant? Keyword match? Semantic similarity threshold? Topic detection? (Session 4)
- Index tech: SQLite FTS5 vs vector embeddings vs hybrid for the index? (Session 5)
Session 2 Verdict¶
The data is there and it's excellent. Every conversation is persisted as structured JSONL with rich metadata. The Hippocampus plugin has full filesystem access to:
- 174+ transcript files (~36 MB)
- Turn-by-turn message content with sender/channel/topic metadata
- Compaction summaries (information-dense)
- Session store for filtering and discovery
The architecture from Session 1 + the data from Session 2 = a complete picture:
Real-time stream (hooks) → Index new turns as they happen
Historical transcripts (JSONL) → Bulk index on first run
Session store metadata → Filter/scope retrieval
before_prompt_build → Inject relevant context
The biggest insight: OpenClaw already has the building blocks (experimental session memory, QMD session indexing). Hippocampus could either build on those or replace them with a more purpose-built retrieval system. The key differentiator is the automatic injection — memory_search requires the agent to think "I should search for this," while Hippocampus just does it.