Session 2: What Conversation Data Is Accessible Locally? Where Are Transcripts Stored?

Date: 2026-03-21 17:03 PST
Focus: Mapping the full data landscape available to a Hippocampus plugin — what exists, where it lives, what format it's in, and what's indexable.


Key Finding: The Data Is Rich, Structured, and Already On Disk

Every conversation that passes through OpenClaw is persisted as JSONL transcripts on the local filesystem. This is a goldmine for a Hippocampus-style system.

Data Location

~/.openclaw/agents/<agentId>/sessions/
├── sessions.json                          # Session store (key → metadata map)
├── <sessionId>.jsonl                      # Per-session transcript (append-only)
├── <sessionId>-topic-<threadId>.jsonl     # Telegram topic transcripts
├── *.jsonl.reset.<timestamp>              # Archived transcripts from /new resets
└── *.jsonl.deleted.<timestamp>            # Deleted/pruned transcripts

Current installation stats:

  • 174 active JSONL transcript files
  • 186 session keys in sessions.json
  • ~36 MB total session storage
  • Sessions span: main DM, Telegram topics, cron runs, slash commands, etc.
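A plugin enumerating this directory needs to separate live transcripts from archives and topic threads. A minimal classifier sketch based on the naming patterns above (the regexes are my assumptions from the layout, not OpenClaw's documented rules):

```javascript
// Classify a filename from the sessions directory into a transcript category.
// Patterns mirror the directory tree above; adjust if OpenClaw's naming differs.
function classifyTranscript(name) {
  if (name === 'sessions.json') return 'store';
  if (/\.jsonl\.reset\.\d+$/.test(name)) return 'archived';   // /new resets
  if (/\.jsonl\.deleted\.\d+$/.test(name)) return 'deleted';  // pruned sessions
  if (/-topic-[^.]+\.jsonl$/.test(name)) return 'topic';      // Telegram topics
  if (/\.jsonl$/.test(name)) return 'active';                 // live transcript
  return 'other';
}
```

Archived and deleted transcripts still contain indexable history, so a bulk indexer would probably want to include them rather than skip them.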

Transcript Structure (JSONL)

Each .jsonl file is structured as a tree with id + parentId relationships:

{"type":"session","version":3,"id":"<uuid>","timestamp":"...","cwd":"..."}
{"type":"model_change","id":"...","parentId":null,"provider":"anthropic","modelId":"claude-opus-4-6"}
{"type":"thinking_level_change","id":"...","parentId":"...","thinkingLevel":"medium"}
{"type":"custom","customType":"model-snapshot","data":{...},"id":"...","parentId":"..."}
{"type":"message","id":"...","parentId":"...","timestamp":"...","message":{"role":"user","content":[{"type":"text","text":"..."}]}}
{"type":"message","id":"...","parentId":"...","timestamp":"...","message":{"role":"assistant","content":[{"type":"text","text":"..."}]}}

Entry types relevant to Hippocampus:

Type                         Hippocampus Use                      Index Priority
message (role: user)         Primary query material               ⭐ HIGH
message (role: assistant)    Context + answers                    ⭐ HIGH
message (role: toolResult)   Factual data returned by tools       MEDIUM
custom_message               Extension-injected context           MEDIUM
compaction                   Summaries of older conversation      ⭐ HIGH (dense info)
session header               Session metadata (cwd, timestamp)    LOW (metadata only)
model_change                 Model routing history                LOW
custom (model-snapshot)      Not needed                           SKIP
thinking_level_change        Not needed                           SKIP

What's In the Messages

User messages include rich metadata injected by OpenClaw:

{
  "role": "user",
  "content": [{
    "type": "text",
    "text": "Conversation info (untrusted metadata):\n```json\n{\"sender\": \"Jeffrey Flynn\", \"conversation_label\": \"Jeff & Jules\", \"topic_id\": \"13\"}\n```\n\nActual user message here"
  }]
}

This means each message carries:

  • Sender identity (name, ID)
  • Conversation context (channel, group, topic)
  • Timestamp
  • The actual content after the metadata block

Hippocampus parsing strategy: Strip the metadata prefix to get clean content for semantic indexing, but preserve metadata for filtering (e.g., "what did Jeff say about X" vs "what did I say in the Ideas topic").
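A sketch of that split, assuming the metadata prefix always follows the exact shape shown in the example above (a fixed header line, a fenced JSON block, then the real message):

```javascript
// Split an OpenClaw-enriched user message into parsed metadata and clean content.
// The prefix format is inferred from the example above; treat it as an assumption.
const META_RE = /^Conversation info \(untrusted metadata\):\n```json\n([\s\S]*?)\n```\n+/;

function splitUserMessage(text) {
  const m = text.match(META_RE);
  if (!m) return { meta: null, content: text };  // no prefix: index as-is
  let meta = null;
  try { meta = JSON.parse(m[1]); } catch { /* malformed block: keep meta null */ }
  return { meta, content: text.slice(m[0].length) };
}
```

The clean `content` goes to the embedder; the parsed `meta` (sender, conversation_label, topic_id) becomes filterable chunk metadata.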

Session Store Metadata (sessions.json)

The session store maps sessionKey → SessionEntry with:

{
  sessionId: string,           // Links to transcript file
  updatedAt: number,           // Last activity timestamp
  chatType: "direct" | "group" | "room",
  provider: string,            // Channel (telegram, discord, etc.)
  subject?: string,            // Group/topic name
  displayName?: string,        // Human-readable session label
  inputTokens: number,         // Cumulative token usage
  outputTokens: number,
  totalTokens: number,
  contextTokens: number,
  compactionCount: number,     // How many times compaction ran
}

Hippocampus use: Session metadata enables filtering by channel, topic, recency, and conversation type. This is crucial for scoped retrieval ("what did we discuss in the Projects topic?").
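A sketch of that scoped lookup over the session store, assuming `sessions.json` is a plain key → SessionEntry object with the fields listed above:

```javascript
// Select sessions for scoped retrieval: filter by provider/chatType/recency,
// then rank newest-first. Field names follow the SessionEntry shape above.
function selectSessions(store, { provider, chatType, since } = {}) {
  return Object.entries(store)
    .map(([sessionKey, entry]) => ({ sessionKey, ...entry }))
    .filter(s => (!provider || s.provider === provider)
              && (!chatType || s.chatType === chatType)
              && (!since || s.updatedAt >= since))
    .sort((a, b) => b.updatedAt - a.updatedAt);
}
```

For example, `selectSessions(store, { provider: 'telegram', chatType: 'group' })` narrows a query like "what did we discuss in the Projects topic?" to the transcripts worth searching.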


Existing Index Infrastructure

memory_search (Already Operational)

Current config:

{
  memorySearch: {
    provider: "ollama",        // Local Nomic embeddings
    fallback: "gemini",        // Remote fallback
    model: "nomic-embed-text"
  }
}

This indexes: - MEMORY.md - memory/*.md

It does NOT index session transcripts by default.

Session Memory Search (Available but Not Enabled)

OpenClaw already has an experimental feature for indexing sessions:

agents: {
  defaults: {
    memorySearch: {
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"]
    }
  }
}

This is significant. OpenClaw already has the machinery to index JSONL transcripts into the same vector store used by memory_search. It's opt-in, debounced, and async.

QMD Backend (Not Enabled)

QMD offers a more sophisticated approach: BM25 + vectors + reranking. It can also index sessions:

memory: {
  backend: "qmd",
  qmd: {
    sessions: {
      enabled: true,
      retentionDays: 30,
      exportDir: "~/.openclaw/agents/<id>/qmd/sessions/"
    }
  }
}

QMD sanitizes transcripts (User/Assistant turns only) into a dedicated collection.


Data Access Paths for a Hippocampus Plugin

A Hippocampus plugin running in-process has three data access paths:

Path 1: Hook-Based Real-Time Stream (from Session 1)

message:preprocessed → Current turn content (fully enriched)
message:sent → Agent's response
before_prompt_build → Conversation history (messages array)

Pro: Real-time, no disk I/O, already parsed
Con: Only current session's messages, no cross-session history

Path 2: Direct JSONL File Reading

// Plugin reads transcript files directly.
// Note: fs does not expand '~', so build the path from os.homedir().
const os = require('os');
const path = require('path');
const fs = require('fs');

const transcriptPath = path.join(
  os.homedir(), '.openclaw', 'agents', agentId, 'sessions', `${sessionId}.jsonl`);
const messages = fs.readFileSync(transcriptPath, 'utf-8')
  .split('\n')
  .filter(l => l.trim())
  .map(l => JSON.parse(l))
  .filter(e => e.type === 'message');

Pro: Full historical access, all sessions, all conversations
Con: Disk I/O, needs parsing, can be large (36 MB+ and growing)

Path 3: Piggyback on Existing Memory Infrastructure

If experimental.sessionMemory or QMD session indexing is enabled, the Hippocampus plugin could query the already-indexed session data via the same vector store.

Pro: No duplicate indexing, leverages existing infrastructure
Con: Dependent on user enabling the feature, less control over indexing strategy


Cross-Session Data: The Key Differentiator

The critical gap that Hippocampus fills is cross-session retrieval. Here's what currently exists:

System                    Current Session              Past Sessions (same key)     Past Sessions (different key)
Model context window      ✅                           ❌ (compacted away)          ❌
memory_search (files)     ✅ (if written to memory)    ✅ (if written to memory)    ✅ (if written to memory)
memory_search (sessions)  ❌ (experimental)            ✅ (if enabled)              ✅ (if enabled)
QMD sessions              ❌                           ✅ (if enabled)              ✅ (if enabled)
Hippocampus               ✅                           ✅                           ✅

The key insight: memory_search only finds things the agent explicitly wrote down. Hippocampus indexes everything — including the vast majority of conversation content that never gets written to memory/*.md.

The Memory Gap Problem

Consider this flow:

  1. Monday: Jeff and Jules discuss a vendor's API pricing ($0.02/1K tokens)
  2. Tuesday: Session resets. That pricing detail was in conversation but never written to memory
  3. Wednesday: Jeff asks "what was that vendor's pricing?"
  4. Current state: Lost. memory_search finds nothing. The JSONL transcript has it, but nothing queries it.

Hippocampus closes this gap by indexing the transcripts and surfacing relevant context automatically.


Indexing Strategy for Hippocampus

Given the data landscape, here's the optimal indexing approach:

What to Index

  1. All message entries with role user or assistant — the conversational content
  2. compaction entries — these are LLM-generated summaries, extremely information-dense
  3. Strip metadata prefixes from user messages (conversation info blocks)
  4. Skip tool calls, tool results (usually too noisy), model changes, thinking levels
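The selection rules above reduce to a small predicate over transcript entries. A sketch, using the entry types listed in the table earlier:

```javascript
// Decide whether a transcript entry is worth indexing, per the rules above.
function shouldIndex(entry) {
  if (entry.type === 'compaction') return true;       // LLM summaries: dense info
  if (entry.type !== 'message') return false;         // model/thinking changes, snapshots
  const role = entry.message && entry.message.role;
  return role === 'user' || role === 'assistant';     // skip toolResult noise
}
```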

How to Chunk

JSONL transcripts are naturally chunked by turn. Options:

  1. Per-turn chunks — each message is one chunk. Simple, but short turns may lack context.
  2. Turn-pair chunks — user + assistant paired together. Better semantic coherence.
  3. Sliding window — N turns with overlap. Best retrieval but more storage.

Recommendation: Turn-pair chunks (user question + assistant response) as the primary unit. This gives semantic completeness — the question provides the query surface, the answer provides the information surface.
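A sketch of the pairing pass, assuming messages arrive in conversation order as simple `{ role, text }` records (the real entries would first be flattened from the JSONL tree):

```javascript
// Pair each user message with the assistant reply that follows it.
// Unpaired messages fall back to single-turn chunks.
function buildTurnPairs(messages) {
  const chunks = [];
  for (let i = 0; i < messages.length; i++) {
    const cur = messages[i];
    const next = messages[i + 1];
    if (cur.role === 'user' && next && next.role === 'assistant') {
      chunks.push({ chunkType: 'turn-pair', text: cur.text + '\n\n' + next.text });
      i++; // consume the paired assistant turn
    } else {
      chunks.push({ chunkType: 'single-turn', text: cur.text });
    }
  }
  return chunks;
}
```

Consecutive assistant turns (e.g. follow-up messages) become single-turn chunks rather than being glued to the wrong question.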

Chunk Metadata

Each chunk should carry:

{
  sessionKey: string,      // Which conversation
  sessionId: string,       // Which transcript
  timestamp: number,       // When it happened
  channelId: string,       // telegram, discord, etc.
  chatType: string,        // direct, group, topic
  topicId?: string,        // Telegram topic ID
  senderId?: string,       // Who said it
  chunkType: "turn-pair" | "compaction" | "single-turn",
}

This metadata enables filtered retrieval: "what did we discuss in the Ideas topic this week?"
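A sketch of that pre-filter, narrowing candidate chunks by metadata before any semantic ranking runs (field names follow the chunk schema above):

```javascript
// Scope candidate chunks by metadata: e.g. "Ideas topic, this week".
// Runs before semantic ranking so embeddings only compare in-scope chunks.
function scopeChunks(chunks, { topicId, afterTs } = {}) {
  return chunks.filter(c =>
    (!topicId || c.topicId === topicId) &&
    (!afterTs || c.timestamp >= afterTs));
}
```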


Size Estimation

Current data:

  • 174 transcripts, ~36 MB total
  • Average transcript: ~207 KB
  • Estimated ~50-100 turn-pairs per transcript
  • Estimated total: ~10,000-17,000 chunks

With nomic-embed-text (768 dimensions, float32):

  • Per chunk: 768 × 4 = 3,072 bytes
  • 17,000 chunks × 3,072 bytes = ~52 MB for embeddings
  • Plus chunk text storage: ~36 MB
  • Total index size: ~88 MB — very manageable for local SQLite

Growth rate: ~1-5 MB/day of new transcripts → roughly 1.5-7 MB/day of new embeddings at the chunk density above (embeddings run ~1.4× raw text in this estimate). Still not a concern for local storage.


Coexistence with memory_search

This is a critical design question. Options:

Option 1: Replace memory_search

Bad idea. memory_search serves curated, high-signal memory. Hippocampus serves broad conversational recall. Different purposes.

Option 2: Run in parallel (recommended)

Hippocampus runs alongside memory_search. The before_prompt_build hook injects Hippocampus results as prependContext, while memory_search continues to work via the agent's memory_search tool calls.

Key difference:

  • memory_search = agent-initiated ("let me look this up")
  • Hippocampus = system-initiated ("you might need this")

Option 3: Unify

Merge both into one retrieval system. More complex, diminishing returns for v1.


Open Questions for Next Sessions

  1. Prototype: What's the simplest thing that proves turn-pair indexing + before_prompt_build injection works? (Session 3)
  2. Algorithm: How does the plugin decide what's relevant? Keyword match? Semantic similarity threshold? Topic detection? (Session 4)
  3. Index tech: SQLite FTS5 vs vector embeddings vs hybrid for the index? (Session 5)

Session 2 Verdict

The data is there and it's excellent. Every conversation is persisted as structured JSONL with rich metadata. The Hippocampus plugin has full filesystem access to:

  • 174+ transcript files (~36 MB)
  • Turn-by-turn message content with sender/channel/topic metadata
  • Compaction summaries (information-dense)
  • Session store for filtering and discovery

The architecture from Session 1 + the data from Session 2 = a complete picture:

Real-time stream (hooks)       → Index new turns as they happen
Historical transcripts (JSONL) → Bulk index on first run
Session store metadata         → Filter/scope retrieval
before_prompt_build            → Inject relevant context

The biggest insight: OpenClaw already has the building blocks (experimental session memory, QMD session indexing). Hippocampus could either build on those or replace them with a more purpose-built retrieval system. The key differentiator is the automatic injection — memory_search requires the agent to think "I should search for this," while Hippocampus just does it.