Session 2: What Conversation Data Is Accessible Locally? Where Are Transcripts Stored?

Date: 2026-03-21 17:03 PST
Focus: Mapping the full data landscape available to a Hippocampus plugin — what exists, where it lives, what format it's in, and what's indexable.


Key Finding: The Data Is Rich, Structured, and Already On Disk

Every conversation that passes through OpenClaw is persisted as JSONL transcripts on the local filesystem. This is a goldmine for a Hippocampus-style system.

Data Location

~/.openclaw/agents/<agentId>/sessions/
├── sessions.json                          # Session store (key → metadata map)
├── <sessionId>.jsonl                      # Per-session transcript (append-only)
├── <sessionId>-topic-<threadId>.jsonl     # Telegram topic transcripts
├── *.jsonl.reset.<timestamp>              # Archived transcripts from /new resets
└── *.jsonl.deleted.<timestamp>            # Deleted/pruned transcripts

Current installation stats:

  • 174 active JSONL transcript files
  • 186 session keys in sessions.json
  • ~36 MB total session storage
  • Sessions span: main DM, Telegram topics, cron runs, slash commands, etc.
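A plugin enumerating this directory needs to separate live transcripts from archives and topic threads. A minimal classifier sketch based on the naming patterns above (the regexes are my assumptions from the layout, not OpenClaw's documented rules):

```javascript
// Classify a filename from the sessions directory into a transcript category.
// Patterns mirror the directory tree above; adjust if OpenClaw's naming differs.
function classifyTranscript(name) {
  if (name === 'sessions.json') return 'store';
  if (/\.jsonl\.reset\.\d+$/.test(name)) return 'archived';   // /new resets
  if (/\.jsonl\.deleted\.\d+$/.test(name)) return 'deleted';  // pruned sessions
  if (/-topic-[^.]+\.jsonl$/.test(name)) return 'topic';      // Telegram topics
  if (/\.jsonl$/.test(name)) return 'active';                 // live transcript
  return 'other';
}
```

Archived and deleted transcripts still contain indexable history, so a bulk indexer would probably want to include them rather than skip them.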

Transcript Structure (JSONL)

Each .jsonl file is structured as a tree with id + parentId relationships:

{"type":"session","version":3,"id":"<uuid>","timestamp":"...","cwd":"..."}
{"type":"model_change","id":"...","parentId":null,"provider":"anthropic","modelId":"claude-opus-4-6"}
{"type":"thinking_level_change","id":"...","parentId":"...","thinkingLevel":"medium"}
{"type":"custom","customType":"model-snapshot","data":{...},"id":"...","parentId":"..."}
{"type":"message","id":"...","parentId":"...","timestamp":"...","message":{"role":"user","content":[{"type":"text","text":"..."}]}}
{"type":"message","id":"...","parentId":"...","timestamp":"...","message":{"role":"assistant","content":[{"type":"text","text":"..."}]}}

Entry types relevant to Hippocampus:

Type                         Hippocampus Use                      Index Priority
message (role: user)         Primary query material               ⭐ HIGH
message (role: assistant)    Context + answers                    ⭐ HIGH
message (role: toolResult)   Factual data returned by tools       MEDIUM
custom_message               Extension-injected context           MEDIUM
compaction                   Summaries of older conversation      ⭐ HIGH (dense info)
session header               Session metadata (cwd, timestamp)    LOW (metadata only)
model_change                 Model routing history                LOW
custom (model-snapshot)      Not needed                           SKIP
thinking_level_change        Not needed                           SKIP

What's In the Messages

User messages include rich metadata injected by OpenClaw:

{
  "role": "user",
  "content": [{
    "type": "text",
    "text": "Conversation info (untrusted metadata):\n```json\n{\"sender\": \"Jeffrey Flynn\", \"conversation_label\": \"Jeff & Jules\", \"topic_id\": \"13\"}\n```\n\nActual user message here"
  }]
}

This means each message carries:

  • Sender identity (name, ID)
  • Conversation context (channel, group, topic)
  • Timestamp
  • The actual content after the metadata block

Hippocampus parsing strategy: Strip the metadata prefix to get clean content for semantic indexing, but preserve metadata for filtering (e.g., "what did Jeff say about X" vs "what did I say in the Ideas topic").
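A sketch of that split, assuming the metadata prefix always follows the exact shape shown in the example above (a fixed header line, a fenced JSON block, then the real message):

```javascript
// Split an OpenClaw-enriched user message into parsed metadata and clean content.
// The prefix format is inferred from the example above; treat it as an assumption.
const META_RE = /^Conversation info \(untrusted metadata\):\n```json\n([\s\S]*?)\n```\n+/;

function splitUserMessage(text) {
  const m = text.match(META_RE);
  if (!m) return { meta: null, content: text };  // no prefix: index as-is
  let meta = null;
  try { meta = JSON.parse(m[1]); } catch { /* malformed block: keep meta null */ }
  return { meta, content: text.slice(m[0].length) };
}
```

The clean `content` goes to the embedder; the parsed `meta` (sender, conversation_label, topic_id) becomes filterable chunk metadata.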

Session Store Metadata (sessions.json)

The session store maps sessionKey → SessionEntry with:

{
  sessionId: string,           // Links to transcript file
  updatedAt: number,           // Last activity timestamp
  chatType: "direct" | "group" | "room",
  provider: string,            // Channel (telegram, discord, etc.)
  subject?: string,            // Group/topic name
  displayName?: string,        // Human-readable session label
  inputTokens: number,         // Cumulative token usage
  outputTokens: number,
  totalTokens: number,
  contextTokens: number,
  compactionCount: number,     // How many times compaction ran
}

Hippocampus use: Session metadata enables filtering by channel, topic, recency, and conversation type. This is crucial for scoped retrieval ("what did we discuss in the Projects topic?").
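A sketch of that scoped lookup over the session store, assuming `sessions.json` is a plain key → SessionEntry object with the fields listed above:

```javascript
// Select sessions for scoped retrieval: filter by provider/chatType/recency,
// then rank newest-first. Field names follow the SessionEntry shape above.
function selectSessions(store, { provider, chatType, since } = {}) {
  return Object.entries(store)
    .map(([sessionKey, entry]) => ({ sessionKey, ...entry }))
    .filter(s => (!provider || s.provider === provider)
              && (!chatType || s.chatType === chatType)
              && (!since || s.updatedAt >= since))
    .sort((a, b) => b.updatedAt - a.updatedAt);
}
```

For example, `selectSessions(store, { provider: 'telegram', chatType: 'group' })` narrows a query like "what did we discuss in the Projects topic?" to the transcripts worth searching.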


Existing Index Infrastructure

memory_search (Already Operational)

Current config:

{
  memorySearch: {
    provider: "ollama",        // Local Nomic embeddings
    fallback: "gemini",        // Remote fallback
    model: "nomic-embed-text"
  }
}

This indexes: - MEMORY.md - memory/*.md

It does NOT index session transcripts by default.

Session Memory Search (Available but Not Enabled)

OpenClaw already has an experimental feature for indexing sessions:

agents: {
  defaults: {
    memorySearch: {
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"]
    }
  }
}

This is significant. OpenClaw already has the machinery to index JSONL transcripts into the same vector store used by memory_search. It's opt-in, debounced, and async.

QMD Backend (Not Enabled)

QMD offers a more sophisticated approach: BM25 + vectors + reranking. It can also index sessions:

memory: {
  backend: "qmd",
  qmd: {
    sessions: {
      enabled: true,
      retentionDays: 30,
      exportDir: "~/.openclaw/agents/<id>/qmd/sessions/"
    }
  }
}

QMD sanitizes transcripts (User/Assistant turns only) into a dedicated collection.


Data Access Paths for a Hippocampus Plugin

A Hippocampus plugin running in-process has three data access paths:

Path 1: Hook-Based Real-Time Stream (from Session 1)

message:preprocessed → Current turn content (fully enriched)
message:sent → Agent's response
before_prompt_build → Conversation history (messages array)

Pro: Real-time, no disk I/O, already parsed
Con: Only current session's messages, no cross-session history

Path 2: Direct JSONL File Reading

// Plugin reads transcript files directly.
// Note: fs does not expand '~', so build the path from os.homedir().
const os = require('os');
const path = require('path');
const fs = require('fs');

const transcriptPath = path.join(
  os.homedir(), '.openclaw', 'agents', agentId, 'sessions', `${sessionId}.jsonl`);
const messages = fs.readFileSync(transcriptPath, 'utf-8')
  .split('\n')
  .filter(l => l.trim())
  .map(l => JSON.parse(l))
  .filter(e => e.type === 'message');

Pro: Full historical access, all sessions, all conversations
Con: Disk I/O, needs parsing, can be large (36 MB+ and growing)

Path 3: Piggyback on Existing Memory Infrastructure

If experimental.sessionMemory or QMD session indexing is enabled, the Hippocampus plugin could query the already-indexed session data via the same vector store.

Pro: No duplicate indexing, leverages existing infrastructure
Con: Dependent on user enabling the feature, less control over indexing strategy


Cross-Session Data: The Key Differentiator

The critical gap that Hippocampus fills is cross-session retrieval. Here's what currently exists:

System                    Current Session              Past Sessions (same key)     Past Sessions (different key)
Model context window      ✅                           ❌ (compacted away)          ❌
memory_search (files)     ✅ (if written to memory)    ✅ (if written to memory)    ✅ (if written to memory)
memory_search (sessions)  ❌ (experimental)            ✅ (if enabled)              ✅ (if enabled)
QMD sessions              ❌                           ✅ (if enabled)              ✅ (if enabled)
Hippocampus               ✅                           ✅                           ✅

The key insight: memory_search only finds things the agent explicitly wrote down. Hippocampus indexes everything — including the vast majority of conversation content that never gets written to memory/*.md.

The Memory Gap Problem

Consider this flow:

  1. Monday: Jeff and Jules discuss a vendor's API pricing ($0.02/1K tokens)
  2. Tuesday: Session resets. That pricing detail was in conversation but never written to memory
  3. Wednesday: Jeff asks "what was that vendor's pricing?"
  4. Current state: Lost. memory_search finds nothing. The JSONL transcript has it, but nothing queries it.

Hippocampus closes this gap by indexing the transcripts and surfacing relevant context automatically.


Indexing Strategy for Hippocampus

Given the data landscape, here's the optimal indexing approach:

What to Index

  1. All message entries with role user or assistant — the conversational content
  2. compaction entries — these are LLM-generated summaries, extremely information-dense
  3. Strip metadata prefixes from user messages (conversation info blocks)
  4. Skip tool calls, tool results (usually too noisy), model changes, thinking levels
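The selection rules above reduce to a small predicate over transcript entries. A sketch, using the entry types listed in the table earlier:

```javascript
// Decide whether a transcript entry is worth indexing, per the rules above.
function shouldIndex(entry) {
  if (entry.type === 'compaction') return true;       // LLM summaries: dense info
  if (entry.type !== 'message') return false;         // model/thinking changes, snapshots
  const role = entry.message && entry.message.role;
  return role === 'user' || role === 'assistant';     // skip toolResult noise
}
```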

How to Chunk

JSONL transcripts are naturally chunked by turn. Options:

  1. Per-turn chunks — each message is one chunk. Simple, but short turns may lack context.
  2. Turn-pair chunks — user + assistant paired together. Better semantic coherence.
  3. Sliding window — N turns with overlap. Best retrieval but more storage.

Recommendation: Turn-pair chunks (user question + assistant response) as the primary unit. This gives semantic completeness — the question provides the query surface, the answer provides the information surface.
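A sketch of the pairing pass, assuming messages arrive in conversation order as simple `{ role, text }` records (the real entries would first be flattened from the JSONL tree):

```javascript
// Pair each user message with the assistant reply that follows it.
// Unpaired messages fall back to single-turn chunks.
function buildTurnPairs(messages) {
  const chunks = [];
  for (let i = 0; i < messages.length; i++) {
    const cur = messages[i];
    const next = messages[i + 1];
    if (cur.role === 'user' && next && next.role === 'assistant') {
      chunks.push({ chunkType: 'turn-pair', text: cur.text + '\n\n' + next.text });
      i++; // consume the paired assistant turn
    } else {
      chunks.push({ chunkType: 'single-turn', text: cur.text });
    }
  }
  return chunks;
}
```

Consecutive assistant turns (e.g. follow-up messages) become single-turn chunks rather than being glued to the wrong question.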

Chunk Metadata

Each chunk should carry:

{
  sessionKey: string,      // Which conversation
  sessionId: string,       // Which transcript
  timestamp: number,       // When it happened
  channelId: string,       // telegram, discord, etc.
  chatType: string,        // direct, group, topic
  topicId?: string,        // Telegram topic ID
  senderId?: string,       // Who said it
  chunkType: "turn-pair" | "compaction" | "single-turn",
}

This metadata enables filtered retrieval: "what did we discuss in the Ideas topic this week?"
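A sketch of that pre-filter, narrowing candidate chunks by metadata before any semantic ranking runs (field names follow the chunk schema above):

```javascript
// Scope candidate chunks by metadata: e.g. "Ideas topic, this week".
// Runs before semantic ranking so embeddings only compare in-scope chunks.
function scopeChunks(chunks, { topicId, afterTs } = {}) {
  return chunks.filter(c =>
    (!topicId || c.topicId === topicId) &&
    (!afterTs || c.timestamp >= afterTs));
}
```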


Size Estimation

Current data:

  • 174 transcripts, ~36 MB total
  • Average transcript: ~207 KB
  • Estimated ~50-100 turn-pairs per transcript
  • Estimated total: ~10,000-17,000 chunks

With nomic-embed-text (768 dimensions, float32):

  • Per chunk: 768 × 4 = 3,072 bytes
  • 17,000 chunks × 3,072 bytes = ~52 MB for embeddings
  • Plus chunk text storage: ~36 MB
  • Total index size: ~88 MB — very manageable for local SQLite

Growth rate: ~1-5 MB/day of new transcripts → roughly 1.5-7 MB/day of new embeddings at the chunk density above (embeddings run ~1.4× raw text in this estimate). Still not a concern for local storage.


Coexistence with memory_search

This is a critical design question. Options:

Option 1: Replace memory_search

Bad idea. memory_search serves curated, high-signal memory. Hippocampus serves broad conversational recall. Different purposes.

Option 2: Run in parallel (recommended)

Hippocampus runs alongside memory_search. The before_prompt_build hook injects Hippocampus results as prependContext, while memory_search continues to work via the agent's memory_search tool calls.

Key difference:

  • memory_search = agent-initiated ("let me look this up")
  • Hippocampus = system-initiated ("you might need this")

Option 3: Unify

Merge both into one retrieval system. More complex, diminishing returns for v1.


Open Questions for Next Sessions

  1. Prototype: What's the simplest thing that proves turn-pair indexing + before_prompt_build injection works? (Session 3)
  2. Algorithm: How does the plugin decide what's relevant? Keyword match? Semantic similarity threshold? Topic detection? (Session 4)
  3. Index tech: SQLite FTS5 vs vector embeddings vs hybrid for the index? (Session 5)

Session 2 Verdict

The data is there and it's excellent. Every conversation is persisted as structured JSONL with rich metadata. The Hippocampus plugin has full filesystem access to:

  • 174+ transcript files (~36 MB)
  • Turn-by-turn message content with sender/channel/topic metadata
  • Compaction summaries (information-dense)
  • Session store for filtering and discovery

The architecture from Session 1 + the data from Session 2 = a complete picture:

Real-time stream (hooks)       → Index new turns as they happen
Historical transcripts (JSONL) → Bulk index on first run
Session store metadata         → Filter/scope retrieval
before_prompt_build            → Inject relevant context

The biggest insight: OpenClaw already has the building blocks (experimental session memory, QMD session indexing). Hippocampus could either build on those or replace them with a more purpose-built retrieval system. The key differentiator is the automatic injection — memory_search requires the agent to think "I should search for this," while Hippocampus just does it.