
QA Report — Adaptive Context Injection (ACI)

Reviewer: Quinn
Date: 2026-03-21
Spec version: Draft (Forge, 2026-03-21)
Status: BLOCKING ISSUES FOUND — Not ready for implementation


Summary

5 blockers, 3 major findings, 3 minor findings. The hook infrastructure exists and is broadly sound, but the spec has a real API mismatch, an internal token contradiction in acceptance criteria, a critical question about whether MEMORY.md is even bulk-injected by default, and several testability gaps that need shoring up before Melody starts any build work.


F-01 — agent:bootstrap hook API shape is incorrect in spec

Severity: BLOCKER

The spec's BootstrapHookContext interface (Section 10) is wrong. Per actual OpenClaw docs, the event structure is:

event.sessionKey        // top-level, NOT inside context
event.context.bootstrapFiles   // nested under context
event.context.workspaceDir
event.context.cfg

The spec defines BootstrapHookContext as a flat object with sessionKey, bootstrapFiles, workspaceDir, and cfg at the same level. In the real shape, sessionKey sits at the event's top level, while bootstrapFiles, workspaceDir, and cfg are nested under event.context.

The spec's classification code classifySession(sessionKey: string) is correct in concept, but the hook handler would need to read event.sessionKey, not event.context.sessionKey.

Recommendation: Correct the TypeScript contracts in Section 10 before Melody writes handler.ts, or rely on Task 1 (Atlas API verification) to confirm the exact shape. Either way, do not let Melody build until Task 1 completes and the spec is updated.
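
For illustration, the event shape described in this finding could be written down and checked at runtime roughly as follows. This is a sketch, not the real OpenClaw contract: the names BootstrapEvent, isBootstrapEvent, and sessionKeyOf are hypothetical, and the element type of bootstrapFiles is left opaque because its shape is not documented.

```typescript
// Assumption: field names mirror the shape quoted from the docs in this finding.
// The element type of bootstrapFiles is undocumented, so it stays opaque.
interface BootstrapEvent {
  sessionKey: string; // top-level, not inside context
  context: {
    bootstrapFiles: unknown[];
    workspaceDir: string;
    cfg: unknown;
  };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Runtime guard: true only when sessionKey is at the top level and
// bootstrapFiles, workspaceDir, and cfg are nested under context.
function isBootstrapEvent(value: unknown): value is BootstrapEvent {
  if (!isRecord(value) || typeof value["sessionKey"] !== "string") return false;
  const ctx = value["context"];
  return (
    isRecord(ctx) &&
    Array.isArray(ctx["bootstrapFiles"]) &&
    typeof ctx["workspaceDir"] === "string" &&
    "cfg" in ctx
  );
}

// Hypothetical accessor: the handler reads the key from the event itself.
function sessionKeyOf(event: BootstrapEvent): string {
  return event.sessionKey;
}
```

A guard like this could be logged once during Task 1 to confirm the shape empirically: a flat object in the spec's current style fails the check, while the nested shape passes.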


F-02 — MEMORY.md may NOT be in the default bootstrap list

Severity: BLOCKER

The spec's entire Phase 1 rationale rests on removing MEMORY.md from bulk injection. But per OpenClaw's context docs, the default bootstrap files are:

AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md

MEMORY.md is not listed as a default-injected file. This is either:
- a) MEMORY.md is already on-demand (making Phase 1 a no-op), or
- b) MEMORY.md is injected via the existing workspace config/hook (not a default), or
- c) The OpenClaw docs example is incomplete

If (a), the "highest-leverage change in the spec" doesn't need to happen. This needs to be verified with /context list on a real session before Phase 1 is designed around removing it.

Recommendation: Run /context list and confirm whether MEMORY.md appears before writing a single line of hook code for Phase 1.
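
One way to make that check repeatable is to scan captured /context list text for file names and compare them against the documented defaults. The sketch below makes no assumption about the real output layout: it only looks for *.md names anywhere in the text. The helper names markdownNamesIn and nonDefaultInjected are hypothetical.

```typescript
// Default bootstrap basenames as listed in this finding.
const DEFAULT_BOOTSTRAP: string[] = [
  "AGENTS.md", "SOUL.md", "TOOLS.md", "IDENTITY.md",
  "USER.md", "HEARTBEAT.md", "BOOTSTRAP.md",
];

// Pull every distinct *.md basename out of captured `/context list` text.
// Format-agnostic on purpose: the real output layout is not assumed.
function markdownNamesIn(captured: string): string[] {
  const matches: string[] = captured.match(/[A-Za-z0-9_-]+\.md\b/g) ?? [];
  // Keep the first occurrence of each name, in order of appearance.
  return matches.filter((name, i) => matches.indexOf(name) === i);
}

// Which injected files are NOT among the documented defaults?
function nonDefaultInjected(captured: string): string[] {
  return markdownNamesIn(captured).filter(
    (name) => DEFAULT_BOOTSTRAP.indexOf(name) === -1,
  );
}
```

If MEMORY.md shows up in the result, it is being injected by something other than the documented defaults, which points at (b) or (c). If it is absent from the captured text entirely, (a) is the likely case.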


F-03 — before_prompt_build is a plugin hook, not a standalone hook

Severity: BLOCKER

The spec treats before_prompt_build as peer to agent:bootstrap — a hook file you can drop in workspace/hooks/. It is not. It requires:

  1. A full plugin with openclaw.plugin.json (required)
  2. Registration via api.on('before_prompt_build', ...) inside a plugin's register(api) function
  3. Enablement via plugins.entries.<id>.enabled

This is meaningfully more complex than the hook model. The spec correctly defers shedding to v2 and implements it via behavioral instructions in SOUL.md instead — that's fine. But the spec also references before_prompt_build in Section 3 (Layer 3) and Section 6 (Integration Point 2) as if it's a "plugin hook" that can be added easily later. The entry cost for v2 shedding is higher than described. Section 6's characterization of "requires building a minimal plugin (not just a hook)" is correct but undersells the delta — a plugin requires a manifest, must be explicitly allowed in config, and cannot be workspace-dropped like a hook file.

Recommendation: Flag in the spec that v2 Layer 3 shedding requires full plugin development including manifest authoring. Update the complexity estimate (currently ⭐⭐ in the task table for hook work — plugin work is ⭐⭐⭐).
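
To make the delta concrete, here is a minimal sketch of the registration pattern described above. The PluginApi interface is a stand-in invented for illustration, modeling only the on(...) call this report mentions; the handler payload type is unknown and left as such. The manifest (openclaw.plugin.json) and the plugins.entries.<id>.enabled config entry are still required and are not shown.

```typescript
// Assumption: a minimal stand-in for the plugin API. Only `on` is modeled,
// and the handler payload is typed `unknown` because its shape is not known here.
interface PluginApi {
  on(event: string, handler: (payload: unknown) => void): void;
}

// Plugin entry point, following the register(api) pattern described above.
// A real plugin also needs a manifest and explicit enablement in config.
function register(api: PluginApi): void {
  api.on("before_prompt_build", (_payload: unknown) => {
    // v2 shedding logic would go here; intentionally left empty.
  });
}

// Test double that records which events a plugin subscribes to.
function recordingApi(): { api: PluginApi; events: string[] } {
  const events: string[] = [];
  const api: PluginApi = {
    on: (event) => {
      events.push(event);
    },
  };
  return { api, events };
}
```

Even in this stripped-down form, the handler lives inside a plugin entry point rather than a droppable hook file, which is the extra cost the complexity estimate should reflect.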


F-04 — AC-1 token target is contradicted by the file set for Ops/Initiatives/Projects topics

Severity: BLOCKER

AC-1 states: "total injected bootstrap tokens are ≤ 400 tokens" for Telegram forum topic sessions.

But the topic override table loads WISDOM.md for topics 15 (Ops), 77 (Initiatives), and 79 (Projects). That file set is: SOUL.md (~80) + TOOLS_COMPACT.md (~200) + WISDOM.md (~250) = ~530 tokens, which exceeds the 400-token AC-1 threshold.

AC-1 says "no more than 3 files" — this passes (3 files). But the token ceiling fails for 3 of 8 configured topics.

Options:
- Raise AC-1 threshold to ≤ 600 tokens (covers all topic variants)
- Split AC-1 into two sub-criteria: one for minimal topics (Ideas, SigInt, Research, APA, BioThread) and one for enriched topics (Ops, Initiatives, Projects)
- Reduce WISDOM.md target size to get under 400 total

Recommendation: Fix before Quinn can validate AC-1. Current target is internally contradictory.
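
The arithmetic behind this finding, using only the approximate per-file estimates quoted above, can be checked with a short sketch. The names ESTIMATED_TOKENS, totalTokens, and wisdomBudget are hypothetical, and the numbers are the spec's estimates, not measurements.

```typescript
// Approximate per-file token estimates quoted in this finding.
const ESTIMATED_TOKENS: Record<string, number> = {
  "SOUL.md": 80,
  "TOOLS_COMPACT.md": 200,
  "WISDOM.md": 250,
};

// Sum the estimates for a file set; unknown files throw rather than count as 0.
function totalTokens(files: string[]): number {
  return files.reduce((sum, name) => {
    const tokens = ESTIMATED_TOKENS[name];
    if (tokens === undefined) throw new Error(`no estimate for ${name}`);
    return sum + tokens;
  }, 0);
}

// File set loaded for the Ops, Initiatives, and Projects topics.
const enriched = ["SOUL.md", "TOOLS_COMPACT.md", "WISDOM.md"];
const total = totalTokens(enriched); // 80 + 200 + 250 = 530

const passesAt400 = total <= 400; // false: current AC-1 ceiling
const passesAt600 = total <= 600; // true: raised ceiling

// Largest WISDOM.md that still fits under 400 with the other two unchanged.
const wisdomBudget = 400 - totalTokens(["SOUL.md", "TOOLS_COMPACT.md"]); // 120
```

On these estimates, keeping the 400-token ceiling would mean cutting WISDOM.md from roughly 250 tokens to about 120, which is worth weighing against simply raising or splitting the criterion.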


F-05 — WISDOM.md Layer placement (already flagged by Jeff)

Severity: BLOCKER (acknowledged, fix in progress)

WISDOM.md as Layer 1 conditional is architecturally wrong. Judgment must be available before the agent decides what to load. WISDOM.md belongs at Layer 0 alongside SOUL.md (or folded into it), not conditionally injected. Not analyzing further — fix is in progress.


F-06 — bootstrap-extra-files constraint may limit custom file injection

Severity: MAJOR

The bundled bootstrap-extra-files hook states: "Only recognized bootstrap basenames are loaded" and "Subagent allowlist is preserved (AGENTS.md and TOOLS.md only)."

This constraint is on the bundled hook, not necessarily on the core injection pipeline. But it raises a real question: does OpenClaw's core filter WorkspaceBootstrapFile entries by basename before rendering them into the system prompt? If yes, injecting WISDOM.md, TOOLS_COMPACT.md, or HEARTBEAT.md via a custom hook would silently fail or be stripped.

Task 1 (Atlas API verification) should explicitly test injecting a non-standard filename and confirm it appears in /context list.

Recommendation: Add this to Task 1's test checklist explicitly. If the core enforces a basename allowlist, the entire conditional injection strategy needs rethinking.
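
A small comparison helper could turn that checklist item into a pass/fail result: given the files the hook asked to inject and the files actually observed in the context, report which ones went missing. This is a sketch with hypothetical names (strippedFiles, basename); how the observed list is obtained from /context list is left to Task 1.

```typescript
// Reduce a path to its basename so "workspace/WISDOM.md" and "WISDOM.md" compare equal.
function basename(path: string): string {
  const parts = path.split(/[\\/]/);
  return parts[parts.length - 1] ?? path;
}

// Files the hook asked to inject that never showed up in the observed context.
// An empty result means nothing was silently stripped.
function strippedFiles(requested: string[], observed: string[]): string[] {
  const seen = observed.map(basename);
  return requested.filter((file) => seen.indexOf(basename(file)) === -1);
}
```

For example, if a hook requests SOUL.md, WISDOM.md, and TOOLS_COMPACT.md and only SOUL.md is observed, the helper returns the other two, which would be direct evidence of a core-level basename filter.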


F-07 — AC-3 and AC-9 are non-deterministic behavioral tests

Severity: MAJOR

AC-3 ("Jules reads MEMORY.md on demand") and AC-9 ("Jules reads missing files when needed") depend on LLM behavior. There's no reliable pass/fail without:
- A defined test prompt
- Minimum pass rate (e.g., 3/3 or 4/5 attempts)
- Observable criterion (read tool call appears in session transcript)

As written, both criteria could pass on lucky runs and fail on others. A single test interaction is not sufficient evidence.

Recommendation: Specify a test protocol: 3 standard prompts that should trigger MEMORY.md retrieval, minimum 2/3 pass rate required, evidence = read tool call to MEMORY.md visible in session logs.
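
The pass-rate rule in that protocol is simple enough to encode, which keeps the verdict from depending on who reads the logs. The sketch below assumes each trial has already been reduced to a boolean (was a read of MEMORY.md observed in the session log or not); the Trial and Verdict types and the evaluate function are hypothetical.

```typescript
// One trial: a standard prompt and whether a read of MEMORY.md was observed
// in the session log. Extracting that observation is out of scope here.
interface Trial {
  prompt: string;
  readObserved: boolean;
}

interface Verdict {
  passed: number;
  total: number;
  ok: boolean;
}

// Apply a minimum-pass-count rule (for example 2 of 3) to a set of trials.
function evaluate(trials: Trial[], minPasses: number): Verdict {
  const passed = trials.filter((t) => t.readObserved).length;
  return { passed, total: trials.length, ok: passed >= minPasses };
}
```

With three trials and minPasses set to 2, two observed reads pass the criterion and one does not, matching the 2/3 threshold recommended above.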


F-08 — Phase 0 has no rollback instruction for SOUL.md modification

Severity: MAJOR

Phase 0 compresses SOUL.md. The spec says "Jeff reviews before any injection changes go live", but Task 6 promotes the draft to the live SOUL.md. If Jeff rejects the compression, there is no rollback instruction for SOUL.md, only for the hook (Phase 1-2). A bad SOUL.md compression degrades Jules immediately on the next session start, before any hook changes.

Recommendation: Require a SOUL.md backup (SOUL.md.bak) before Task 6 promotes the compressed version. Add explicit rollback instruction: cp SOUL.md.bak SOUL.md. This should be a hard gate in the Phase 0 checklist.


F-09 — LESSONS_LEARNED.md referenced but not confirmed to exist

Severity: MINOR

Task 3 (wisdom compression) lists LESSONS_LEARNED.md (Forge's file) as an input source. This file is not confirmed to exist in the workspace. If missing, the script will either error or silently produce lower-quality output.

Recommendation: Add a pre-check in compress-wisdom.sh that lists available sources and logs warnings for any missing files before proceeding.
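
The logic of such a pre-check is sketched below. It is written in TypeScript rather than shell, so it illustrates the behavior only and would need porting into compress-wisdom.sh. The checkSources name and the warning text are hypothetical, and the existence test is passed in as a function so the sketch does not depend on any particular filesystem API.

```typescript
interface SourceCheck {
  present: string[];
  warnings: string[];
}

// Split expected inputs into present ones and warnings for missing ones.
// `exists` is injected so the check can wrap any filesystem call (or a fake in tests).
function checkSources(
  expected: string[],
  exists: (path: string) => boolean,
): SourceCheck {
  const present = expected.filter((path) => exists(path));
  const warnings = expected
    .filter((path) => !exists(path))
    .map((path) => `WARN: source not found: ${path}`);
  return { present, warnings };
}
```

Logging the warnings before compression starts makes a missing LESSONS_LEARNED.md visible up front instead of showing up later as unexplained lower-quality output.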


F-10 — No pre-migration baseline required

Severity: MINOR

AC-1 and AC-2 measure token reduction. Without a documented pre-migration baseline from /context list, there's no reference point to measure "reduction" from. The spec's Appendix A provides approximate baseline numbers, but these are from docs examples, not from the actual running system.

Recommendation: Add to Phase 0 checklist: capture /context list output and save to initiatives/adaptive-context-injection/baseline-context.txt before any files are modified.


F-11 — Task dependency chain partially implicit

Severity: MINOR

Phase 2 depends on files created in Phase 0 (WISDOM.md, TOOLS_COMPACT.md, compressed SOUL.md). The Task dependency table captures some of this (Task 7 depends on Tasks 1 + 5), but the Phase description doesn't explicitly state "Phase 2 requires Phase 0 complete." If someone runs Phase 2 before Phase 0, the hook would reference files that don't exist yet, causing silent injection failures.

Recommendation: Add explicit "Requires: Phase 0 complete" to the Phase 2 description.


Acceptance Criteria Testability Summary

AC    | Testable?    | Issues
------|--------------|-------------------------------------------------------------
AC-1  | Partially    | Token target contradicted by Ops/Init/Projects topics (F-04)
AC-2  | Yes          | Testable via /context list
AC-3  | Weakly       | Non-deterministic; needs test protocol (F-07)
AC-4  | Yes          | Testable via /context list in subagent session
AC-5  | Yes (manual) | Requires triggering each session type; no test harness specified
AC-6  | Yes          | Token count script, content audit
AC-7  | Partially    | "Covers ≥3 raw lessons" requires manual judgment; no automation path
AC-8  | Yes          | Disable hook, check context list
AC-9  | Weakly       | Non-deterministic; needs test protocol (F-07)

Hook Infrastructure — Confirmed

For the record:

  • agent:bootstrap: ✅ Exists. Fires before workspace file injection. Handler receives event.sessionKey (top-level) and event.context.bootstrapFiles (mutable array). This is the right integration point for Layer 0/1 control.
  • before_prompt_build: ✅ Exists — but as a plugin lifecycle hook (api.on('before_prompt_build', ...)), not a standalone hook file. Requires full plugin with openclaw.plugin.json.
  • WorkspaceBootstrapFile exact type shape: ❓ Not documented in public docs. Must be verified empirically (Task 1).

Blockers Before Implementation Can Start

  1. Verify MEMORY.md bootstrap status with /context list (F-02)
  2. Correct the BootstrapHookContext TypeScript interface (F-01) — after Task 1 Atlas verification
  3. Fix AC-1 token threshold contradiction (F-04)
  4. Resolve WISDOM.md Layer placement (F-05) — fix in progress per Jeff
  5. Confirm or deny WorkspaceBootstrapFile basename allowlist enforcement (F-06) — add to Task 1

Quinn out.