QA Report — Adaptive Context Injection (ACI)
Reviewer: Quinn
Date: 2026-03-21
Spec version: Draft (Forge, 2026-03-21)
Status: BLOCKING ISSUES FOUND — Not ready for implementation
Summary
5 blockers, 3 major findings, 3 minor findings. The hook infrastructure exists and is broadly sound, but the spec has a real API mismatch, an internal token contradiction in acceptance criteria, a critical question about whether MEMORY.md is even bulk-injected by default, and several testability gaps that need shoring up before Melody starts any build work.
F-01 — agent:bootstrap hook API shape is incorrect in spec
Severity: BLOCKER
The spec's BootstrapHookContext interface (Section 10) is wrong. Per actual OpenClaw docs, the event structure is:
event.sessionKey // top-level, NOT inside context
event.context.bootstrapFiles // nested under context
event.context.workspaceDir
event.context.cfg
The spec defines BootstrapHookContext as a flat object with sessionKey, bootstrapFiles, workspaceDir, and cfg at the same level. This is not the real shape. The real shape has sessionKey at event top-level and bootstrapFiles nested under event.context.
The spec's classification code classifySession(sessionKey: string) is correct in concept, but the hook handler would need to read event.sessionKey, not event.context.sessionKey.
Recommendation: Correct the TypeScript contracts in Section 10 before Melody writes handler.ts. Alternatively, rely on Task 1 (Atlas API verification) to confirm the exact shape, but do not let Melody build until Task 1 completes and the spec is updated.
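For reference while Section 10 is being corrected, here is a minimal sketch of the shape described above. Only the four field paths come from the docs. The AgentBootstrapEvent name, the name field on WorkspaceBootstrapFile, and the readBootstrapInputs helper are placeholders of mine and may not match what Task 1 finds.

```typescript
// Placeholder type: the real WorkspaceBootstrapFile shape is not publicly
// documented, so `name` is an assumption to be verified in Task 1.
interface WorkspaceBootstrapFile {
  name: string;
}

// Assumed event shape, built from the four field paths listed in this finding.
interface AgentBootstrapEvent {
  sessionKey: string; // top-level, NOT inside context
  context: {
    bootstrapFiles: WorkspaceBootstrapFile[];
    workspaceDir: string;
    cfg: unknown;
  };
}

// Hypothetical helper showing which paths a handler would read.
function readBootstrapInputs(event: AgentBootstrapEvent): {
  sessionKey: string;
  fileNames: string[];
} {
  return {
    sessionKey: event.sessionKey,
    fileNames: event.context.bootstrapFiles.map((file) => file.name),
  };
}
```

A handler written against the spec's flat interface would look for bootstrapFiles at the top level and find nothing there, which is why this needs fixing before handler.ts exists.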
F-02 — MEMORY.md may NOT be in the default bootstrap list
Severity: BLOCKER
The spec's entire Phase 1 rationale rests on removing MEMORY.md from bulk injection. But per OpenClaw's context docs, the default bootstrap files are:
AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md
MEMORY.md is not listed as a default-injected file. There are three possible explanations:
- a) MEMORY.md is already on-demand (making Phase 1 a no-op), or
- b) MEMORY.md is injected via the existing workspace config/hook (not a default), or
- c) The OpenClaw docs example is incomplete
If (a), the "highest-leverage change in the spec" doesn't need to happen. This needs to be verified with /context list on a real session before Phase 1 is designed around removing it.
Recommendation: Run /context list and confirm whether MEMORY.md appears before writing a single line of hook code for Phase 1.
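If the check gets scripted rather than eyeballed, something like the following would do. This is a sketch only: I am assuming the /context list output can be captured as plain text with file basenames separated by whitespace or simple punctuation, and the listsFile helper is a hypothetical name.

```typescript
// Returns true when `basename` appears as a standalone token in captured
// /context list output. The output format is an assumption, not taken
// from the OpenClaw docs.
function listsFile(contextListOutput: string, basename: string): boolean {
  const tokens = contextListOutput.split(/[\s,;:()|]+/);
  return tokens.some((token) => token === basename);
}
```

Matching whole tokens rather than substrings avoids a false positive from a differently named file that merely ends in MEMORY.md.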
F-03 — before_prompt_build is a plugin hook, not a standalone hook
Severity: BLOCKER
The spec treats before_prompt_build as peer to agent:bootstrap — a hook file you can drop in workspace/hooks/. It is not. It requires:
- A full plugin with openclaw.plugin.json (required)
- Registration via api.on('before_prompt_build', ...) inside a plugin's register(api) function
- Enablement via plugins.entries.<id>.enabled
This is meaningfully more complex than the hook model. The spec correctly defers shedding to v2 and implements it via behavioral instructions in SOUL.md instead — that's fine. But the spec also references before_prompt_build in Section 3 (Layer 3) and Section 6 (Integration Point 2) as if it's a "plugin hook" that can be added easily later. The entry cost for v2 shedding is higher than described. Section 6's characterization of "requires building a minimal plugin (not just a hook)" is correct but undersells the delta — a plugin requires a manifest, must be explicitly allowed in config, and cannot be workspace-dropped like a hook file.
Recommendation: Flag in the spec that v2 Layer 3 shedding requires full plugin development including manifest authoring. Update the complexity estimate (currently ⭐⭐ in the task table for hook work — plugin work is ⭐⭐⭐).
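To make the delta concrete, here is a rough sketch of the registration step alone. The PluginApi interface and the handler signature are stand-ins I made up for illustration, and createRecordingApi is a hypothetical test double. How a real plugin exports register, and what the event carries, would need to come from the OpenClaw plugin docs.

```typescript
// Minimal stand-in for the plugin API. Only `on` is modeled, and the
// handler signature is an assumption.
type PromptBuildHandler = (event: unknown) => void;

interface PluginApi {
  on(eventName: string, handler: PromptBuildHandler): void;
}

// Sketch of a plugin entry point. A real plugin also needs an
// openclaw.plugin.json manifest and plugins.entries.<id>.enabled in config.
function register(api: PluginApi): void {
  api.on("before_prompt_build", (_event: unknown) => {
    // v2 Layer 3 shedding logic would go here.
  });
}

// Fake API that records registrations, for checking the wiring locally.
function createRecordingApi(): { api: PluginApi; registered: string[] } {
  const registered: string[] = [];
  const api: PluginApi = {
    on(eventName: string, _handler: PromptBuildHandler): void {
      registered.push(eventName);
    },
  };
  return { api, registered };
}
```

Even this stripped-down version shows three moving parts (entry point, manifest, config enablement) where a workspace hook has one.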
F-04 — AC-1 token target is contradicted by the file set for Ops/Initiatives/Projects topics
Severity: BLOCKER
AC-1 states: "total injected bootstrap tokens are ≤ 400 tokens" for Telegram forum topic sessions.
But the topic override table loads WISDOM.md for topics 15 (Ops), 77 (Initiatives), and 79 (Projects). That file set is: SOUL.md (~80) + TOOLS_COMPACT.md (~200) + WISDOM.md (~250) = ~530 tokens, which exceeds the 400-token AC-1 threshold.
AC-1 says "no more than 3 files" — this passes (3 files). But the token ceiling fails for 3 of 8 configured topics.
Options:
- Raise AC-1 threshold to ≤ 600 tokens (covers all topic variants)
- Split AC-1 into two sub-criteria: one for minimal topics (Ideas, SigInt, Research, APA, BioThread) and one for enriched topics (Ops, Initiatives, Projects)
- Reduce WISDOM.md target size to get under 400 total
Recommendation: Fix before Quinn can validate AC-1. Current target is internally contradictory.
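The arithmetic above can be captured as a small check. The token figures are the approximate ones quoted in this finding. The checkAc1 function and constant names are hypothetical, and a real validation would use measured counts rather than estimates.

```typescript
// Approximate token sizes quoted in this finding.
const TOKEN_ESTIMATES: Record<string, number> = {
  "SOUL.md": 80,
  "TOOLS_COMPACT.md": 200,
  "WISDOM.md": 250,
};

const AC1_TOKEN_CEILING = 400;
const AC1_MAX_FILES = 3;

interface BudgetResult {
  totalTokens: number;
  fileCount: number;
  withinTokenCeiling: boolean;
  withinFileLimit: boolean;
}

// Sums the estimates for a topic's file set and checks both AC-1 limits.
function checkAc1(files: string[]): BudgetResult {
  let totalTokens = 0;
  for (const file of files) {
    const estimate = TOKEN_ESTIMATES[file];
    if (estimate === undefined) {
      throw new Error(`No token estimate for ${file}`);
    }
    totalTokens += estimate;
  }
  return {
    totalTokens,
    fileCount: files.length,
    withinTokenCeiling: totalTokens <= AC1_TOKEN_CEILING,
    withinFileLimit: files.length <= AC1_MAX_FILES,
  };
}

// File set for topics 15, 77, and 79.
const enriched = checkAc1(["SOUL.md", "TOOLS_COMPACT.md", "WISDOM.md"]);
// enriched.totalTokens is 530: the file limit passes, the token ceiling fails.
```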
F-05 — WISDOM.md Layer placement (already flagged by Jeff)
Severity: BLOCKER (acknowledged, fix in progress)
WISDOM.md as Layer 1 conditional is architecturally wrong. Judgment must be available before the agent decides what to load. WISDOM.md belongs at Layer 0 alongside SOUL.md (or folded into it), not conditionally injected. Not analyzing further — fix is in progress.
F-06 — bootstrap-extra-files constraint may limit custom file injection
Severity: MAJOR
The bundled bootstrap-extra-files hook states: "Only recognized bootstrap basenames are loaded" and "Subagent allowlist is preserved (AGENTS.md and TOOLS.md only)."
This constraint is on the bundled hook, not necessarily on the core injection pipeline. But it raises a real question: does OpenClaw's core filter WorkspaceBootstrapFile entries by basename before rendering them into the system prompt? If yes, injecting WISDOM.md, TOOLS_COMPACT.md, or HEARTBEAT.md via a custom hook would silently fail or be stripped.
Task 1 (Atlas API verification) should explicitly test injecting a non-standard filename and confirm it appears in /context list.
Recommendation: Add this to Task 1's test checklist explicitly. If the core enforces a basename allowlist, the entire conditional injection strategy needs rethinking.
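To illustrate the failure mode Task 1 needs to rule out, here is a sketch of what a core-level basename allowlist would do to the planned file set. This models a hypothesis, not confirmed OpenClaw behavior. The list of basenames is the default set quoted in F-02, and applyBasenameAllowlist is a hypothetical name.

```typescript
// Default bootstrap basenames as quoted from the context docs in F-02.
const RECOGNIZED_BASENAMES: string[] = [
  "AGENTS.md",
  "SOUL.md",
  "TOOLS.md",
  "IDENTITY.md",
  "USER.md",
  "HEARTBEAT.md",
  "BOOTSTRAP.md",
];

// Hypothetical core behavior: keep only recognized basenames and drop
// everything else without raising an error.
function applyBasenameAllowlist(requested: string[]): {
  kept: string[];
  stripped: string[];
} {
  const kept: string[] = [];
  const stripped: string[] = [];
  for (const name of requested) {
    if (RECOGNIZED_BASENAMES.indexOf(name) !== -1) {
      kept.push(name);
    } else {
      stripped.push(name);
    }
  }
  return { kept, stripped };
}
```

Under this hypothesis, a request for SOUL.md, TOOLS_COMPACT.md, and WISDOM.md would come back with only SOUL.md, while HEARTBEAT.md would survive because it is already in the default list. The Task 1 test should therefore use a filename that is definitely outside that list.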
F-07 — AC-3 and AC-9 are non-deterministic behavioral tests
Severity: MAJOR
AC-3 ("Jules reads MEMORY.md on demand") and AC-9 ("Jules reads missing files when needed") depend on LLM behavior. There's no reliable pass/fail without:
- A defined test prompt
- Minimum pass rate (e.g., 3/3 or 4/5 attempts)
- Observable criterion (read tool call appears in session transcript)
As written, both criteria could pass on lucky runs and fail on others. A single test interaction is not sufficient evidence.
Recommendation: Specify a test protocol: 3 standard prompts that should trigger MEMORY.md retrieval, minimum 2/3 pass rate required, evidence = read tool call to MEMORY.md visible in session logs.
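The recommended protocol reduces to a simple verdict function. The sketch below assumes each trial has already been reduced to a boolean by inspecting the session log. TrialResult and evaluateProtocol are hypothetical names, and the defaults mirror the 3-prompt, 2-of-3 numbers above.

```typescript
interface TrialResult {
  prompt: string;
  // True when a read tool call to MEMORY.md is visible in the session log.
  readCallObserved: boolean;
}

type Verdict = "pass" | "fail" | "incomplete";

// Applies the pass-rate rule: all required trials must have run, and at
// least `minPasses` of them must show the read call.
function evaluateProtocol(
  trials: TrialResult[],
  requiredTrials: number = 3,
  minPasses: number = 2
): { passes: number; verdict: Verdict } {
  const passes = trials.filter((trial) => trial.readCallObserved).length;
  if (trials.length < requiredTrials) {
    return { passes, verdict: "incomplete" };
  }
  return { passes, verdict: passes >= minPasses ? "pass" : "fail" };
}
```

Returning "incomplete" for a short run keeps a single lucky interaction from being recorded as a pass.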
F-08 — Phase 0 has no rollback instruction for SOUL.md modification
Severity: MAJOR
Phase 0 compresses SOUL.md. The spec says "Jeff reviews before any injection changes go live," but Task 6 promotes the draft to the live SOUL.md, and if Jeff rejects the result there is no rollback instruction for SOUL.md, only for the hook (Phase 1-2). A bad SOUL.md compression degrades Jules immediately on the next session start, before any hook changes.
Recommendation: Require a SOUL.md backup (SOUL.md.bak) before Task 6 promotes the compressed version. Add explicit rollback instruction: cp SOUL.md.bak SOUL.md. This should be a hard gate in the Phase 0 checklist.
F-09 — LESSONS_LEARNED.md referenced but not confirmed to exist
Severity: MINOR
Task 3 (wisdom compression) lists LESSONS_LEARNED.md (Forge's file) as an input source. This file is not confirmed to exist in the workspace. If missing, the script will either error or silently produce lower-quality output.
Recommendation: Add a pre-check in compress-wisdom.sh that lists available sources and logs warnings for any missing files before proceeding.
F-10 — No pre-migration baseline required
Severity: MINOR
AC-1 and AC-2 measure token reduction. Without a documented pre-migration baseline from /context list, there's no reference point to measure "reduction" from. The spec's Appendix A provides approximate baseline numbers, but these are from docs examples, not from the actual running system.
Recommendation: Add to Phase 0 checklist: capture /context list output and save to initiatives/adaptive-context-injection/baseline-context.txt before any files are modified.
F-11 — Task dependency chain partially implicit
Severity: MINOR
Phase 2 depends on files created in Phase 0 (WISDOM.md, TOOLS_COMPACT.md, compressed SOUL.md). The Task dependency table captures some of this (Task 7 depends on Tasks 1 + 5), but the Phase description doesn't explicitly state "Phase 2 requires Phase 0 complete." If someone runs Phase 2 before Phase 0, the hook would reference files that don't exist yet, causing silent injection failures.
Recommendation: Add explicit "Requires: Phase 0 complete" to the Phase 2 description.
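Beyond the wording fix, the hook could fail loudly instead of silently. A minimal sketch, assuming the caller can supply the list of basenames present in the workspace. The function names are hypothetical and the file list is the one named in this finding.

```typescript
// Files Phase 2 expects Phase 0 to have produced, per this finding.
const PHASE0_OUTPUTS: string[] = ["WISDOM.md", "TOOLS_COMPACT.md", "SOUL.md"];

// Returns the Phase 0 outputs that are absent from the workspace listing.
function missingPhase0Outputs(presentFiles: string[]): string[] {
  return PHASE0_OUTPUTS.filter((name) => presentFiles.indexOf(name) === -1);
}

// Throws with a descriptive message when any Phase 0 output is missing.
function assertPhase0Complete(presentFiles: string[]): void {
  const missing = missingPhase0Outputs(presentFiles);
  if (missing.length > 0) {
    throw new Error(`Phase 0 incomplete; missing: ${missing.join(", ")}`);
  }
}
```

Note that presence alone does not prove SOUL.md has been compressed, since an uncompressed SOUL.md exists before Phase 0. This guard only catches the missing-file case.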
Acceptance Criteria Testability Summary
| AC | Testable? | Issues |
|---|---|---|
| AC-1 | Partially | Token target contradicted by Ops/Init/Projects topics (F-04) |
| AC-2 | Yes | Testable via /context list |
| AC-3 | Weakly | Non-deterministic; needs test protocol (F-07) |
| AC-4 | Yes | Testable via /context list in subagent session |
| AC-5 | Yes (manual) | Requires triggering each session type; no test harness specified |
| AC-6 | Yes | Token count script, content audit |
| AC-7 | Partially | "Covers ≥3 raw lessons" requires manual judgment; no automation path |
| AC-8 | Yes | Disable hook, check context list |
| AC-9 | Weakly | Non-deterministic; needs test protocol (F-07) |
Hook Infrastructure — Confirmed
For the record:
- agent:bootstrap: ✅ Exists. Fires before workspace file injection. Handler receives event.sessionKey (top-level) and event.context.bootstrapFiles (mutable array). This is the right integration point for Layer 0/1 control.
- before_prompt_build: ✅ Exists, but as a plugin lifecycle hook (api.on('before_prompt_build', ...)), not a standalone hook file. Requires full plugin with openclaw.plugin.json.
- WorkspaceBootstrapFile exact type shape: ❓ Not documented in public docs. Must be verified empirically (Task 1).
Blockers Before Implementation Can Start
- Verify MEMORY.md bootstrap status with /context list (F-02)
- Correct the BootstrapHookContext TypeScript interface (F-01), after Task 1 Atlas verification
- Fix AC-1 token threshold contradiction (F-04)
- Resolve WISDOM.md Layer placement (F-05), fix in progress per Jeff
- Confirm or deny WorkspaceBootstrapFile basename allowlist enforcement (F-06), add to Task 1
Quinn out.