Skills Overhaul — Initiative Brief¶

Date: 2026-03-20
Owner: Jules
Status: Spec phase (Atlas writing specs, Quinn QA, Jules review)

Context¶

We ran a full audit of our 11 skills against Anthropic's official "Complete Guide to Building Skills for Claude" (32-page PDF + Claude Code docs). The audit identified multiple structural, content, and implementation gaps. Jeff wants a systematic fix, not ad hoc patches.

Reference Material (stored locally)¶

research/anthropic-skills-guide.md — Full Anthropic guide (extracted from PDF)
research/anthropic-skills-docs.md — Claude Code skills docs page
research/ANTHROPIC_SKILLS_GUIDE_REVIEW.md — Atlas's prior review
Anthropic example skills reviewed: webapp-testing, skill-creator, doc-coauthoring, internal-comms, mcp-builder, brand-guidelines

KTLO Work — Existing Skill Fixes¶

Universal Gaps (apply to ALL skills)¶

Add negative triggers to every skill description ("Do NOT use for...")
Add examples to every skill (good output vs bad output, or scenario walkthroughs)
Add troubleshooting sections where applicable
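
The first gap lends itself to an automated check. The sketch below assumes each skill's description is available as a string and that a negative trigger is phrased with the words "Do NOT use" (matched case-insensitively). The function names are hypothetical and not part of any existing tooling.

```python
import re

# Assumed convention: a negative trigger contains the words "do not use".
NEGATIVE_TRIGGER = re.compile(r"\bdo not use\b", re.IGNORECASE)


def has_negative_trigger(description: str) -> bool:
    """Return True if the description contains a "Do NOT use ..." clause."""
    return bool(NEGATIVE_TRIGGER.search(description))


def missing_negative_triggers(descriptions: dict[str, str]) -> list[str]:
    """Return the sorted names of skills whose description lacks a negative trigger."""
    return sorted(
        name for name, text in descriptions.items()
        if not has_negative_trigger(text)
    )
```

Run over all skill descriptions, a non-empty result lists the skills that still need the fix.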

Per-Skill Fixes¶

cost-estimation¶

Move MODEL_PRICING_REGISTRY.md into references/
Add scripts/estimate.py — automate the cost calculation instead of manual math
Add example output format
Add negative trigger: estimates only, not billing-grade
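
As a rough illustration of what scripts/estimate.py might compute, the sketch below prices a request from its token counts. The registry layout, the model name, and the prices are placeholders, not values taken from MODEL_PRICING_REGISTRY.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Placeholder entry; real values would be loaded from the pricing registry.
PRICING = {"example-model": ModelPrice(input_per_mtok=3.00, output_per_mtok=15.00)}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  pricing: dict[str, ModelPrice] = PRICING) -> float:
    """Return an estimated USD cost (an estimate only, not billing-grade)."""
    price = pricing[model]
    total = (input_tokens * price.input_per_mtok
             + output_tokens * price.output_per_mtok) / 1_000_000
    return round(total, 4)
```

With the placeholder prices, 1,000,000 input tokens and 100,000 output tokens come to 3.00 + 1.50 = 4.50 USD.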

frontend-design¶

Add references/ with font pairings, color palette examples
Add negative trigger: "Use vv-dashboard-design for MC work, not this"
Add 2-3 example scenarios

intelligence-suite¶

Decision needed: archive or adapt. It is Makima's skill, not ours; vv-sigint is our intel skill.
If keeping: strip Makima references and align with our agent team

memory-manager¶

Add examples: good memory entry vs bad memory entry
Consider scripts/consolidate-check.py for line count + contradiction detection
Add negative trigger: not for session continuity (that's AGENTS.md)
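
One possible shape for scripts/consolidate-check.py is sketched below. It assumes memory entries are written as "key: value" lines (optionally bulleted) and treats two different values for the same key as a contradiction. Both the entry format and the 200-line limit are assumptions. Real contradiction detection would likely need more than this string comparison.

```python
def check_memory(text: str, max_lines: int = 200) -> dict:
    """Report the non-blank line count and keys recorded with conflicting values."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    first_value: dict[str, str] = {}
    conflicts: list[str] = []
    for ln in lines:
        entry = ln.lstrip("-* ")          # drop a leading bullet, if any
        if ":" not in entry:
            continue                      # not a key/value entry
        key, value = (part.strip() for part in entry.split(":", 1))
        key = key.lower()
        if key not in first_value:
            first_value[key] = value
        elif first_value[key] != value and key not in conflicts:
            conflicts.append(key)
    return {
        "line_count": len(lines),
        "over_limit": len(lines) > max_lines,
        "conflicts": conflicts,
    }
```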

openclaw-prime¶

Add negative trigger: not for feature requests/bug reports
Add troubleshooting section for common gateway issues

project-pipeline¶

Move PROJECT_EVALUATION_TEMPLATE.md into references/evaluation-template.md
Add completed example evaluation in references/examples/
Add negative trigger: not for existing initiative status tracking

project-scaffolding → MERGE INTO service-management¶

Merge scaffolding content into service-management as a "New Project Setup" section
Add scripts/scaffold.sh to service-management
Add references for standard stack config to service-management
Add example of properly scaffolded VV project
Delete project-scaffolding skill after merge

qa-validation¶

Critical: Add scripts/validate.sh — runs build, test, health check programmatically
Add references/peekaboo-guide.md
Add negative trigger: code/build QA only, not business analysis
Add troubleshooting for common build failures
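
The build, test, health check sequence could be driven by a small fail-fast runner. The brief names a shell script; the sketch below is in Python purely for illustration, and the commands passed in are up to the caller, since no specific build or test commands are given here.

```python
import subprocess


def run_steps(steps: list[tuple[str, list[str]]]) -> list[tuple[str, bool]]:
    """Run (name, argv) steps in order and stop at the first failure.

    Returns one (name, passed) pair per step that was actually run.
    """
    results = []
    for name, argv in steps:
        passed = subprocess.run(argv).returncode == 0
        results.append((name, passed))
        if not passed:
            break  # later steps are meaningless once an earlier one fails
    return results
```

A caller would pass something like `[("build", [...]), ("test", [...]), ("health", [...])]` and treat any `False` in the result as a QA failure.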

service-management (absorbs project-scaffolding)¶

Absorb project-scaffolding content as a "New Project Setup" section
Add scripts/start-service.sh, scripts/stop-service.sh, scripts/check-health.sh, scripts/scaffold.sh
Add references/ with standard stack config (tsconfig, tailwind, etc.)
Add troubleshooting: port conflicts, stale PIDs, launchd issues
Add example walkthroughs: "Starting MC from scratch", "Scaffolding a new VV app"
Delete project-scaffolding skill after merge
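
For the stale-PID troubleshooting item, a health check could test whether a PID file still points at a live process. This is a minimal sketch for POSIX systems only. It assumes a PID file holds one integer, and the function names are hypothetical.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True


def is_stale_pidfile(path: str) -> bool:
    """Return True if the PID file exists but does not point at a live process."""
    try:
        with open(path, encoding="utf-8") as handle:
            raw = handle.read().strip()
    except FileNotFoundError:
        return False     # no PID file, so nothing to clean up
    try:
        pid = int(raw)
    except ValueError:
        return True      # unreadable contents cannot point at a live process
    if pid <= 0:
        return True      # not a valid single-process PID
    return not pid_is_running(pid)
```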

vv-dashboard-design¶

Add examples of correct vs incorrect component implementations in references/
Add negative trigger: MC only, use frontend-design for general web
Consider scripts/check-tokens.sh to grep for hardcoded hex values
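
The hardcoded-hex check amounts to a regular-expression scan. The brief proposes a grep-based shell script; the Python sketch below shows the same idea and assumes hex colors appear in the usual 3, 4, 6, or 8 digit `#rrggbb` forms.

```python
import re

# Matches #rgb, #rgba, #rrggbb, and #rrggbbaa; longest forms are tried first.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")


def find_hardcoded_hex(source: str) -> list[tuple[int, str]]:
    """Return (line_number, match) for every hex color literal in the source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in HEX_COLOR.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Values referenced through design tokens (for example `var(--surface)`) pass untouched, so any hit is a candidate for replacement with a token.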

vv-sigint¶

Fix stale cron schedule section — remove hardcoded times, reference cron as source of truth
Add examples: well-scored signal vs rejected signal
Add troubleshooting: source 403s, RSS format changes
Consider scripts/check-sources.sh to ping source URLs
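
A source check could look like the sketch below, written in Python rather than shell for illustration. It sends a HEAD request per URL and reports anything that does not answer with a 2xx status. Some servers reject HEAD requests, so a real script might need to fall back to GET. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a HEAD request, or 0 if the request failed."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # e.g. 403 when a source blocks the client
    except OSError:
        return 0          # DNS failure, refused connection, timeout


def check_sources(urls: list[str], fetch=fetch_status) -> dict[str, int]:
    """Return {url: status} for every source that did not answer with 2xx."""
    failures = {}
    for url in urls:
        status = fetch(url)
        if not 200 <= status < 300:
            failures[url] = status
    return failures
```

The `fetch` parameter exists so the check can be exercised without network access.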

New Skills to Build¶

1. doc-coauthoring (adapted from Anthropic)¶

3-stage workflow: Context Gathering → Refinement → Reader Testing
Primary users: Forge (specs), Jules (proposals), Atlas (research docs)
Adapt Anthropic's 375-line skill to VV context

2. webapp-testing (adapted from Anthropic)¶

Playwright-based testing for Quinn
Bundle scripts/with_server.py for server lifecycle management
Replace Quinn's manual "check every page" with programmatic testing
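
Server lifecycle management usually means: start the server, wait until its port accepts connections, run the tests, then shut it down even if the tests fail. The sketch below illustrates that pattern with the standard library only. It is not Anthropic's with_server.py, and its names and defaults are assumptions.

```python
import contextlib
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until host:port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)
    return False


@contextlib.contextmanager
def running_server(argv: list[str], host: str, port: int, timeout: float = 30.0):
    """Start a server process, yield once it is reachable, always stop it after."""
    process = subprocess.Popen(argv)
    try:
        if not wait_for_port(host, port, timeout):
            raise RuntimeError(f"server did not open {host}:{port} within {timeout}s")
        yield process
    finally:
        process.terminate()
        try:
            process.wait(timeout=10)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
```

Test code would run inside `with running_server([...], "127.0.0.1", port):`, so the server is torn down on both success and failure.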

3. skill-creator (adapted from Anthropic)¶

Meta-skill for building and improving our own skills
Includes eval framework for testing trigger accuracy
Ensures future skills are built to standard from the start
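
Trigger accuracy can be scored against a labelled set of prompts. In the sketch below, `did_trigger` is a hypothetical callable standing in for whatever mechanism reports whether the skill fired for a prompt. The case format and metric names are assumptions, not the eval framework from Anthropic's skill-creator.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TriggerCase:
    prompt: str
    should_trigger: bool


def trigger_metrics(cases: list[TriggerCase],
                    did_trigger: Callable[[str], bool]) -> dict:
    """Compare observed triggering against the labels and summarise the result."""
    tp = fp = fn = tn = 0
    for case in cases:
        fired = did_trigger(case.prompt)
        if fired and case.should_trigger:
            tp += 1
        elif fired:
            fp += 1   # fired when it should not: a missing negative trigger
        elif case.should_trigger:
            fn += 1   # stayed silent when it should have fired
        else:
            tn += 1
    total = len(cases)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positives": fp,
        "false_negatives": fn,
    }
```

False positives are the number to watch when evaluating the negative triggers added to each skill description.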

4. agent-dispatch (new, VV-specific)¶

Codifies agent selection: Atlas/Melody/Quinn/Forge
Model routing rules, timeout settings, handoff protocol
Currently scattered across MEMORY.md as prose — should be procedural
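
Making the dispatch rules procedural could start with a lookup table. The agent roles below follow this brief (Atlas research, Forge specs, Melody builds, Quinn QA). The task-type keys, model name, and timeout values are placeholders, since the actual routing rules live in MEMORY.md and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    agent: str
    model: str      # placeholder; the real model routing rules are not in this brief
    timeout_s: int  # placeholder timeout in seconds


ROUTES = {
    "research": Route("Atlas", "placeholder-model", 600),
    "spec": Route("Forge", "placeholder-model", 600),
    "build": Route("Melody", "placeholder-model", 1800),
    "qa": Route("Quinn", "placeholder-model", 900),
}


def dispatch(task_type: str) -> Route:
    """Return the route for a task type, failing loudly on unknown types."""
    try:
        return ROUTES[task_type]
    except KeyError:
        known = ", ".join(sorted(ROUTES))
        raise ValueError(f"no route for {task_type!r}; known types: {known}") from None
```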

5. revenue-modeling (new, VV-specific)¶

Standardize ARR modeling, pricing tiers, TAM/SAM/SOM analysis
Used by Forge (specs) and Atlas (market research)
Relevant for APA pricing validation
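
The core arithmetic such a skill would standardize is small. The sketch below assumes ARR is twelve times monthly recurring revenue summed over pricing tiers, and that SAM and SOM are derived from TAM by successive fractions. All figures in the usage note are illustrative, not APA numbers.

```python
def arr_from_tiers(tiers: list[tuple[float, int]]) -> float:
    """Return ARR given (monthly_price, customer_count) pairs per pricing tier."""
    monthly_recurring = sum(price * customers for price, customers in tiers)
    return 12 * monthly_recurring


def market_funnel(tam: float, serviceable_fraction: float,
                  obtainable_share: float) -> dict[str, float]:
    """Derive SAM and SOM from TAM using two fractions between 0 and 1."""
    sam = tam * serviceable_fraction
    som = sam * obtainable_share
    return {"tam": tam, "sam": sam, "som": som}
```

For example, 100 customers at 50 per month plus 10 customers at 200 per month gives an MRR of 7,000 and an ARR of 84,000.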

Process¶

Tiered Execution¶

Tier	What	Process
Simple	Negative triggers, stale content, frontmatter cleanup	Jules implements → Quinn full QA → Jules strategic/tactical review
Medium	Scripts, reference restructuring, skill merges	Forge specs → Melody builds → Quinn full QA → Jules strategic/tactical review
Complex	New skills from scratch	Forge specs → Quinn reviews spec → Jules reviews spec → Melody builds → Quinn full QA → Jules strategic/tactical review
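
The three tiers in the table can also be written as ordered step lists, which makes it easy to ask what comes next for a given piece of work. The step strings are copied from the table. The data layout and the `next_step` helper are illustrative assumptions.

```python
from typing import Optional

REVIEW = "Jules strategic/tactical review"

PIPELINES = {
    "simple": ["Jules implements", "Quinn full QA", REVIEW],
    "medium": ["Forge specs", "Melody builds", "Quinn full QA", REVIEW],
    "complex": [
        "Forge specs",
        "Quinn reviews spec",
        "Jules reviews spec",
        "Melody builds",
        "Quinn full QA",
        REVIEW,
    ],
}


def next_step(tier: str, completed: int) -> Optional[str]:
    """Return the next step after `completed` finished steps, or None when done."""
    steps = PIPELINES[tier]
    return steps[completed] if completed < len(steps) else None
```

Every tier ends with Quinn's full QA followed by Jules's review, whatever the tier's earlier steps are.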

Non-negotiable rules¶

Quinn always does FULL QA — never partial, never spot-check
Jules always reviews output for strategic and tactical alignment with our policies, protocols, tasks, projects, initiatives, and ad hoc conversations. Never assume agent output fits our goals without checking.
Jeff approves scope before any implementation begins
Atlas available for research input if Forge needs it during spec writing

Priority Order¶

1. Universal fixes (negative triggers, examples): highest leverage
2. Script additions (qa-validation, service-management, cost-estimation): "code is deterministic"
3. Stale content fixes (vv-sigint cron schedule, project-pipeline paths)
4. New skills (doc-coauthoring and webapp-testing first, as the most immediately useful)
5. Nice-to-have new skills (skill-creator, agent-dispatch, revenue-modeling)