Agent-Flow · v1.0 · April 2026
Agents lie.
Agent-flow makes
them prove it.
A practical introduction to the six-agent Claude Code plugin —
installation, commands, and the gates that keep it honest.
Agent-Flow · 01 · Orientation
What it is
A Claude Code plugin that replaces one generalist agent with six specialists and a verification layer.
Four slash commands — /orchestrate, /team-orchestrate, /deep-dive, /analyze — fan work across six specialists with mandatory evidence gates.
Design philosophy
Three principles shape every decision in the system.
01
Specialization over generalization
Each agent has a focused role, a restricted toolset, and a single kind of output.
02
Verification over trust
Every claim of completion must be backed by actual command output — not summaries.
03
Cost awareness over convenience
Opus reasons. Sonnet executes. You pay the premium only where judgment beats speed.
Principle 01
Specialists beat a single generalist — because the constraints are the feature.
A generalist does all five jobs. Review becomes performative.
- Reviewer has no Write/Edit — can't fix what it reviews.
- Planner has no Bash — can't skip to execution.
- Verifier has no Edit — only reports output.
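The three restrictions above amount to a per-agent denylist. A minimal sketch of that idea, assuming a hypothetical `DENIED_TOOLS` table and `is_allowed` helper (neither name comes from the plugin, and this is not its actual configuration format):

```python
# Hypothetical illustration: these names are not part of Agent-Flow.
# Only the three restrictions stated on the slide are encoded here.
DENIED_TOOLS = {
    "Reviewer": {"Write", "Edit"},  # can't fix what it reviews
    "Planner": {"Bash"},            # can't skip to execution
    "Verifier": {"Edit"},           # only reports output
}


def is_allowed(agent: str, tool: str) -> bool:
    """Return False when the tool is on the agent's denylist."""
    return tool not in DENIED_TOOLS.get(agent, set())
```

The point of the sketch is that the check happens outside the agent, so a role cannot talk its way past its own constraint.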
Agent-Flow · 02 · Agents
The cast
Meet the six specialists.
Explorer
Riko
Reads, searches, queries the graph. Never writes.
OPUS · READ-ONLY
Planner
Senku
Turns findings into a concrete file-by-file task list.
OPUS · TODO-ONLY
Executor
Loid
The only agent with Write and Edit. Runs its own sanity checks.
SONNET · WRITE
Reviewer
Lawliet
Static analysis only. Emits APPROVED or NEEDS_CHANGES.
SONNET · ANALYZE
Verifier
Alphonse
Runs the test suite. Reports exact output. No summaries.
SONNET · VERIFY
Authoring
Speedwagon
Authors interactive HTML explainers in a scoped sandbox.
SONNET · SCOPED-WRITE
Agent-Flow · Agent 1/6 · 02 · Agents
- Model: Opus
- Mode: Read-only
- Owns skills: exploration-strategy, graphify-usage, personal-kb-usage
Finds files, cites lines, never edits them.
Tools: Read, Grep, Glob, Bash (AST only), WebSearch, WebFetch
Denied: Write, Edit
Discipline: Every claim backed by file paths and line numbers.
src/auth/jwt.ts:42 — signToken() builds the HS256 payload used by /api/login
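A claim in this shape can be checked mechanically. The sketch below assumes every Explorer claim begins with a `path:line` reference, as in the example above. `parse_citation` and `Citation` are hypothetical names used for illustration only:

```python
import re
from typing import NamedTuple, Optional

# Matches a leading "path:line" reference such as "src/auth/jwt.ts:42".
CITATION = re.compile(r"^(?P<path>[^\s:]+):(?P<line>[1-9]\d*)(?=\s|$)")


class Citation(NamedTuple):
    path: str
    line: int


def parse_citation(claim: str) -> Optional[Citation]:
    """Return the file and line a claim starts with, or None if it has none."""
    match = CITATION.match(claim.strip())
    if match is None:
        return None
    return Citation(match.group("path"), int(match.group("line")))
```

A claim that returns `None` has no citation and, under the discipline above, would not count as evidence.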
Agent-Flow · Exploration · 01 · Foundations
How the Explorer works
Start with the graph. Grep last. Never guess.
Graph orientation
Shape before text.
Local repository
Where 90% lives.
Web search
Only for externals.
Ask the user
Last resort — once.
Why the next four agents need this
→ Planner
Plans against real files.
→ Executor
Edits with citations.
→ Reviewer
Same ground truth.
→ Verifier
Runs the right checks.
Agent-Flow · Agent 2/6 · 02 · Agents
- Model: Opus
- Mode: Plan-only
- Owns skills: task-classification, prompt-refinement, team-decision
Writes the plan. Does not touch the code.
Tools: Read, Grep, Glob, TodoWrite
Denied: Write, Edit, Bash
Discipline: Every file path in the plan is verified to exist.
TodoWrite: "Add signToken() to src/auth/jwt.ts — acceptance: returns {token, expiresAt}"
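The "verified to exist" discipline can be sketched as a simple filesystem check. This is an assumption about how such a check might look, not the Planner's actual mechanism. `missing_paths` is a hypothetical helper, and it only covers files the plan intends to modify (files the plan creates will not exist yet):

```python
from pathlib import Path
from typing import Iterable, List


def missing_paths(repo_root: Path, planned_files: Iterable[str]) -> List[str]:
    """Return the planned paths that do not exist as files under repo_root."""
    return [p for p in planned_files if not (repo_root / p).is_file()]
```

An empty result means every path in the plan points at a real file. Anything else goes back to exploration before the plan is handed off.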
Agent-Flow · Planning · 02 · Agents
How the Planner works
Turns exploration into a file-by-file plan.
Verify the findings
Trust but verify.
Blast-radius check
What else gets touched.
Write the task list
Atomic, ordered steps.
Pin the contract
Criteria. Risks. Done.
What lands in the plan
→ Files
Verified paths, in order.
→ Patterns
House style, cited.
→ Criteria
Testable done-conditions.
→ Risks
Known edges up front.
Agent-Flow · Agent 3/6 · 02 · Agents
- Model: Sonnet
- Mode: Write enabled
- Consumes: verification-gates, exploration-strategy, agent-behavior-constraints
The only agent that edits files.
House rule: Show the output. No "looks good."
Tests: PASS (npm test — 15/15 passed)
Agent-Flow · Execution · 02 · Agents
How the Executor works
Follows the plan. Shows the output.
Read before write
Match local style.
One step at a time
No batched changes.
Evidence per step
Paste the output.
Stop on red
Never mask failure.
Non-negotiables
✗ No claims
"Complete" needs output.
✗ No skipping
No @ts-ignore.
✗ No scope creep
Plan only. Anything else is filed as a follow-up.
→ Handoff
Diff + evidence downstream.
Agent-Flow · Agent 4/6 · 02 · Agents
- Model: Sonnet
- Mode: Analyze-only
- Bash: Static analysis tools only (tsc, mypy, eslint, ruff)
Reads fresh. Returns a verdict.
Verdicts
APPROVED
NEEDS_CHANGES
BLOCKED
NEEDS_CHANGES loops back to the Executor automatically — you do nothing.
Discipline: Run the analyzer — never approve from reading alone.
Verdict: APPROVED — tsc: 0 errors; eslint: 0 warnings
Agent-Flow · Review · 02 · Agents
How the Reviewer works
Static analysis. One verdict: APPROVED, NEEDS_CHANGES, or BLOCKED.
Graph orientation
Blast radius first.
Run the analyzers
tsc · mypy · eslint · ruff.
Pattern check
Match local idioms.
Verdict
Zero critical to approve.
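The four steps above end in a single decision, which can be sketched as a pure function over analyzer results. Two things here are assumptions rather than documented behavior: the `AnalyzerResult` shape is hypothetical, and BLOCKED is treated as "an analyzer could not run", a meaning the deck does not define:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnalyzerResult:
    tool: str      # e.g. "tsc" or "eslint"
    ran: bool      # False if the analyzer could not be executed
    critical: int  # number of critical findings it reported


def verdict(results: List[AnalyzerResult]) -> str:
    """Map analyzer results to a verdict string."""
    # Assumption: no analyzer output at all means the review cannot proceed.
    if not results or any(not r.ran for r in results):
        return "BLOCKED"
    # "Zero critical to approve": a single critical finding sends it back.
    if any(r.critical > 0 for r in results):
        return "NEEDS_CHANGES"
    return "APPROVED"
```

Note that an empty result list never yields APPROVED, which mirrors the rule that approval requires running the analyzer rather than reading the code.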
What the Reviewer will not do
✗ Run tests
That's the Verifier's job.
✗ Modify code
Suggest only. Never edit.
✗ Run the app
Analyzers only.
↻ Loop
Failures route to Executor.
Agent-Flow · Agent 5/6 · 02 · Agents
- Model: Sonnet
- Tools: Bash · Read · Grep
- Owns skills: verification-gates
Runs the tests that prove it.
Discipline: Show the exact output — no summaries, ever.
$ pytest
========== 4 passed in 0.12s ==========
$ mypy .
Success: no issues found in 12 source files
Overall: VERIFIED
Agent-Flow · Verification · 02 · Agents
How the Verifier works
Runs the suite. Pastes the output.
Detect the project
Pick the right toolchain.
Run the four gates
Tests · types · lint · build.
Report verbatim
No summaries. Ever.
Emit the verdict
All green, or FAILED.
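The run, report, verdict sequence above can be sketched with the standard library. `run_gates` is a hypothetical helper, and the gate commands are whatever the detected toolchain supplies. The sketch only illustrates capturing output verbatim and deriving the verdict from exit codes:

```python
import subprocess
import sys
from typing import Dict, List, Tuple


def run_gates(gates: Dict[str, List[str]]) -> Tuple[str, str]:
    """Run each gate command and return (verdict, verbatim report)."""
    report = []
    all_green = True
    for name, command in gates.items():
        proc = subprocess.run(command, capture_output=True, text=True)
        # Keep stdout and stderr exactly as produced: no summarizing.
        report.append(f"[{name}] $ {' '.join(command)}\n{proc.stdout}{proc.stderr}")
        if proc.returncode != 0:
            all_green = False
    return ("VERIFIED" if all_green else "FAILED"), "\n".join(report)


if __name__ == "__main__":
    # Stand-in gate so the sketch runs anywhere Python is installed.
    status, output = run_gates({"tests": [sys.executable, "-c", "print('4 passed')"]})
    print(output)
    print(f"Overall: {status}")
```

Every gate runs even after one fails, so the Executor receives the full set of outputs in a single pass.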
Why the Verifier is last
→ Authority
Final word on done.
→ Separation
Fresh run, not incremental.
→ Tools
Bash · Read · Grep only.
↻ Failure loop
Back to Executor with output.
Agent-Flow · Agent 6/6 · 02 · Agents
- Model: Sonnet
- Mode: Scoped-write
- Consumes skills: agent-behavior-constraints, exploration-strategy, explainer-design-system
Authors the brief and HTML fragment. Nothing else.
Tools: Read, Grep, Glob, Write†, Edit†, Bash‡
† Write/Edit scoped to explain-out/ and .claude/explain-briefs/.
‡ Bash scoped to bash scripts/compile-explain.sh only.
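A write scope like the one in the first footnote comes down to a path-prefix check after normalization. The sketch below is an illustration under that assumption. `write_allowed` is a hypothetical name, and the plugin may enforce the scope differently:

```python
import posixpath

# The two writable roots named in the footnote above.
ALLOWED_ROOTS = ("explain-out", ".claude/explain-briefs")


def write_allowed(relative_path: str) -> bool:
    """Return True if a repo-relative path falls inside an allowed root."""
    normalized = posixpath.normpath(relative_path)
    # Reject absolute paths and anything that climbs out of the repo.
    if normalized.startswith("/") or normalized == ".." or normalized.startswith("../"):
        return False
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in ALLOWED_ROOTS
    )
```

Normalizing first matters: `explain-out/../src/app.ts` looks in-scope by prefix but resolves to application code.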
Agent-Flow · Authoring · 02 · Agents
How the Authoring agent works
Reads the scope. Renders the explainer.
Verify every ref
Read each file:line before embedding.
Equip design skill
Load explainer-design-system before any HTML.
Write the brief
Teaching arc, refs, metaphor → .md file.
Render + assemble
HTML fragment → assembler → explain-out/index.html.
What Speedwagon will not do
✗ Modify src/
Application code is off-limits.
✗ Call agents
Orchestrator handles routing.
✗ Run tests
Lint guardrail only.
→ Output
Brief + fragment → explain-out/index.html.
Principle 03
Opus plans. Sonnet executes. Nobody gets both.
Opus — reason
Unfamiliar codebases. Ambiguous requirements. Strategic sequencing.
- Explorer (Riko) — exploration
- Planner (Senku) — planning
Sonnet — execute
Clear plans. Pass-fail criteria. Mechanical verification.
- Executor (Loid) — implementation
- Reviewer (Lawliet) — review
- Verifier (Alphonse) — verification
- Authoring (Speedwagon) — explainer HTML
Agent-Flow · 03 · Installation
Getting started
Installation takes three commands.
# Method 1: Claude Code marketplace (recommended)
claude> /plugin marketplace add josix/agent-flow
claude> /plugin install agent-flow@josix-plugins
# Method 2: git clone
$ git clone https://github.com/josix/agent-flow.git
$ cd agent-flow && claude --plugin-dir .
Prerequisites: Claude Code CLI · Bash 4+ · jq · git
Before your first run
Validate the install with a single script.
- plugin.json + hooks.json parse as valid JSON
- Every agent, skill, and command frontmatter parses
- Hook scripts exist, are executable, and have valid syntax
- Edge cases: missing jq, malformed input, env-var propagation
$ bash scripts/validate-plugin.sh
Running 10 structural checks…
✓ plugin.json valid
✓ hooks/hooks.json valid
✓ 6 agents parsed
✓ 9 skills parsed
✓ 12 hook scripts executable
…
✓ All tests passed
Agent-Flow · 04 · Commands
The main command
/orchestrate runs the full pipeline end to end.
claude> /orchestrate Add user authentication with JWT tokens
You give it a task in plain English. It delegates to Explorer → Planner → Executor → Reviewer → Verifier in sequence. Each agent's output becomes the next agent's context. No babysitting.
/orchestrate — five phases
Watch the pipeline, agent by agent.
PHASE 01 · Explorer (Riko) · Opus
Explore the codebase. Gather files, patterns, constraints.
PHASE 02 · Planner (Senku) · Opus
Plan the change. List files, order, risks, acceptance criteria.
PHASE 03 · Executor (Loid) · Sonnet
Implement the plan. Write and edit files. Run sanity checks.
PHASE 04 · Reviewer (Lawliet) · Sonnet
Review with static analysis. Emit APPROVED or NEEDS_CHANGES.
PHASE 05 · Verifier (Alphonse) · Sonnet
Run tests, types, lint, build. Report exact output.
The orchestrator — the main Claude instance — never writes code. It coordinates specialists and collects evidence.
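The hand-off between phases can be sketched as a fold over a list of callables. This is a simplified model, not the orchestrator's implementation: `run_pipeline` and the `Phase` type are hypothetical, and the choice to pass along everything produced so far (rather than only the previous output) is an assumption:

```python
from typing import Callable, List, Tuple

# Each phase takes the accumulated context and returns its own output.
Phase = Callable[[str], str]


def run_pipeline(task: str, phases: List[Tuple[str, Phase]]) -> List[Tuple[str, str]]:
    """Run phases in order, feeding each one everything produced so far."""
    context = task
    transcript = []
    for name, phase in phases:
        output = phase(context)
        transcript.append((name, output))
        # The next agent sees the task plus all earlier outputs.
        context = f"{context}\n[{name}] {output}"
    return transcript
```

The coordinator in this sketch only routes text between phases and records what each returned. It never produces output of its own, matching the rule that the orchestrator does not write code.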
/team-orchestrate
Same pipeline — but review and verify run in parallel.
01 · Explorer (Riko) · Opus
Explore.
02 · Planner (Senku) · Opus
Plan.
03 · Executor (Loid) · Sonnet
Implement.
04a · parallel · Reviewer (Lawliet) · Sonnet
Review.
04b · parallel · Verifier (Alphonse) · Sonnet
Verify.
05 · Merge
Combine results. Iterate if gates fail.
30–40% wall-clock reduction
The Reviewer and Verifier don't depend on each other's output — so they don't need to wait. Pass --force-sequential if you ever need the old behavior.
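Because the two checks are independent, running them side by side is a small change. A minimal sketch with `concurrent.futures`, where `review_and_verify` is a hypothetical name and the two callables stand in for the Reviewer and Verifier:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def review_and_verify(
    review: Callable[[], str], verify: Callable[[], str]
) -> Dict[str, object]:
    """Run both checks concurrently and merge their verdicts."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        review_future = pool.submit(review)
        verify_future = pool.submit(verify)
        review_verdict = review_future.result()
        verify_verdict = verify_future.result()
    # The merge step passes only when both gates are green.
    passed = review_verdict == "APPROVED" and verify_verdict == "VERIFIED"
    return {"review": review_verdict, "verify": verify_verdict, "passed": passed}
```

Wall-clock time for this stage becomes roughly the slower of the two checks instead of their sum.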
/deep-dive
Cache codebase context once. Reuse it for the rest of the session.
- Fire 5+ parallel Rikos against structure, conventions, anti-patterns, build, architecture, testing.
- The Planner synthesizes findings into a unified context file.
- Output lands at .claude/deep-dive.local.md (ephemeral, session-scoped).
claude> /deep-dive                    # whole codebase, ~3–5 min
claude> /deep-dive --focus=src/auth   # narrow scope
claude> /deep-dive --refresh          # codebase changed, update context
Compounding returns
Reuse one deep-dive across many orchestrations.
# 1. Pay the exploration cost once
claude> /deep-dive
# 2. Every subsequent task reuses that context
claude> /orchestrate --use-deep-dive Add user authentication
claude> /orchestrate --use-deep-dive Add password reset
claude> /orchestrate --use-deep-dive Add email verification
# 3. When the codebase drifts, refresh
claude> /deep-dive --refresh
The pattern: front-load exploration when you start a session, then run as many tasks as you want without repaying the cost.
/agent-flow:explain
Generate an interactive explainer for any topic.
- Riko gathers scope — file:line refs, graph nodes, key terminology.
- Senku designs a 3–5 screen teaching arc with a metaphor and translator pick.
- Speedwagon authors the module brief and HTML fragment.
- Assembler + lint produce explain-out/index.html.
claude> /agent-flow:explain how does orchestration work
claude> /agent-flow:explain --revise <slug>
Requires /deep-dive first. Graph optional.
Agent-Flow · 05 · Enforcement
SECTION 03 · HOOKS
Hooks enforce the gates
you can't skip.
Seven lifecycle events — independent of any agent — make the system's promises structural instead of aspirational.
Seven lifecycle events
Every hook runs without asking the agent for permission.
| Event | What it does | Role |
| UserPromptSubmit | Clarifies vague prompts before they run. | Clarity |
| PreToolUse | Blocks unsafe writes before they run. | Safety |
| PostToolUse | Per-agent verification reminders. | Discipline |
| SessionStart | Detects project type and toolchain. | Context |
| Stop | Blocks completion until every check passes. | The gate |
| TeammateIdle | Quality checks for parallel teammates. | Parallel |
| TaskCompleted | Evidence check before marking done. | Receipts |
The gate that matters
The Stop hook blocks completion until every check passes.
Node.js projects
- Tests: npm test (when package.json defines a test script)
- Types: npx tsc --noEmit
Python projects
- Tests: uv run pytest, falling back to pytest
- Types: mypy . (requires mypy.ini)
Escape hatches
.claude/skip-test-verification — bypass, with reason
.claude/known-test-failures — tolerated, non-blocking
.claude/test-command — custom command
Only new failures block.
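"Only new failures block" is a set difference between what fails now and what is already tolerated. The sketch below assumes `.claude/known-test-failures` holds one test identifier per line. That file format is an assumption, and `blocking_failures` is a hypothetical helper:

```python
from typing import Iterable, Set


def blocking_failures(failing: Iterable[str], known: Iterable[str]) -> Set[str]:
    """Return current failures that are not already listed as known."""
    # Assumed format: one test identifier per line, blank lines ignored.
    tolerated = {line.strip() for line in known if line.strip()}
    return set(failing) - tolerated
```

An empty result lets the Stop hook pass even when the suite is not fully green, which is the point of the escape hatch.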
Agent-Flow · 06 · Knowledge
Skills
Skills are the playbooks agents follow.
Each skill has an owner agent that embodies it and consumer agents that reference it. Update one skill → every consumer inherits the change.
| Skill | Owner | Consumers | Heuristic |
| exploration-strategy | Explorer | Planner · Executor | Parallel + targeted. Converge fast. |
| task-classification | Planner | Explorer · Orchestrator | When unsure, classify higher. |
| prompt-refinement | Planner | Orchestrator | One question at a time. |
| verification-gates | Verifier | Executor · Reviewer | Fail fast. Any failed gate halts work. |
| team-decision | Planner | Orchestrator | When in doubt, go sequential. |
| graphify-usage | Explorer | Planner · Reviewer | Traversal? Graph. Substring? Grep. |
| personal-kb-usage | Explorer | Planner · Reviewer | Cross-session memory, not this-session structure. |
| explainer-design-system | Vendored | Speedwagon | 12 primitives, 8 lint rules, design tokens for explainer HTML. |
Optional MCP server
Graphify answers the structural questions grep can't.
- graph_stats — shape of the codebase at a glance
- god_nodes — top-N fan-in anchors
- get_neighbors — blast radius of a change
- shortest_path — how does A reach B?
- get_community — what cluster is this in?
Explorer's rule: graph_stats + god_nodes before grep.
Orient, then investigate.
Install: pipx install 'graphifyy[mcp]' · Queryable by: Explorer, Planner, Reviewer
Optional MCP server
Personal-KB remembers across projects.
A cross-project memory store. Patterns you established elsewhere, decisions you've made, things future sessions should inherit — they live here and the read-path agents can query them.
Readers
Explorer
Planner
Reviewer
Turns "every session starts fresh" into "every session inherits."
Agent-Flow · 07 · Observability
/agent-flow:analyze
See what your agents actually did.
- Tool usage by agent — which tools each specialist calls, how often
- MCP / skill invocation counts
- Iteration & rejection rates
- Token spend, thinking effort, model mismatches
- Heuristic flags — orchestrator doing persona-owned work, MCP-skipping, allowlist violations
$ bash scripts/analyze.sh load --all-sessions
$ bash scripts/analyze.sh report
→ .claude/observability/events.db
→ .claude/observability/report.md
Local SQLite. No network. No deps beyond Python stdlib.
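Because the event store is plain SQLite, a report like "tool usage by agent" is a single GROUP BY query. The schema of events.db is not described here, so the `events(agent, tool)` table below is entirely hypothetical and built in memory for illustration:

```python
import sqlite3
from typing import List, Tuple


def tool_usage(conn: sqlite3.Connection) -> List[Tuple[str, str, int]]:
    """Count tool calls per agent, most frequent first."""
    return conn.execute(
        "SELECT agent, tool, COUNT(*) AS calls FROM events "
        "GROUP BY agent, tool ORDER BY calls DESC, agent, tool"
    ).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database using the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (agent TEXT, tool TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("Explorer", "Grep"), ("Explorer", "Grep"), ("Executor", "Edit")],
    )
    return conn
```

Only the standard-library `sqlite3` module is needed, consistent with the "no deps beyond Python stdlib" note above.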
Agent-Flow · 08 · In practice
SECTION 04 · WORKED EXAMPLE
Empty directory
to verified code
in five minutes.
The first-orchestration walkthrough. Simplest possible task that exercises all five agents end to end.
Setup
Fresh directory. One prompt.
$ mkdir fib-demo && cd fib-demo
$ claude
claude> /orchestrate Create a Python fibonacci.py module with an iterative fib(n)
function and a pytest file covering fib(0), fib(1), fib(10),
and a negative-input ValueError case.
You watch. You don't intervene.
What you see
Five agents run. Two files land.
| Role | Agent | Action | Result |
| Explorer | Riko | EXPLORES | Empty directory. Nothing to preserve. |
| Planner | Senku | PLANS | Four-item plan: fib(n), guard, tests, cases. |
| Executor | Loid | IMPLEMENTS | Writes the two files. Sanity imports. |
| Reviewer | Lawliet | REVIEWS | Static checks clean. Returns APPROVED. |
| Verifier | Alphonse | VERIFIES | pytest → 4 passed. Zero errors. |
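For reference, here is one implementation that satisfies the prompt. It is a sketch of what the Executor might write, not a transcript of an actual run, and real output will vary:

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    # After k iterations, a holds fib(k).
    for _ in range(n):
        a, b = b, a + b
    return a
```

The four cases named in the prompt map directly onto four pytest functions, which is where the "4 passed" line comes from.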
The completion promise
One literal marker means the work is done.
<orchestration-complete>TASK VERIFIED</orchestration-complete>
Prints only when every gate is green. No marker — not done.
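Since the marker is a literal string, a script wrapping the plugin could test for it directly. `is_done` is a hypothetical helper. Only the marker text itself comes from the slide above:

```python
COMPLETION_MARKER = "<orchestration-complete>TASK VERIFIED</orchestration-complete>"


def is_done(transcript: str) -> bool:
    """Return True only if the literal completion marker appears."""
    return COMPLETION_MARKER in transcript
```

An exact substring match is deliberate: a paraphrase such as "task verified" does not count.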
When verification fails
The loop self-corrects. You do nothing.
- Gate fails. NEEDS_CHANGES or test error.
- Executor re-runs with the exact output.
- Resolved in one or two iterations.
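The three steps above describe a bounded retry loop in which the failing output becomes the next attempt's input. A minimal sketch, where `fix_until_green` is a hypothetical name and the `max_iterations` cap is an assumption (no limit is stated here):

```python
from typing import Callable, Tuple


def fix_until_green(
    execute: Callable[[str], None],
    check: Callable[[], Tuple[bool, str]],
    max_iterations: int = 3,  # assumed cap, not a documented setting
) -> Tuple[bool, int]:
    """Re-run the executor with the last failure output until checks pass."""
    feedback = ""
    for iteration in range(1, max_iterations + 1):
        execute(feedback)
        passed, output = check()
        if passed:
            return True, iteration
        # Hand the exact failing output to the next attempt.
        feedback = output
    return False, max_iterations
```

The return value reports how many iterations were needed, which corresponds to the iteration rate surfaced by the analyze report.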
Agent-Flow · 09 · Practice
Five habits
Use the system the way it expects to be used.
- Be specific. Narrow prompts win.
- Deep-dive once. Then --use-deep-dive everywhere.
- Trust VERIFIED. It means command output.
- Answer clarifications. Refinement isn't noise.
- Read the analyze report. See the truth.
Agent-Flow · End · Q&A
Where to go next
Three docs, one demo, your first task.
DOC 01
docs/getting-started/first-orchestration.md
The Fibonacci walkthrough. 2–5 minutes. Touches every agent.
DOC 02
docs/architecture/overview.md
Full data-flow diagrams. Model selection. Verification architecture.
DOC 03
docs/concepts/subagents-lie.md
Why verification is the whole point — and how the gates enforce it.