Agent-Flow · v1.0 · April 2026
Agents lie. Agent-Flow makes them prove it.
A practical introduction to the six-agent Claude Code plugin — installation, commands, and the gates that keep it honest.
Engineering · Claude Code
Agent-Flow · 01 · Orientation

What it is

A Claude Code plugin that replaces one generalist agent with six specialists and a verification layer.

Four slash commands — /orchestrate, /team-orchestrate, /deep-dive, /analyze — fan work across six specialists with mandatory evidence gates.


Design philosophy

Three principles shape every decision in the system.

01

Specialization over generalization

Each agent has a focused role, a restricted toolset, and a single kind of output.

02

Verification over trust

Every claim of completion must be backed by actual command output — not summaries.

03

Cost awareness over convenience

Opus reasons. Sonnet executes. You pay the premium only where judgment beats speed.


Principle 01

Specialists beat a single generalist — because the constraints are the feature.

A generalist does all five jobs. Review becomes performative.

  • Reviewer has no Write/Edit — can't fix what it reviews.
  • Planner has no Bash — can't skip to execution.
  • Verifier has no Edit — only reports output.
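
The restrictions above amount to a per-agent tool allowlist. The following is a minimal Python sketch of that idea, not the plugin's actual configuration format: the `ALLOWED_TOOLS` table and the `can_use` helper are hypothetical, and the read tools listed for the Executor and Reviewer are assumptions.

```python
# Hypothetical allowlist; illustrative only, not the plugin's real config.
ALLOWED_TOOLS = {
    "explorer": {"Read", "Grep", "Glob", "Bash", "WebSearch", "WebFetch"},
    "planner": {"Read", "Grep", "Glob", "TodoWrite"},
    "executor": {"Read", "Grep", "Glob", "Bash", "Write", "Edit"},  # read tools assumed
    "reviewer": {"Read", "Grep", "Glob", "Bash"},                   # read tools assumed
    "verifier": {"Bash", "Read", "Grep"},
}


def can_use(agent: str, tool: str) -> bool:
    """Return True only if the tool is on the agent's allowlist."""
    return tool in ALLOWED_TOOLS.get(agent, set())


print(can_use("reviewer", "Edit"))  # False
print(can_use("planner", "Bash"))   # False
print(can_use("executor", "Edit"))  # True
```

An unknown agent gets an empty set, so the check fails closed.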
Agent-Flow · 02 · Agents

The cast

Meet the six specialists.

Explorer
Riko
Reads, searches, queries the graph. Never writes.
OPUS · READ-ONLY
Planner
Senku
Turns findings into a concrete file-by-file task list.
OPUS · TODO-ONLY
Executor
Loid
The only agent with unscoped Write and Edit. Runs its own sanity checks.
SONNET · WRITE
Reviewer
Lawliet
Static analysis only. Emits APPROVED or NEEDS_CHANGES.
SONNET · ANALYZE
Verifier
Alphonse
Runs the test suite. Reports exact output. No summaries.
SONNET · VERIFY
Authoring
Speedwagon
Authors interactive HTML explainers in a scoped sandbox.
SONNET · SCOPED-WRITE
Agent-Flow · Agent 1/6 · 02 · Agents
Explorer
Riko
Model
Opus
Mode
Read-only
Owns skills
exploration-strategy
graphify-usage
personal-kb-usage

Finds files, cites lines, never edits them.

Tools
Read · Grep · Glob · Bash (AST only) · WebSearch · WebFetch (denied: Write, Edit)
Discipline: Every claim backed by file paths and line numbers.

src/auth/jwt.ts:42 — signToken() builds the HS256 payload used by /api/login
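
A citation in this `path:line` shape is easy to check mechanically. The sketch below, with a hypothetical `extract_citations` helper and a deliberately simple regular expression, pulls file and line references out of a claim so that a claim with none can be rejected. It illustrates the discipline and is not the plugin's own check.

```python
import re

# Matches citations such as "src/auth/jwt.ts:42" (path, colon, line number).
CITATION = re.compile(r"(?P<path>[\w./-]+\.\w+):(?P<line>\d+)")


def extract_citations(claim: str) -> list[tuple[str, int]]:
    """Return every (path, line) pair cited in the claim."""
    return [(m["path"], int(m["line"])) for m in CITATION.finditer(claim)]


claim = "src/auth/jwt.ts:42 signToken() builds the HS256 payload used by /api/login"
print(extract_citations(claim))                        # [('src/auth/jwt.ts', 42)]
print(extract_citations("The auth code looks fine."))  # []
```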
Agent-Flow · Exploration · 01 · Foundations

How the Explorer works

Start with the graph. Grep last. Never guess.

  1. Graph orientation

    Shape before text.

  2. Local repository

    Where 90% lives.

  3. Web search

    Only for externals.

  4. Ask the user

    Last resort — once.

Why the next four agents need this

→ Planner

Plans against real files.

→ Executor

Edits with citations.

→ Reviewer

Same ground truth.

→ Verifier

Runs the right checks.

Agent-Flow · Agent 2/6 · 02 · Agents
Planner
Senku
Model
Opus
Mode
Plan-only
Owns skills
task-classification
prompt-refinement
team-decision

Writes the plan. Does not touch the code.

Tools
Read · Grep · Glob · TodoWrite (denied: Write, Edit, Bash)
Discipline: Every file path in the plan is verified to exist.

TodoWrite: "Add signToken() to src/auth/jwt.ts — acceptance: returns {token, expiresAt}"
Agent-Flow · Planning · 02 · Agents

How the Planner works

Turns exploration into a file-by-file plan.

  1. Verify the findings

    Trust but verify.

  2. Blast-radius check

    What else gets touched.

  3. Write the task list

    Atomic, ordered steps.

  4. Pin the contract

    Criteria. Risks. Done.

What lands in the plan

→ Files

Verified paths, in order.

→ Patterns

House style, cited.

→ Criteria

Testable done-conditions.

→ Risks

Known edges up front.

Agent-Flow · Agent 3/6 · 02 · Agents
Executor
Loid
Model
Sonnet
Mode
Write enabled
Consumes
verification-gates
exploration-strategy
agent-behavior-constraints

The only agent that edits files.

House rule: Show the output. No "looks good."

Tests: PASS (npm test — 15/15 passed)
Agent-Flow · Execution · 02 · Agents

How the Executor works

Follows the plan. Shows the output.

  1. Read before write

    Match local style.

  2. One step at a time

    No batched changes.

  3. Evidence per step

    Paste the output.

  4. Stop on red

    Never mask failure.

Non-negotiables

✗ No claims

"Complete" needs output.

✗ No skipping

No @ts-ignore.

✗ No scope creep

Plan only. Files follow-up.

→ Handoff

Diff + evidence downstream.

Agent-Flow · Agent 4/6 · 02 · Agents
Reviewer
Lawliet
Model
Sonnet
Mode
Analyze-only
Bash
Static analysis tools only
(tsc, mypy, eslint, ruff)

Reads fresh. Returns a verdict.

Verdicts
APPROVED NEEDS_CHANGES BLOCKED

NEEDS_CHANGES loops back to the Executor automatically — you do nothing.

Discipline: Run the analyzer — never approve from reading alone.

Verdict: APPROVED — tsc: 0 errors; eslint: 0 warnings
Agent-Flow · Review · 02 · Agents

How the Reviewer works

Static analysis. One verdict: APPROVED, NEEDS_CHANGES, or BLOCKED.

  1. Graph orientation

    Blast radius first.

  2. Run the analyzers

    tsc · mypy · eslint · ruff.

  3. Pattern check

    Match local idioms.

  4. Verdict

    Zero critical to approve.

What the Reviewer will not do

✗ Run tests

That's the Verifier's job.

✗ Modify code

Suggest only. Never edit.

✗ Run the app

Analyzers only.

↻ Loop

Failures route to Executor.

Agent-Flow · Agent 5/6 · 02 · Agents
Verifier
Alphonse
Model
Sonnet
Tools
Bash · Read · Grep
Owns skills
verification-gates

Runs the tests that prove it.

Discipline: Show the exact output — no summaries, ever.
$ pytest
========== 4 passed in 0.12s ==========

$ mypy .
Success: no issues found in 12 source files

Overall: VERIFIED
Agent-Flow · Verification · 02 · Agents

How the Verifier works

Runs the suite. Pastes the output.

  1. Detect the project

    Pick the right toolchain.

  2. Run the four gates

    Tests · types · lint · build.

  3. Report verbatim

    No summaries. Ever.

  4. Emit the verdict

    All green, or FAILED.

Why the Verifier is last

→ Authority

Final word on done.

→ Separation

Fresh run, not incremental.

→ Tools

Bash · Read · Grep only.

↻ Failure loop

Back to Executor with output.
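
The gate-running steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Verifier's implementation: `run_gate` and `verify` are hypothetical names, and the demo gates are stand-in commands so the example runs anywhere. A real project would substitute its own test, type, lint, and build commands.

```python
import subprocess
import sys


def run_gate(command: list[str]) -> tuple[bool, str]:
    """Run one gate and return (passed, verbatim output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    report = f"$ {' '.join(command)}\n{result.stdout}{result.stderr}"
    return result.returncode == 0, report


def verify(gates: dict[str, list[str]]) -> str:
    """Run every gate, print its exact output, and emit one verdict."""
    all_passed = True
    for command in gates.values():
        passed, report = run_gate(command)
        print(report)  # exact output, never a summary
        all_passed = all_passed and passed
    return "VERIFIED" if all_passed else "FAILED"


# Stand-in gates; both simply print and exit 0.
demo_gates = {
    "tests": [sys.executable, "-c", "print('4 passed')"],
    "types": [sys.executable, "-c", "print('no issues found')"],
}
print("Overall:", verify(demo_gates))
```

Every gate runs even after one fails, so the report always contains the full output rather than stopping at the first red.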

Agent-Flow · Agent 6/6 · 02 · Agents
Authoring
Speedwagon
Model
Sonnet
Mode
Scoped-write
Consumes skills
agent-behavior-constraints
exploration-strategy
explainer-design-system

Authors the brief and HTML fragment. Nothing else.

Tools
Read Grep Glob Write† Edit† Bash‡
† Write/Edit scoped to explain-out/ and .claude/explain-briefs/.

‡ Bash scoped to bash scripts/compile-explain.sh only.
Agent-Flow · Authoring · 02 · Agents

How the Authoring agent works

Reads the scope. Renders the explainer.

  1. Verify every ref

    Read each file:line before embedding.

  2. Equip design skill

    Load explainer-design-system before any HTML.

  3. Write the brief

    Teaching arc, refs, metaphor → .md file.

  4. Render + assemble

    HTML fragment → assembler → explain-out/index.html.

What Speedwagon will not do

✗ Modify src/

Application code is off-limits.

✗ Call agents

Orchestrator handles routing.

✗ Run tests

Lint guardrail only.

→ Output

Brief + fragment → explain-out/index.html.


Principle 03

Opus plans. Sonnet executes. Nobody gets both.

Opus — reason

Unfamiliar codebases. Ambiguous requirements. Strategic sequencing.

  • Explorer (Riko) — exploration
  • Planner (Senku) — planning
Sonnet — execute

Clear plans. Pass-fail criteria. Mechanical verification.

  • Executor (Loid) — implementation
  • Reviewer (Lawliet) — review
  • Verifier (Alphonse) — verification
  • Authoring (Speedwagon) — explainer HTML
Agent-Flow · 03 · Installation

Getting started

Installation takes three commands.

# Method 1 — Claude Code marketplace (recommended)
claude> /plugin marketplace add josix/agent-flow
claude> /plugin install agent-flow@josix-plugins
# Method 2 — git clone
$ git clone https://github.com/josix/agent-flow.git
$ cd agent-flow && claude --plugin-dir .

Prerequisites: Claude Code CLI · Bash 4+ · jq · git


Before your first run

Validate the install with a single script.

  • plugin.json + hooks.json parse as valid JSON
  • Every agent, skill, and command frontmatter parses
  • Hook scripts exist, are executable, valid syntax
  • Edge cases: missing jq, malformed input, env-var propagation
$ bash scripts/validate-plugin.sh

Running 10 structural checks…
  ✓ plugin.json valid
  ✓ hooks/hooks.json valid
  ✓ 6 agents parsed
  ✓ 9 skills parsed
  ✓ 12 hook scripts executable
  

✓ All tests passed
Agent-Flow · 04 · Commands

The main command

/orchestrate runs the full pipeline end to end.

claude> /orchestrate Add user authentication with JWT tokens

You give it a task in plain English. It delegates to Explorer → Planner → Executor → Reviewer → Verifier in sequence. Each agent's output becomes the next agent's context. No babysitting.
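
One way to picture "each agent's output becomes the next agent's context" is a fold over a list of stages. The Python sketch below is an illustration only: the stage functions are stand-ins that return canned values, and the dictionary keys are hypothetical, not the plugin's real handoff format.

```python
from functools import reduce


def explorer(ctx: dict) -> dict:
    return {**ctx, "findings": "no existing auth module"}


def planner(ctx: dict) -> dict:
    return {**ctx, "plan": ["add signToken()", "add tests"]}


def executor(ctx: dict) -> dict:
    return {**ctx, "diff": "+ export function signToken() { ... }"}


def reviewer(ctx: dict) -> dict:
    return {**ctx, "review": "APPROVED"}


def verifier(ctx: dict) -> dict:
    return {**ctx, "verdict": "VERIFIED"}


PIPELINE = [explorer, planner, executor, reviewer, verifier]


def orchestrate(task: str) -> dict:
    """Thread one growing context through every stage, in order."""
    return reduce(lambda ctx, stage: stage(ctx), PIPELINE, {"task": task})


result = orchestrate("Add user authentication with JWT tokens")
print(list(result))  # ['task', 'findings', 'plan', 'diff', 'review', 'verdict']
```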


/orchestrate — five phases

Watch the pipeline, agent by agent.

PHASE 01 · Explorer (Riko) · Opus
Explore the codebase. Gather files, patterns, constraints.

PHASE 02 · Planner (Senku) · Opus
Plan the change. List files, order, risks, acceptance criteria.

PHASE 03 · Executor (Loid) · Sonnet
Implement the plan. Write and edit files. Run sanity checks.

PHASE 04 · Reviewer (Lawliet) · Sonnet
Review with static analysis. Emit APPROVED or NEEDS_CHANGES.

PHASE 05 · Verifier (Alphonse) · Sonnet
Run tests, types, lint, build. Report exact output.

The orchestrator — the main Claude instance — never writes code. It coordinates specialists and collects evidence.


/team-orchestrate

Same pipeline — but review and verify run in parallel.

01 · Explorer (Riko) · Opus
Explore.

02 · Planner (Senku) · Opus
Plan.

03 · Executor (Loid) · Sonnet
Implement.

04a · parallel · Reviewer (Lawliet) · Sonnet
Review.

04b · parallel · Verifier (Alphonse) · Sonnet
Verify.

05 · Merge
Combine results. Iterate if gates fail.

30–40% wall-clock reduction

The Reviewer and Verifier don't depend on each other's output — so they don't need to wait. Pass --force-sequential if you ever need the old behavior.
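
Because neither check consumes the other's output, the two can be submitted together and merged afterwards. A minimal Python sketch of that shape follows, assuming stand-in `review` and `verify` functions. How the plugin actually schedules teammates is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def review(diff: str) -> str:
    # Stand-in for the Reviewer; a real run would invoke static analyzers.
    return "APPROVED"


def verify(diff: str) -> str:
    # Stand-in for the Verifier; a real run would execute the test suite.
    return "VERIFIED"


def review_and_verify(diff: str) -> dict[str, str]:
    """Run both checks concurrently, then merge their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        review_future = pool.submit(review, diff)
        verify_future = pool.submit(verify, diff)
        return {"review": review_future.result(), "verify": verify_future.result()}


results = review_and_verify("example diff")
gates_pass = results == {"review": "APPROVED", "verify": "VERIFIED"}
print(results, gates_pass)
```

The merge step only has to wait for the slower of the two, which is where the wall-clock saving comes from.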


/deep-dive

Cache codebase context once. Reuse it for the rest of the session.

  1. Fire 5+ parallel Rikos against structure, conventions, anti-patterns, build, architecture, testing.
  2. The Planner synthesizes findings into a unified context file.
  3. Output lands at .claude/deep-dive.local.md — ephemeral, session-scoped.
claude> /deep-dive
# whole codebase, ~3–5 min

claude> /deep-dive --focus=src/auth
# narrow scope

claude> /deep-dive --refresh
# codebase changed, update context

Compounding returns

Reuse one deep-dive across many orchestrations.

# 1. Pay the exploration cost once
claude> /deep-dive

# 2. Every subsequent task reuses that context
claude> /orchestrate --use-deep-dive Add user authentication
claude> /orchestrate --use-deep-dive Add password reset
claude> /orchestrate --use-deep-dive Add email verification

# 3. When the codebase drifts, refresh
claude> /deep-dive --refresh

The pattern: front-load exploration when you start a session, then run as many tasks as you want without repaying the cost.


/agent-flow:explain

Generate an interactive explainer for any topic.

  1. Riko gathers scope — file:line refs, graph nodes, key terminology.
  2. Senku designs a 3–5 screen teaching arc with a metaphor and translator pick.
  3. Speedwagon authors the module brief and HTML fragment.
  4. Assembler + lint produce explain-out/index.html.
claude> /agent-flow:explain how does orchestration work

claude> /agent-flow:explain --revise <slug>

Requires /deep-dive first. Graph optional.

Agent-Flow · 05 · Enforcement
SECTION 03 · HOOKS

Hooks enforce the gates
you can't skip.

Seven lifecycle events — independent of any agent — make the system's promises structural instead of aspirational.


Seven lifecycle events

Every hook runs without asking the agent for permission.

UserPromptSubmit   Clarifies vague prompts before they run.       Clarity
PreToolUse         Blocks unsafe writes before they run.          Safety
PostToolUse        Per-agent verification reminders.              Discipline
SessionStart       Detects project type and toolchain.            Context
Stop               Blocks completion until every check passes.    The gate
TeammateIdle       Quality checks for parallel teammates.         Parallel
TaskCompleted      Evidence check before marking done.            Receipts

The gate that matters

The Stop hook blocks completion until every check passes.

Node.js projects
  • Tests: npm test (when package.json defines a test script)
  • Types: npx tsc --noEmit
Python projects
  • Tests: uv run pytest (falls back to pytest)
  • Types: mypy . (requires mypy.ini)
Escape hatches
  • .claude/skip-test-verification — bypass, with reason
  • .claude/known-test-failures — tolerated, non-blocking
  • .claude/test-command — custom command
Only new failures block.
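
"Only new failures block" reduces to a set difference between what failed now and what is already tolerated. The Python sketch below assumes test identifiers are compared as plain strings. The on-disk format of .claude/known-test-failures is not specified here, so the sample values and the `blocking_failures` helper are hypothetical.

```python
def blocking_failures(failed: set[str], known: set[str]) -> set[str]:
    """Failures not listed as known are new, and only those block."""
    return failed - known


# Hypothetical contents of the known-failures list.
known = {"tests/test_legacy.py::test_flaky"}
failed = {"tests/test_legacy.py::test_flaky", "tests/test_auth.py::test_login"}

new = blocking_failures(failed, known)
print(sorted(new))                  # ['tests/test_auth.py::test_login']
print("BLOCK" if new else "ALLOW")  # BLOCK
```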
Agent-Flow · 06 · Knowledge

Skills

Skills are the playbooks agents follow.

Each skill has an owner agent that embodies it and consumer agents that reference it. Update one skill → every consumer inherits the change.

Skill                    Owner     Consumers                Heuristic
exploration-strategy     Explorer  Planner · Executor       Parallel + targeted. Converge fast.
task-classification      Planner   Explorer · Orchestrator  When unsure, classify higher.
prompt-refinement        Planner   Orchestrator             One question at a time.
verification-gates       Verifier  Executor · Reviewer      Fail fast. Any failed gate halts work.
team-decision            Planner   Orchestrator             When in doubt, go sequential.
graphify-usage           Explorer  Planner · Reviewer       Traversal? Graph. Substring? Grep.
personal-kb-usage        Explorer  Planner · Reviewer       Cross-session memory, not this-session structure.
explainer-design-system  Vendored  Speedwagon               12 primitives, 8 lint rules, design tokens for explainer HTML.

Optional MCP server

Graphify answers the structural questions grep can't.

  • graph_stats — shape of the codebase at a glance
  • god_nodes — top-N fan-in anchors
  • get_neighbors — blast radius of a change
  • shortest_path — how does A reach B?
  • get_community — what cluster is this in?
Explorer's rule: graph_stats + god_nodes before grep.

Orient, then investigate.

Install: pipx install 'graphifyy[mcp]' · Queryable by: Explorer, Planner, Reviewer
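
To make "blast radius" concrete, here is a toy breadth-first search over a hand-written dependency graph. It is plain Python and does not call Graphify: the `GRAPH` data and the `blast_radius` function are hypothetical, and the real get_neighbors tool may take different parameters and return a different shape.

```python
from collections import deque

# Toy dependency graph: an edge A -> B means "A is imported by B".
GRAPH = {
    "jwt.ts": ["login.ts", "refresh.ts"],
    "login.ts": ["routes.ts"],
    "refresh.ts": ["routes.ts"],
    "routes.ts": [],
}


def blast_radius(start: str, depth: int) -> set[str]:
    """Breadth-first search: nodes reachable from start within depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return seen - {start}


print(sorted(blast_radius("jwt.ts", 1)))  # ['login.ts', 'refresh.ts']
print(sorted(blast_radius("jwt.ts", 2)))  # ['login.ts', 'refresh.ts', 'routes.ts']
```

A substring search would find the text "jwt" but not that routes.ts is two hops downstream, which is the kind of question the graph answers.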


Optional MCP server

Personal-KB remembers across projects.

A cross-project memory store. Patterns you established elsewhere, decisions you've made, things future sessions should inherit — they live here and the read-path agents can query them.

Readers
Explorer
Planner
Reviewer

Turns "every session starts fresh" into "every session inherits."

Agent-Flow · 07 · Observability

/agent-flow:analyze

See what your agents actually did.

  • Tool usage by agent — which tools each specialist calls, how often
  • MCP / skill invocation counts
  • Iteration & rejection rates
  • Token spend, thinking effort, model mismatches
  • Heuristic flags — orchestrator doing persona-owned work, MCP-skipping, allowlist violations
$ bash scripts/analyze.sh load --all-sessions
$ bash scripts/analyze.sh report

→ .claude/observability/events.db
→ .claude/observability/report.md

Local SQLite. No network. No deps beyond Python stdlib.
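
Since the store is local SQLite, a report such as "tool usage by agent" is a single GROUP BY. The sketch below uses an in-memory database with a hypothetical two-column `events` table. The real events.db schema is not documented here and may differ.

```python
import sqlite3

# Hypothetical schema; the real events.db layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (agent TEXT, tool TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("explorer", "Grep"), ("explorer", "Read"), ("explorer", "Grep"),
     ("executor", "Edit")],
)


def tool_usage(db: sqlite3.Connection) -> list[tuple[str, str, int]]:
    """Count tool calls per agent, most frequent first."""
    return db.execute(
        "SELECT agent, tool, COUNT(*) AS calls FROM events "
        "GROUP BY agent, tool ORDER BY calls DESC, agent, tool"
    ).fetchall()


for row in tool_usage(conn):
    print(row)
```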

Agent-Flow · 08 · In practice
SECTION 04 · WORKED EXAMPLE

Empty directory
to verified code
in five minutes.

The first-orchestration walkthrough. Simplest possible task that exercises all five agents end to end.


Setup

Fresh directory. One prompt.

$ mkdir fib-demo && cd fib-demo
$ claude
claude> /orchestrate Create a Python fibonacci.py module with an iterative fib(n)
function and a pytest file covering fib(0), fib(1), fib(10),
and a negative-input ValueError case.

You watch. You don't intervene.


What you see

Five agents run. Two files land.

Explorer (Riko)      EXPLORES    Empty directory. Nothing to preserve.
Planner (Senku)      PLANS       Four-item plan: fib(n), guard, tests, cases.
Executor (Loid)      IMPLEMENTS  Writes the two files. Sanity imports.
Reviewer (Lawliet)   REVIEWS     Static checks clean. Returns APPROVED.
Verifier (Alphonse)  VERIFIES    pytest: 4 passed. Zero errors.
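
For orientation, a module that satisfies the prompt could look like the sketch below. It shows one plausible fibonacci.py, assuming the common convention fib(0) = 0 and fib(1) = 1. What the Executor writes in a given run may differ.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number iteratively (fib(0) == 0)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b  # advance the pair one step
    return a


print(fib(0), fib(1), fib(10))  # 0 1 55
```

The four pytest cases in the prompt map onto fib(0), fib(1), fib(10), and the ValueError branch.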

The completion promise

One literal marker means the work is done.

<orchestration-complete>TASK VERIFIED</orchestration-complete>

Prints only when every gate is green. No marker — not done.
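
Because the marker is a literal string, checking for it is an exact substring test. The helper below is a hypothetical illustration of how a wrapper script might decide whether a run finished. It is not part of the plugin.

```python
MARKER = "<orchestration-complete>TASK VERIFIED</orchestration-complete>"


def is_done(transcript: str) -> bool:
    """The task counts as done only if the literal marker appears."""
    return MARKER in transcript


print(is_done("All tests pass, looks good."))  # False
print(is_done(f"pytest: 4 passed\n{MARKER}"))  # True
```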


When verification fails

The loop self-corrects. You do nothing.

  1. Gate fails. NEEDS_CHANGES or test error.
  2. Executor re-runs with the exact output.
  3. Resolved in one or two iterations.
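
The loop above can be sketched as a retry that feeds the failing output back to the executor. In this Python illustration `run_until_green`, its callbacks, and the `max_iterations` cap are all assumptions. No iteration limit is stated for the plugin itself.

```python
def run_until_green(execute, check, max_iterations: int = 3) -> bool:
    """Re-run the executor with the failing output until the gate passes."""
    feedback = ""
    for _ in range(max_iterations):
        artifact = execute(feedback)
        passed, output = check(artifact)
        if passed:
            return True
        feedback = output  # hand the exact output back, unedited
    return False


# Stand-ins: the first attempt is buggy, the second uses the feedback.
def execute(feedback: str) -> str:
    return "fixed" if feedback else "buggy"


def check(artifact: str) -> tuple[bool, str]:
    return artifact == "fixed", "AssertionError: fib(10) returned 89"


print(run_until_green(execute, check))  # True
```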
Agent-Flow · 09 · Practice

Five habits

Use the system the way it expects to be used.

  1. Be specific. Narrow prompts win.
  2. Deep-dive once. Then --use-deep-dive everywhere.
  3. Trust VERIFIED. It means command output.
  4. Answer clarifications. Refinement isn't noise.
  5. Read the analyze report. See the truth.
Agent-Flow · End · Q&A

Where to go next

Three docs, one demo, your first task.

DOC 01

docs/getting-started/
first-orchestration.md

The Fibonacci walkthrough. 2–5 minutes. Touches every agent.

DOC 02

docs/architecture/
overview.md

Full data-flow diagrams. Model selection. Verification architecture.

DOC 03

docs/concepts/
subagents-lie.md

Why verification is the whole point — and how the gates enforce it.

Questions? github.com/josix/agent-flow