subagent wokflows

2026-04-05 09:34:38 +02:00
parent 5c8d29a033
commit 90e62b1a51
23 changed files with 654 additions and 213 deletions
--- a/pi/.pi/agent/skills/subagent-implement-critical/SKILL.md
+++ b/pi/.pi/agent/skills/subagent-implement-critical/SKILL.md
@@ -0,0 +1,109 @@
+---
+name: subagent-implement-critical
+description: "Maximum-quality implementation pipeline: deep-scout → plan → review → Claude coder → build verify → review → fix. Use when the user invokes /implement-critical."
+---
+
+# /implement-critical — Maximum Quality Pipeline
+
+Same structure as `/implement` but with deeper scouting and a stronger coder.
+
+## Agent Roster
+
+| Agent | Role |
+|-------|------|
+| deep-scout | Thorough architectural exploration |
+| planner | Detailed implementation plans |
+| plan-reviewer | Reviews plans for correctness and risk |
+| coder-claude | Complex/unfamiliar implementation (elite coder) |
+| reviewer | Cross-family code review |
+| fixer | Applies review feedback precisely |
+
+## Phase 1: Deep Scout → Plan → Review (chain)
+
+```js
+await subagent({ chain: [
+  { agent: "deep-scout",     task: "Explore the codebase for: {task}" },
+  { agent: "planner",        task: "Create a detailed implementation plan for: {task}" },
+  { agent: "plan-reviewer",  task: "Review the plan for correctness, completeness, and risk." }
+]})
+```
+
+The chain creates `scout.md`, `plan.md`, `plan-review.md` in the chain artifact dir.
+
+Read `plan-review.md` from the artifact dir. If **NEEDS_REVISION** or **REJECTED**, loop: tell the planner what to fix, re-run the reviewer. If **APPROVED**, proceed to Phase 2.
+
+## Phase 2: Implement
+
+```js
+await subagent({ agent: "coder-claude", task: `Read and implement the plan at <chain_dir>/plan.md` })
+```
+
+Use the chain dir path from Phase 1's result.
+
+## Phase 3: Build Verification
+
+After the coder finishes, independently verify the build — don't trust the coder's report:
+
+```bash
+# Run a full build, not just type-check. Shaders, linkers, and bundlers may fail.
+# Adapt to project: cargo build, npm run build, etc.
+```
+
+## Phase 4: Review → Fix
+
+```js
+await subagent({ agent: "reviewer", output: `${chainDir}/review.md`, task: "Review all changes made" })
+```
+
+Read `review.md`. If issues found:
+
+```js
+await subagent({ agent: "fixer", task: `Read and apply the review feedback in ${chainDir}/review.md` })
+```
+
+## Phase 5: Workflow Summary
+
+After all phases complete, give the user a brief honest summary:
+
+- **What happened**: which phases ran, any plan review loops, whether fixes were needed
+- **Issues**: any agent silent failures, fallbacks used, build errors, or unexpected behavior
+- **Agent quality**: did any agent misinterpret the task, produce poor output, or need hand-holding? Name the agent and the problem
+- **Skill improvements**: did this workflow reveal gaps in the skill instructions or agent prompts? Note what should change
+
+Be concise — a few lines is enough when things went well. Only expand on problems.
+
+## Chain Mechanics
+
+Chain mode (`subagent({ chain: [...] })`) runs agents sequentially in a shared temp directory (`{chain_dir}`). Each step:
+1. The framework injects `[Read from:]` and `[Write to:]` directives from the agent's `defaultReads` and `output` frontmatter
+2. The agent reads upstream files, does its work, and writes its deliverable to the `[Write to:]` path using the `write` tool
+3. The agent returns a brief text summary; `{previous}` carries this summary to the next step
+4. Variable substitution: `{task}` = original task, `{previous}` = prior step's brief ack, `{chain_dir}` = artifact dir path
+
+Key behaviors:
+- Data flows through FILES (`scout.md` → `plan.md` → `plan-review.md`), not through `{previous}`
+- `{previous}` contains only a brief summary from the prior step — do NOT rely on it for full context
+- The framework validates that the expected output file was created
+- The chain result includes `📁 Artifacts: /tmp/pi-chain-runs/<id>/` — use this path to read files for branching decisions
+
+## Fallback Strategy
+
+When a subagent call returns no output (silent failure), apply cross-family model fallback. **Do not fall back to doing the work yourself** — always retry with the fallback model first.
+
+1. **First attempt**: Use the agent's default model
+2. **If silent failure or error**: Retry with the fallback model using `model` override
+3. **If the fallback also fails**: Report the double-failure to the user. Still do not do the work yourself.
+
+```js
+// Example: deep-scout fails silently, retry with fallback
+subagent({ agent: "deep-scout", task: "...", model: "anthropic/claude-sonnet-4-6" })
+```
+
+| Agent | Primary | Fallback |
+|-------|---------|----------|
+| deep-scout | zai/glm-4.7 | anthropic/claude-sonnet-4-6 |
+| planner | zai/glm-5.1 | anthropic/claude-opus-4-6 |
+| plan-reviewer | anthropic/claude-opus-4-6 | zai/glm-5.1 |
+| coder-claude | anthropic/claude-sonnet-4-6 | zai/glm-5.1 |
+| reviewer | anthropic/claude-opus-4-6 | zai/glm-5.1 |
+| fixer | anthropic/claude-sonnet-4-6 | zai/glm-5.1 |
--- a/pi/.pi/agent/skills/subagent-implement/SKILL.md
+++ b/pi/.pi/agent/skills/subagent-implement/SKILL.md
@@ -0,0 +1,149 @@
+---
+name: subagent-implement
+description: "Full implementation pipeline: scout → plan → review → code → build verify → review → fix. Use when the user invokes /implement."
+---
+
+# /implement — Standard Implementation Pipeline
+
+## Agent Roster
+
+| Agent | Role |
+|-------|------|
+| scout | Fast codebase recon |
+| planner | Detailed implementation plans |
+| plan-reviewer | Reviews plans for correctness and risk |
+| coder | Primary implementation |
+| coder-parallel | Parallel implementation tasks (fan-out) |
+| reviewer | Cross-family code review |
+| fixer | Applies review feedback precisely |
+
+## Phase 1: Scout → Plan → Review (chain)
+
+```js
+await subagent({ chain: [
+  { agent: "scout",         task: "Explore the codebase for: {task}" },
+  { agent: "planner",       task: "Create a detailed implementation plan for: {task}" },
+  { agent: "plan-reviewer", task: "Review the plan for correctness, completeness, and risk." }
+]})
+```
+
+The chain creates `scout.md`, `plan.md`, `plan-review.md` in the chain artifact dir.
+
+Read `plan-review.md` from the artifact dir. If **NEEDS_REVISION** or **REJECTED**, loop: tell the planner what to fix, re-run the reviewer. If **APPROVED**, proceed to Phase 2.
+
+## Phase 2: Implement
+
+```js
+// Single coder:
+await subagent({ agent: "coder", task: `Read and implement the plan at <chain_dir>/plan.md` })
+
+// Or parallel coders (if plan has independent sections):
+await subagent({ tasks: [
+  { agent: "coder-parallel", task: `Implement Part A from <chain_dir>/plan.md` },
+  { agent: "coder-parallel", task: `Implement Part B from <chain_dir>/plan.md` }
+]})
+```
+
+Use the chain dir path from Phase 1's result.
+
+## Phase 3: Build Verification
+
+After the coder finishes, independently verify the build — don't trust the coder's report:
+
+```bash
+# Run a full build, not just type-check. Shaders, linkers, and bundlers may fail.
+# Adapt to project: cargo build, npm run build, etc.
+```
+
+## Phase 4: Review → Fix
+
+```js
+await subagent({ agent: "reviewer", output: `${chainDir}/review.md`, task: "Review all changes made" })
+```
+
+Read `review.md`. If issues found:
+
+```js
+await subagent({ agent: "fixer", task: `Read and apply the review feedback in ${chainDir}/review.md` })
+```
+
+## Phase 5: Next Steps & Workflow Summary
+
+After all phases complete, give the user a brief summary of what is next to take use of the implementation and an honoest evaluation of the workflow:
+
+- **What needs wiring and where**: Point to where the user can take the implementation into use
+- **What happened**: which phases ran, any plan review loops, whether fixes were needed
+- **Issues**: any agent silent failures, fallbacks used, build errors, or unexpected behavior
+- **Agent quality**: did any agent misinterpret the task, produce poor output, or need hand-holding? Name the agent and the problem.
+- **Skill improvements**: did this workflow reveal gaps in the skill instructions or agent prompts? Note what should change
+
+Be concise — a few lines is enough when things went well. Only expand on problems.
+
+## Parallel Coders Within a Chain
+
+Fan-out/fan-in inside a chain (useful when plan sections are independent):
+
+```js
+await subagent({ chain: [
+  { agent: "scout", task: "Explore the codebase for: {task}" },
+  { agent: "planner", task: "..." },
+  { parallel: [
+    { agent: "coder-parallel", task: "Read the plan at {chain_dir}/plan.md. Implement Part A" },
+    { agent: "coder-parallel", task: "Read the plan at {chain_dir}/plan.md. Implement Part B" }
+  ]},
+  { agent: "reviewer", task: "Review all changes from {previous}" }
+]})
+```
+
+## Step Overrides
+
+Override agent defaults per step:
+
+```js
+await subagent({ chain: [
+  { agent: "scout", output: "context.md" },
+  { agent: "planner", reads: ["context.md"], output: "plan.md" },
+  { agent: "plan-reviewer", reads: ["plan.md"] }
+]})
+```
+
+## Chain Mechanics
+
+Chain mode (`subagent({ chain: [...] })`) runs agents sequentially in a shared temp directory (`{chain_dir}`). Each step:
+1. The framework injects `[Read from:]` and `[Write to:]` directives from the agent's `defaultReads` and `output` frontmatter
+2. The agent reads upstream files, does its work, and writes its deliverable to the `[Write to:]` path using the `write` tool
+3. The agent returns a brief text summary; `{previous}` carries this summary to the next step
+4. Variable substitution: `{task}` = original task, `{previous}` = prior step's brief ack, `{chain_dir}` = artifact dir path
+
+Key behaviors:
+- Data flows through FILES (`scout.md` → `plan.md` → `plan-review.md`), not through `{previous}`
+- `{previous}` contains only a brief summary from the prior step — do NOT rely on it for full context
+- The framework validates that the expected output file was created
+- The chain result includes `📁 Artifacts: /tmp/pi-chain-runs/<id>/` — use this path to read files for branching decisions
+
+Reviewer and fixer run as single agents (not in chains). Pass file paths in the task string and use the `output` parameter to capture their output.
+
+## Fallback Strategy
+
+When a subagent call returns no output (silent failure), apply cross-family model fallback. **Do not fall back to doing the work yourself** — always retry with the fallback model first.
+
+1. **First attempt**: Use the agent's default model
+2. **If silent failure or error**: Retry with the fallback model using `model` override
+3. **If the fallback also fails**: Report the double-failure to the user. Still do not do the work yourself.
+
+```js
+// Example: scout fails silently, retry with fallback
+subagent({ agent: "scout", task: "...", model: "anthropic/claude-haiku-4-5" })
+```
+
+| Agent | Primary | Fallback |
+|-------|---------|----------|
+| scout | zai/glm-4.7-flash | anthropic/claude-haiku-4-5 |
+| planner, coder | zai/glm-5.1 | anthropic/claude-opus-4-6 |
+| coder-parallel | zai/glm-4.7 | anthropic/claude-sonnet-4-6 |
+| plan-reviewer, reviewer | anthropic/claude-opus-4-6 | zai/glm-5.1 |
+| fixer | anthropic/claude-sonnet-4-6 | zai/glm-5.1 |
+
+## Adaptive Routing
+
+Do not run the full pipeline for a one-line fix. Use your judgment — a single coder call may suffice.
--- a/pi/.pi/agent/skills/subagent-plan/SKILL.md
+++ b/pi/.pi/agent/skills/subagent-plan/SKILL.md
@@ -0,0 +1,76 @@
+---
+name: subagent-plan
+description: "Plan-only pipeline: scout → plan → review. Produces a reviewed implementation plan without coding. Use when the user invokes /plan."
+---
+
+# /plan — Plan Only (No Implementation)
+
+## Agent Roster
+
+| Agent | Role |
+|-------|------|
+| scout | Fast codebase recon |
+| planner | Detailed implementation plans |
+| plan-reviewer | Reviews plans for correctness and risk |
+
+## Workflow (single chain)
+
+```js
+await subagent({ chain: [
+  { agent: "scout",         task: "Explore the codebase for: {task}" },
+  { agent: "planner",       task: "Create a detailed implementation plan for: {task}" },
+  { agent: "plan-reviewer", task: "Review the plan for correctness, completeness, and risk." }
+]})
+```
+
+The chain creates `scout.md`, `plan.md`, `plan-review.md` in the chain artifact dir (`/tmp/pi-chain-runs/<id>/`).
+
+## Handling the Verdict
+
+Read `plan-review.md` from the artifact dir. If **NEEDS_REVISION** or **REJECTED**, loop: tell the planner what to fix, re-run the reviewer. If **APPROVED**, present the plan path to the user.
+
+Show the user the path to `plan.md`.
+
+## Workflow Summary
+
+After the chain completes, give the user a brief honest summary:
+
+- **What happened**: did the plan pass review on the first try, or did it need revision loops?
+- **Issues**: any agent silent failures, fallbacks used, or unexpected behavior
+- **Agent quality**: did any agent misinterpret the task, produce poor output, or need hand-holding? Name the agent and the problem
+- **Skill improvements**: did this workflow reveal gaps in the skill instructions or agent prompts? Note what should change
+
+Be concise — a few lines is enough when things went well. Only expand on problems.
+
+## Chain Mechanics
+
+Chain mode (`subagent({ chain: [...] })`) runs agents sequentially in a shared temp directory (`{chain_dir}`). Each step:
+1. The framework injects `[Read from:]` and `[Write to:]` directives from the agent's `defaultReads` and `output` frontmatter
+2. The agent reads upstream files, does its work, and writes its deliverable to the `[Write to:]` path using the `write` tool
+3. The agent returns a brief text summary; `{previous}` carries this summary to the next step
+4. Variable substitution: `{task}` = original task, `{previous}` = prior step's brief ack, `{chain_dir}` = artifact dir path
+
+Key behaviors:
+- Data flows through FILES (`scout.md` → `plan.md` → `plan-review.md`), not through `{previous}`
+- `{previous}` contains only a brief summary from the prior step — do NOT rely on it for full context
+- The framework validates that the expected output file was created
+- The chain result includes `📁 Artifacts: /tmp/pi-chain-runs/<id>/` — use this path to read files for branching decisions
+
+## Fallback Strategy
+
+When a subagent call returns no output (silent failure), apply cross-family model fallback. **Do not fall back to doing the work yourself** — always retry with the fallback model first.
+
+1. **First attempt**: Use the agent's default model
+2. **If silent failure or error**: Retry with the fallback model using `model` override
+3. **If the fallback also fails**: Report the double-failure to the user. Still do not do the work yourself.
+
+```js
+// Example: scout fails silently, retry with fallback
+subagent({ agent: "scout", task: "...", model: "anthropic/claude-haiku-4-5" })
+```
+
+| Agent | Primary | Fallback |
+|-------|---------|----------|
+| scout | zai/glm-4.7-flash | anthropic/claude-haiku-4-5 |
+| planner | zai/glm-5.1 | anthropic/claude-opus-4-6 |
+| plan-reviewer | anthropic/claude-opus-4-6 | zai/glm-5.1 |
--- a/pi/.pi/agent/skills/subagent-review/SKILL.md
+++ b/pi/.pi/agent/skills/subagent-review/SKILL.md
@@ -0,0 +1,56 @@
+---
+name: subagent-review
+description: "Code review using a cross-family reviewer. Use when the user invokes /review."
+---
+
+# /review — Code Review
+
+## Agent Roster
+
+| Agent | Role |
+|-------|------|
+| reviewer | Cross-family code review (finds blind spots the author's model family misses) |
+| fixer | Applies review feedback precisely |
+
+## Workflow
+
+Single reviewer call — no chain needed:
+
+```js
+await subagent({ agent: "reviewer", output: `${chainDir}/review.md`, task: "Review the following changes: <description or files>" })
+```
+
+Read `review.md`. If issues found, apply fixes:
+
+```js
+await subagent({ agent: "fixer", task: `Read and apply the review feedback in ${chainDir}/review.md` })
+```
+
+## Workflow Summary
+
+After review (and optional fix), give the user a brief honest summary:
+
+- **What happened**: did the reviewer find issues, were fixes applied?
+- **Issues**: any agent silent failures, fallbacks used, or unexpected behavior
+- **Agent quality**: did any agent misinterpret the task, produce poor output, or need hand-holding? Name the agent and the problem
+- **Skill improvements**: did this workflow reveal gaps in the skill instructions or agent prompts? Note what should change
+
+Be concise — a few lines is enough when things went well. Only expand on problems.
+
+## Fallback Strategy
+
+When a subagent call returns no output (silent failure), apply cross-family model fallback. **Do not fall back to doing the work yourself** — always retry with the fallback model first.
+
+1. **First attempt**: Use the agent's default model
+2. **If silent failure or error**: Retry with the fallback model using `model` override
+3. **If the fallback also fails**: Report the double-failure to the user. Still do not do the work yourself.
+
+```js
+// Example: reviewer fails silently, retry with fallback
+subagent({ agent: "reviewer", task: "...", model: "zai/glm-5.1" })
+```
+
+| Agent | Primary | Fallback |
+|-------|---------|----------|
+| reviewer | anthropic/claude-opus-4-6 | zai/glm-5.1 |
+| fixer | anthropic/claude-sonnet-4-6 | zai/glm-5.1 |
--- a/pi/.pi/agent/skills/subagent-workflows/SKILL.md
+++ b/pi/.pi/agent/skills/subagent-workflows/SKILL.md
@@ -1,44 +0,0 @@
---
-name: subagent-workflows
-description: "Reference for the subagent catalog, workflows, and model assignments. Load when working with /implement, /plan, /review, /implement-critical, or when deciding which agents to use."
---
-
-# Subagent Catalog & Workflows
-
-## Agent Roster
-
-| Agent | Model | Role | Cost Profile |
-|-------|-------|------|-------------|
-| **router** | Qwen 122B-A10B | Evaluates task complexity, picks workflow | Cheap MoE |
-| **scout** | Qwen 122B-A10B | Fast codebase recon, structured findings | Cheap MoE |
-| **deep-scout** | Qwen 27B (dense) | Thorough architectural exploration | Moderate |
-| **planner** | Qwen Max (frontier) | Detailed implementation plans | Frontier, free via qwen.ai |
-| **plan-reviewer** | Claude Opus 4.6 | Reviews plans before implementation | $$$ but low volume |
-| **coder** | Qwen Max | Primary implementation | Frontier, free via qwen.ai |
-| **coder-parallel** | Qwen 397B-A17B | Parallel implementation tasks | Strong MoE, parallelizable |
-| **coder-claude** | Claude Sonnet 4.6 | Complex/unfamiliar implementation | $$ elite coder |
-| **reviewer** | Claude Opus 4.6 | Cross-family code review | $$$ but low volume |
-| **reviewer-quick** | Claude Haiku 4.5 | Fast cross-family sanity check | $ |
-| **fixer** | Claude Sonnet 4.6 | Applies review feedback precisely | $$ |
-
-## Workflows
-
-### `/implement <task>` — Adaptive (auto-selects pipeline)
-Router evaluates → SMALL/MEDIUM/LARGE/HUGE → runs appropriate chain.
-
-### `/plan <task>` — Plan only
-Scout → Planner (Max) → Plan Reviewer (Opus). No implementation.
-
-### `/review <files or description>` — Opus review
-Direct Opus review on changes.
-
-### `/implement-critical <task>` — Maximum quality
-Deep Scout → Planner → Opus Plan Review → Sonnet Coder → Opus Review → Fixer.
-
-## Design Principles
-
-1. **Qwen-first**: Max for planning and primary coding (free, frontier). MoE models for volume tasks.
-2. **Cross-family review**: Qwen codes → Claude reviews. Different training = different blind spots caught.
-3. **Opus for review, not implementation**: Low output tokens, maximum leverage. Finding bugs > writing code.
-4. **Sonnet 4.6 for precision edits**: Best instruction-following for applying review fixes.
-5. **Adaptive routing**: Don't run 5-agent pipeline for a one-line fix.