Official References: Sub-agents · Agent SDK · Claude Code CLI
Curriculum path
- CLAUDE.md Mastery — repo memory and rules
- Effective Prompting — task framing and constraints
- MCP Power Tools — connect tools and live context
- Multi-Agent Workflows — delegation and parallel execution ← You are here
- Hooks Automation — automate guardrails locally
- GitHub Actions Workflows — move repeatable work into team automation
Official docs used in this guide
- Subagents and delegation model — Sub-agents
- Programmatic orchestration concepts — Agent SDK
- CLI execution patterns — CLI usage
What Are Subagents?
Claude Code can spawn specialized subagents — separate Claude instances that work on focused tasks in parallel. Think of it as Claude delegating work to a team.
When to Use Subagents
| Scenario | Single Agent | Subagents |
|---|---|---|
| Fix a typo | ✅ | Overkill |
| Review 3 files | ✅ | ✅ (faster) |
| Refactor 20 files | Too slow | ✅ |
| Research + implement | ✅ | ✅ (parallel) |
| Cross-module debugging | Loses context | ✅ |
The Orchestrator Pattern
The main conversation acts as an orchestrator — it plans the work and delegates to specialists:
You: "Review the authentication module for security issues"
Claude (orchestrator):
→ Spawns Agent 1: Review auth/login.ts for injection risks
→ Spawns Agent 2: Review auth/session.ts for token handling
→ Spawns Agent 3: Review auth/middleware.ts for bypass risks
→ Aggregates findings into a single report
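The fan-out/aggregate flow above can be sketched in plain TypeScript. Here `runAgent` is a stand-in for whatever actually executes a subagent (an SDK call, a CLI invocation); the point is the shape: parallel specialists, one aggregation step.

```typescript
// Sketch of the orchestrator pattern: fan out focused review tasks,
// then aggregate the findings. `runAgent` is a hypothetical stand-in
// for real subagent execution.
type Finding = { file: string; issue: string };

async function runAgent(task: { file: string; focus: string }): Promise<Finding[]> {
  // Placeholder: a real implementation would call Claude here.
  return [{ file: task.file, issue: `checked for ${task.focus}` }];
}

async function orchestrate(): Promise<Finding[]> {
  const tasks = [
    { file: "auth/login.ts", focus: "injection risks" },
    { file: "auth/session.ts", focus: "token handling" },
    { file: "auth/middleware.ts", focus: "bypass risks" },
  ];
  // All three specialists run in parallel; the orchestrator only aggregates.
  const results = await Promise.all(tasks.map(runAgent));
  return results.flat();
}
```

The orchestrator itself never reviews code; it only plans, delegates, and merges.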
Practical Example: Large Refactoring
Migrate all API routes from Express middleware pattern to Next.js Route Handlers.
There are 15 routes in src/api/. Use parallel agents —
one per route group (auth, users, products, orders, analytics).
Don't modify the shared utility functions in src/api/utils/.
Claude will:
- Analyze the route groups
- Spawn 5 parallel agents
- Each agent handles 3 routes
- Results are aggregated and conflicts resolved
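The "analyze the route groups" step amounts to bucketing files by directory. A minimal sketch (the paths are hypothetical, matching the src/api/ layout described above):

```typescript
// Bucket route files by their top-level group directory, one agent per group.
function groupRoutes(paths: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const p of paths) {
    const group = p.split("/")[2]; // "src/api/auth/login.ts" -> "auth"
    groups.set(group, [...(groups.get(group) ?? []), p]);
  }
  return groups;
}

const routes = [
  "src/api/auth/login.ts", "src/api/auth/logout.ts", "src/api/auth/refresh.ts",
  "src/api/users/get.ts", "src/api/users/update.ts", "src/api/users/delete.ts",
];
// groupRoutes(routes) yields 2 groups of 3 files each -> spawn 2 agents.
```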
Agent Teams (Experimental)
Claude Code's Agent Teams feature enables peer-to-peer communication between agents. Instead of everything going through the orchestrator:
Reviewer Agent ←→ Implementer Agent
↕ ↕
Tester Agent ←→ Documentation Agent
Best for:
- Code review → fix → re-review cycles
- Complex debugging across modules
- Research requiring discussion and synthesis
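Conceptually, peer-to-peer communication replaces hub-and-spoke routing with per-agent inboxes. A toy model (entirely hypothetical plumbing, not the Agent Teams API):

```typescript
// Toy model of peer messaging: each agent gets an inbox, so the
// reviewer can hand findings directly to the implementer without
// routing everything through the orchestrator.
const inboxes = new Map<string, string[]>();

function send(to: string, msg: string): void {
  inboxes.set(to, [...(inboxes.get(to) ?? []), msg]);
}

send("implementer", "reviewer: null check missing in login.ts");
send("reviewer", "implementer: fixed, please re-review");
```

The review → fix → re-review cycle above is exactly this kind of back-and-forth, which is why it benefits from peer messaging.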
Background Agents
Running agents in the background lets you continue other work without waiting. This is especially useful for long-running tasks.
Run the large refactoring job in the background.
Notify me when it's done — I'll keep working on other things.
Good candidates for background execution:
- Large-scale refactoring — converting dozens of files
- Long test runs — running the full test suite
- Code review — analyzing all files in a large PR
- Dependency analysis — traversing the full project dependency tree
When the agent completes, you receive a notification with the results. You can pick up wherever you left off.
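The pattern maps directly onto a promise you start but don't await. A minimal sketch, with a timer standing in for the long-running job:

```typescript
// Background pattern: start the long job, attach a notification
// instead of awaiting, keep working. The timer is a stand-in for
// a real long-running agent task.
function runLongJob(): Promise<string> {
  return new Promise(resolve => setTimeout(() => resolve("refactor done"), 50));
}

const job = runLongJob();                              // kicked off, not awaited
job.then(result => console.log(`notify: ${result}`));  // completion notification
// ...continue other work here while the job runs...
```

You pick the result up later by awaiting `job` whenever you need it.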
Isolation with Git Worktrees
Having each agent work in an isolated git worktree lets you experiment without touching the main branch.
Have each agent work in its own worktree:
- Agent 1: Refactor the auth module in branch feature/auth-refactor
- Agent 2: Migrate APIs in branch feature/api-migration
- Agent 3: Improve test coverage in branch feature/test-coverage
Each should work independently and submit results as a separate branch.
Benefits of worktree isolation:
- Safe experimentation — if it fails, just delete the worktree
- Parallel branch work — develop multiple features simultaneously
- No interference — agents don't conflict with each other
- Easy rollback — discard a branch if the result isn't useful
Agent Specialization Patterns
Assigning agents specific expert roles produces much deeper analysis than a single generalist agent.
Analyze the entire project using three specialized agents:
Security Agent:
- Check against OWASP Top 10
- Detect SQL injection, XSS, and CSRF risks
- Review authentication and authorization logic
Performance Agent:
- Detect N+1 query patterns
- Identify potential memory leaks
- Flag unnecessary re-renders and expensive computations
Test Coverage Agent:
- Measure current code coverage
- Identify missing edge cases
- List modules lacking integration tests
Each agent focuses on its domain, producing results that a single agent spread across all three areas would miss.
Real-World Scenario: Frontend + Backend in Parallel
When adding a feature, you can build the frontend, backend, and database layer at the same time:
Add a "User Profile" feature. Have three agents work simultaneously:
Agent 1 (Backend): Implement GET/PUT /api/users/:id endpoints
Agent 2 (Frontend): Build ProfilePage and ProfileForm React components
Agent 3 (Database): Write migration to add bio and avatar_url columns to users table
Shared type definitions are in src/types/user.ts.
After all three finish, propose an integration plan.
With three agents working in parallel, total time is much closer to the time of the slowest agent than to the sum of all three.
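The speedup claim is just max-versus-sum arithmetic. A back-of-envelope check (the per-agent minutes are made up):

```typescript
// Wall-clock time in parallel is bounded by the slowest agent,
// not the sum of all agents. Figures are hypothetical.
const agentMinutes = { backend: 6, frontend: 4, database: 3 };
const times = Object.values(agentMinutes);

const sequential = times.reduce((a, b) => a + b, 0); // one agent at a time
const parallel = Math.max(...times);                 // bounded by the slowest
```

With these numbers, 13 minutes sequential collapses to 6 minutes in parallel, which is why the shared type definitions matter: they keep the three agents from blocking on each other.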
When Agents Aren't Worth It
More agents isn't always better. Keep these tradeoffs in mind:
- Simple tasks get slower — fixing a typo or editing one file is faster with a single agent. Agent creation has overhead.
- Context is separate — each agent has its own context window. Sharing state between agents requires files or a shared task list.
- Cost multiplies — agent count × API calls = increased spend. Use agents when the parallelism is genuinely worth it.
- 4+ concurrent agents can lose efficiency — Claude Code recommends up to 4 concurrent agents. Beyond that, coordination overhead grows.
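The "cost multiplies" point is worth making concrete. A rough model (all figures hypothetical): splitting work across N agents roughly multiplies spend even when each agent does less.

```typescript
// Rough cost model: total spend scales with agents x calls,
// even when each agent needs fewer turns. All numbers hypothetical.
function estimateCost(agents: number, callsPerAgent: number, costPerCall: number): number {
  return agents * callsPerAgent * costPerCall;
}

const single = estimateCost(1, 20, 0.02); // one agent, many turns
const five = estimateCost(5, 8, 0.02);    // parallel, but roughly double the spend
```

Here five agents finish faster but cost about twice as much as the single agent, so the parallelism has to buy you something real.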
Claude Code SDK for Programmatic Orchestration
Claude Code can be used as an npm package (`@anthropic-ai/claude-code`), letting you build custom orchestration workflows in TypeScript/JavaScript. The SDK exposes a `query` function that returns an async generator of messages rather than a single-call helper.
Basic usage:
```typescript
import { query } from "@anthropic-ai/claude-code";

// Simple prompt execution: `query` streams messages; the final
// "result" message carries the agent's output.
for await (const message of query({
  prompt: "Analyze src/auth/ for security vulnerabilities",
  options: {
    maxTurns: 10,
    allowedTools: ["Read", "Glob", "Grep"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```
Multi-agent orchestration example:
```typescript
import { query } from "@anthropic-ai/claude-code";

// Run one subagent to completion and collect its final output.
async function runReview(prompt: string): Promise<string> {
  let output = "";
  for await (const message of query({
    prompt,
    options: { allowedTools: ["Read", "Glob", "Grep"] },
  })) {
    if (message.type === "result" && message.subtype === "success") {
      output = message.result;
    }
  }
  return output;
}

// Parallel code review across modules
const modules = ["src/auth/", "src/api/", "src/database/"];
const reviews = await Promise.all(
  modules.map(mod =>
    runReview(`Review ${mod} for security issues. Focus on injection risks and auth bypasses.`)
  )
);

// Aggregate results
const summary = await runReview(
  `Synthesize these security reviews into a single report:\n${reviews.join("\n---\n")}`
);
```
Use cases: custom CI pipelines, automated code audits, batch processing across large codebases.
Headless Mode for Automation
`claude -p "prompt"` runs without an interactive conversation. Combine it with shell scripting for powerful automation:
```bash
#!/bin/bash
# Parallel security audit across services
services=("auth-service" "payment-service" "user-service")
for service in "${services[@]}"; do
  claude -p "Audit $service/ for OWASP Top 10 vulnerabilities. Output as markdown." \
    --model haiku \
    --output-format text \
    > "reports/${service}-audit.md" &
done
wait
echo "All audits complete"
```
Useful options:
- `--output-format json` for machine-parseable output
- stdin piping: `git diff HEAD~5 | claude -p "Summarize the changes in these commits"`
- Set the `ANTHROPIC_API_KEY` environment variable for CI/CD systems
Headless mode is especially useful for integrating Claude Code as an automation tool in GitHub Actions, Jenkins, and other CI systems.
Cost Optimization Strategies
Each subagent creates a new Claude context, which means additional API costs. Here are strategies to minimize spend:
| Strategy | How | Savings |
|---|---|---|
| Model mixing | haiku for search/read, sonnet for implementation | ~60% for read-heavy tasks |
| Task scoping | Narrow scope = fewer turns = less cost | ~40% per agent |
| Result caching | Store outputs, reuse across runs | Variable |
| Parallel batching | Group related files into fewer agents | ~30% vs per-file agents |
Anti-pattern: creating an agent per file for 50 files. Instead, create agents per module.
Sweet spot: 2-4 agents for most tasks, each handling a logical group of related files.
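The model-mixing row from the table can be expressed as a simple routing rule. A sketch, using the same CLI model aliases (`haiku`, `sonnet`) as the headless example above:

```typescript
// Model-mixing strategy: route cheap search/read work to the small
// model, reserve the larger model for implementation and review.
type Task = { kind: "search" | "read" | "implement" | "review" };

function pickModel(task: Task): "haiku" | "sonnet" {
  return task.kind === "search" || task.kind === "read" ? "haiku" : "sonnet";
}
```

Applied per-agent, this keeps read-heavy scouting agents on the cheap model while only the agents that write code pay for the larger one.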
Real-World Scenario: Full-Stack Feature Sprint
Walk through a realistic multi-agent scenario:
Task: Add "user notifications" feature to a full-stack app
Agent orchestration plan:
├── Agent 1 (Backend): Design notification schema, create API endpoints
│ Scope: src/api/notifications/, prisma/schema.prisma
│
├── Agent 2 (Frontend): Build NotificationBell component, notification list page
│ Scope: src/components/notifications/, src/pages/notifications/
│
├── Agent 3 (Infrastructure): Set up WebSocket server, Redis pub/sub
│ Scope: src/services/websocket/, docker-compose.yml
│
└── Agent 4 (Testing): Write integration tests for the notification flow
Scope: tests/notifications/
Dependencies: Waits for Agents 1-3 to complete
Execution: Agents 1-3 run in parallel → Agent 4 runs after
Total time: ~5 min (vs ~15 min sequential)
Key insight: define shared interfaces (types/notification.ts) before spawning agents. Each agent gets read access to shared types but only writes to its own scope.
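The execution plan above (Agents 1-3 in parallel, Agent 4 gated on all three) is a two-phase dependency graph. A sketch with `agent` as a stand-in for real subagent execution:

```typescript
// Sprint plan as code: phase 1 runs in parallel, the testing agent
// only starts once all of phase 1 resolves. `agent` is hypothetical.
const agent = (name: string): Promise<string> => Promise.resolve(`${name} done`);

async function runSprint(): Promise<string[]> {
  const phase1 = await Promise.all([agent("backend"), agent("frontend"), agent("infra")]);
  const tests = await agent("testing"); // gated on phase 1
  return [...phase1, tests];
}
```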
Codex Comparison
OpenAI Codex runs each task as a single sandboxed agent. There's no multi-agent orchestration — but the sandbox model gives full container isolation per task.
| Feature | Claude Code Subagents | Codex Tasks |
|---|---|---|
| Parallelism | Up to 4 concurrent | Concurrent tasks possible |
| Communication | Shared context, messaging | Independent, isolated |
| State sharing | Via files and task lists | Via committed code |
| Best for | Complex, interconnected work | Independent, well-defined tasks |