Why Multi-Agent Systems Need Architecture
A single Claude agent can handle many tasks. But some tasks exceed what a single agent context can handle:
- A codebase too large to fit in one context window
- A research task requiring simultaneous investigation of multiple domains
- A workflow requiring different tools at different stages
- Tasks where quality improves when one agent’s work is reviewed by another
Multi-agent systems exist to solve these problems. But adding agents without architecture creates new problems: race conditions, inconsistent state, invisible failures, and systems impossible to debug.
Hub-and-spoke is the architecture that solves this. It’s the standard pattern because it makes complex multi-agent systems understandable, debuggable, and reliable.
The Structure
┌─────────────────┐
│ COORDINATOR │
│ (Hub Agent) │
└────────┬────────┘
│
┌───────────────┼───────────────┐
│ │ │
┌────────▼────┐ ┌───────▼────┐ ┌──────▼─────┐
│ Subagent │ │ Subagent │ │ Subagent │
│ (Search) │ │ (Analysis) │ │ (Synthesis)│
└─────────────┘ └────────────┘ └────────────┘
The rules:
- Coordinator → Subagents: task assignment and context
- Subagents → Coordinator: results and errors
- Subagents ↔ Subagents: never
Context Isolation — The Most Misunderstood Part
When a coordinator spawns a subagent, the subagent starts with a blank context. It does not inherit:
- The coordinator’s conversation history
- Results from other subagents
- The user’s original query (unless you explicitly pass it)
- Any system prompt the coordinator has
This is both the most important feature and the most common source of confusion.
Wrong mental model: “The subagent knows what the coordinator knows.”
Right mental model: “The subagent is a new Claude instance. You brief it like you’d brief a new employee on their first day — tell them everything they need to know to do their specific task.”
# ❌ Wrong: assuming subagent has context
coordinator_prompt = """
You've been analyzing this codebase. Now spawn a subagent to write tests.
"""
# The subagent has never seen the codebase analysis
# ✅ Right: pass all necessary context explicitly
analysis_results = "..." # From earlier coordinator work
subagent_prompt = f"""
A codebase analysis has been completed. Here are the findings:
{analysis_results}
Based on these findings, write comprehensive tests for the functions identified
as highest risk. Focus on: [specific functions]. Use pytest format.
"""
The Task tool is how coordinators spawn subagents. Here’s what it looks like in practice:
# Coordinator's tools include the Task tool
tools = [
{
"name": "Task",
"description": "Spawn a subagent to handle a specific task",
"input_schema": {
"type": "object",
"properties": {
"description": {"type": "string", "description": "What the subagent should do"},
"prompt": {"type": "string", "description": "Full context and instructions for the subagent"},
"subagent_tools": {
"type": "array",
"description": "List of tool names this subagent is allowed to use"
}
},
"required": ["description", "prompt"]
}
},
# ... other coordinator-level tools
]
When Claude calls the Task tool, your orchestration layer:
- Extracts the prompt and allowed_tools
- Creates a new Claude API call with that prompt as the system or user message
- Runs the full agentic loop for the subagent
- Returns the subagent’s final response as the tool result
Parallel Execution — The Right Way
The key to parallel subagent execution is emitting multiple Task tool calls in a single coordinator response.
The coordinator does this in one response:
{
"stop_reason": "tool_use",
"content": [
{
"type": "tool_use",
"id": "task_001",
"name": "Task",
"input": {
"description": "Search for market data on AI adoption",
"prompt": "Search for recent statistics on enterprise AI adoption rates..."
}
},
{
"type": "tool_use",
"id": "task_002",
"name": "Task",
"input": {
"description": "Search for competitor analysis",
"prompt": "Find recent information about Claude competitors..."
}
},
{
"type": "tool_use",
"id": "task_003",
"name": "Task",
"input": {
"description": "Research regulatory landscape",
"prompt": "Find current AI regulation developments in the EU and US..."
}
}
]
}
Your orchestration layer sees three Task tool_use blocks and runs them in parallel:
import asyncio
async def run_coordinator(user_query):
coordinator_messages = [{"role": "user", "content": user_query}]
while True:
response = await call_claude_async(coordinator_messages, tools=coordinator_tools)
if response.stop_reason == "end_turn":
return extract_text(response)
task_tool_calls = [
b for b in response.content
if b.type == "tool_use" and b.name == "Task"
]
if task_tool_calls:
# Run ALL subagents in parallel
subagent_tasks = [
run_subagent(task.input["prompt"], task.input.get("subagent_tools", []))
for task in task_tool_calls
]
results = await asyncio.gather(*subagent_tasks, return_exceptions=True)
# Package results
tool_results = []
for task, result in zip(task_tool_calls, results):
tool_results.append({
"type": "tool_result",
"tool_use_id": task.id,
"content": str(result) if not isinstance(result, Exception)
else f"Subagent failed: {result}",
"is_error": isinstance(result, Exception)
})
coordinator_messages.append({
"role": "assistant",
"content": response.content
})
coordinator_messages.append({
"role": "user",
"content": tool_results
})
Error Recovery — Coordinator’s Responsibility
When a subagent fails, the coordinator decides what to do. The subagent’s job is to report the failure clearly — not to decide how to recover.
# Subagent reports failure clearly
def run_subagent(prompt, allowed_tools):
try:
result = run_agent_loop(prompt, allowed_tools)
return result
except Exception as e:
# Return structured error — coordinator will decide what to do
return {
"status": "error",
"error_type": type(e).__name__,
"message": str(e),
"partial_results": get_partial_results_if_any()
}
# Coordinator decides recovery strategy
# It might: retry, use alternative approach, or inform user
This is why routing everything through the coordinator is essential. If subagents recover autonomously, the coordinator has no visibility into what happened — it only sees the final (potentially incorrect) result.
Observability — The Hidden Benefit
Every message in hub-and-spoke passes through the coordinator. This means you can log, trace, and debug the entire execution by instrumenting one place:
def execute_with_logging(task_tool_call, task_id):
log.info(f"[{task_id}] Spawning subagent: {task_tool_call.input['description']}")
start = time.time()
result = run_subagent(task_tool_call.input["prompt"])
elapsed = time.time() - start
log.info(f"[{task_id}] Subagent completed in {elapsed:.2f}s: {result[:100]}...")
return result
In a peer-to-peer architecture (subagents communicating directly), you’d need to instrument every agent. In hub-and-spoke, instrument the coordinator and you see everything.
Key Takeaways
- Coordinator is the only communication hub — subagents never talk to each other directly
- Context isolation is total — subagents start blank, pass everything they need explicitly
- Parallel execution = multiple Task calls in one coordinator response, not multiple API calls
- Error recovery is the coordinator’s decision, not the subagent’s
- All communication through coordinator = full observability
- Subagent tool access should be scoped to only what that specific role needs