Claude Code Subagents Save 9,500 Tokens Per Research Pass - If You Stop Treating Isolation as Free

A subagent that burns 10,000 tokens of internal work but returns a 500-token summary saves you 9,500 tokens of context window — but most developers treat subagent isolation as free and blow the savings on verbose output. The 8th article in the Navigating Claude Code series spells out exactly where the economics break and how to fix them.

Built-in Subagents You Already Use

Claude Code ships with three subagents that activate automatically. Explore is a fast, read-only Haiku agent for file discovery and grep. Plan runs during claude --plan to prevent research from consuming your context before you see the plan. General-purpose inherits the main model and full tool access for multi-step tasks. You've seen them as [subagent: explore] labels in complex sessions.

None of these require configuration. They fire when Claude judges the task warrants delegation. The trick is recognizing when you should write a custom one instead of letting the defaults handle everything.

Writing Custom Subagents That Actually Save Context

Custom subagents live as Markdown files with YAML frontmatter. Location determines scope: .claude/agents/ for a project, ~/.claude/agents/ for all projects, or managed settings for org-wide deployment. The description field is the routing cue — write it as an invocation trigger (“Use after writing code”), not a title (“Code reviewer”). The model field is the highest-leverage setting: route read-only research to Haiku, analysis to Sonnet, and keep Opus only for subagents that need to reason through something complex.

Restrict tools aggressively. A read-only subagent should only have Read, Grep, Glob — no write access, no accidental overwrites. The permissionMode lets you bypass permissions for trusted deploy agents, but default is safer. Memory persistence via memory: user lets a subagent accumulate knowledge like “this project uses a custom error wrapper” across sessions.

Two Pitfalls That Kill Your Savings

First: verbose output. A subagent that returns a wall of text cancels the context savings. The fix is explicit in the system prompt: “Return a maximum of 10 bullet points. Each point needs a severity level, file reference, and one sentence. No code snippets unless unavoidable.” A 300-token report plus a clarifying follow-up beats a 3,000-token preemptive dump every time.

Second: description-driven delegation is unpredictable. If descriptions overlap, Claude might route in-line instead of to your subagent. The workaround: invoke by name — “Use the code-reviewer agent to check the auth module.” That bypasses the routing logic entirely and guarantees the subagent runs.

Run /agents to see every configured subagent grouped by source and spot silently overridden definitions. That's the first debug step when a custom agent stops firing.

The next article in the series covers agent teams — where multiple Claude instances communicate directly mid-task, removing the orchestrator bottleneck that subagents still suffer from. Until that lands, tight output formats and explicit routing are what separate token-savvy developers from the ones who wonder why their context window fills up anyway.

Source: Navigating Claude Code: Subagents Done Right
Domain: hackernoon.com

Claude Code Subagents Save 9,500 Tokens Per Research Pass - If You Stop Treating Isolation as Free

Built-in Subagents You Already Use

Writing Custom Subagents That Actually Save Context

Two Pitfalls That Kill Your Savings

More in Developer Tools