
The hidden thinking keyword system
Claude Code implements a hidden thinking keyword system that lets developers control how large a reasoning budget the model receives. Through simple text commands embedded in prompts, developers can allocate one of three fixed budgets (4,000, 10,000, or 31,999 thinking tokens) to a task. Resource allocation that is normally opaque becomes something the user sets directly.
The three-tier system operates with elegant simplicity: “think” allocates 4,000 tokens for routine debugging and basic refactoring, “megathink” provides 10,000 tokens for architectural decisions and complex problem-solving, and “ultrathink” unleashes 31,999 tokens for the most challenging tasks that require deep, sustained reasoning.
Complete list of thinking trigger keywords
The thinking system responds to multiple trigger phrases, all detected through case-insensitive string matching:
Basic Think Mode (4,000 tokens):
- “think” – The foundational command
Megathink Mode (10,000 tokens):
- “megathink”
- “think hard”
- “think deeply”
- “think a lot”
- “think about it”
- “think more”
Ultrathink Mode (31,999 tokens):
- “ultrathink”
- “think harder”
- “think intensely”
- “think longer”
- “think really hard”
- “think super hard”
- “think very hard”
Technical architecture: The “tengu” system
The thinking system, which appears in the code under the internal name “tengu” (after the clever supernatural beings of Japanese folklore), operates through a preprocessing layer. When you type a prompt containing a thinking keyword, Claude Code’s JavaScript application intercepts it before it reaches the underlying Claude model. The code then selects the matching token allocation and passes it, together with the label tengu_thinking, to an internal helper (l1 in the minified source).
Simon Willison’s reverse engineering surfaced the relevant check in the minified source (excerpt; the remaining keyword tests are elided):
if (B.includes("ultrathink") || B.includes("think harder") || ...) {
  return l1("tengu_thinking", { tokenCount: 31999, messageId: Z, provider: G }), 31999;
}
This preprocessing happens exclusively within Claude Code’s application layer, which explains the most critical fact about thinking keywords: they only work in Claude Code’s command-line interface, not in Claude.ai’s web interface or the API.
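The keyword lists and the excerpt above suggest a check that runs from the highest tier down and stops at the first match. The following is a readable sketch of that logic, not the actual minified source: the names TIERS and thinkingBudget are invented for illustration, the l1 call is omitted, and returning 0 when no keyword is present is an assumption of this sketch.

```javascript
// Illustrative reconstruction; TIERS and thinkingBudget are hypothetical names.
// Tiers are ordered from largest to smallest budget so the first match wins.
const TIERS = [
  {
    budget: 31999,
    phrases: ["ultrathink", "think harder", "think intensely", "think longer",
              "think really hard", "think super hard", "think very hard"],
  },
  {
    budget: 10000,
    phrases: ["megathink", "think hard", "think deeply", "think a lot",
              "think about it", "think more"],
  },
  { budget: 4000, phrases: ["think"] },
];

// Returns the thinking-token budget for a prompt.
// Assumption: 0 means "no keyword found"; the source does not show this case.
function thinkingBudget(prompt) {
  const text = prompt.toLowerCase(); // case-insensitive substring matching
  for (const tier of TIERS) {
    if (tier.phrases.some((phrase) => text.includes(phrase))) {
      return tier.budget;
    }
  }
  return 0;
}

console.log(thinkingBudget("Please THINK HARD about this schema")); // 10000
console.log(thinkingBudget("ultrathink megathink think hard"));     // 31999
console.log(thinkingBudget("Fix the typo in README"));              // 0
```

Because the match is a plain substring test, a word such as "rethink" would also select the 4,000-token tier in this sketch.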
Why thinking keywords are Claude Code exclusive
The exclusivity stems from Claude Code’s unique architecture as a terminal-based application with its own preprocessing layer. Unlike the web interface or API, which send prompts directly to Claude models, Claude Code can intercept and modify requests before they reach the model. This architectural difference enables features impossible in other interfaces:
- Direct manipulation of token budgets
- Dynamic resource allocation based on task complexity
- Transparent thinking processes visible to users
- Adaptive computational scaling
When users attempt to use “ultrathink” in Claude.ai’s web interface, the word is simply passed as part of the prompt text with no special processing. The API requires explicit thinking parameters in the request structure rather than keywords.
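For comparison, a request to the Anthropic Messages API carries the budget as an explicit number. The sketch below only builds a request body and sends nothing. buildThinkingRequest is a hypothetical helper, MODEL_ID is a placeholder rather than a real model name, and the 4,096 tokens of headroom are arbitrary. It assumes the documented thinking parameter shape, { type: "enabled", budget_tokens: N }, with a minimum budget of 1,024 tokens and max_tokens larger than the budget.

```javascript
// Hypothetical helper: builds (but does not send) a Messages API request body.
function buildThinkingRequest(prompt, budgetTokens) {
  // Assumption: the API's documented minimum thinking budget is 1,024 tokens.
  if (!Number.isInteger(budgetTokens) || budgetTokens < 1024) {
    throw new RangeError("budgetTokens must be an integer of at least 1024");
  }
  return {
    model: "MODEL_ID",               // placeholder; use a model that supports extended thinking
    max_tokens: budgetTokens + 4096, // must exceed the thinking budget; headroom is arbitrary
    thinking: { type: "enabled", budget_tokens: budgetTokens },
    messages: [{ role: "user", content: prompt }],
  };
}

const body = buildThinkingRequest("Design a schema for audit logs", 10000);
console.log(JSON.stringify(body.thinking)); // {"type":"enabled","budget_tokens":10000}
```

The keyword never appears in the request; only the numeric budget does.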
Real-world impact of thinking modes
Developer experiences reveal dramatic differences between thinking tiers:
Basic Think (4,000 tokens): Handles routine tasks like fixing syntax errors, writing unit tests, and explaining code snippets. Processing typically takes 5-15 seconds.
Megathink (10,000 tokens): Excels at refactoring complex functions, designing database schemas, and solving algorithmic challenges. Users report about 40% better architectural decisions than in basic mode (a self-reported estimate).
Ultrathink (31,999 tokens): Transforms seemingly impossible tasks into solvable problems. Developers report successfully refactoring 18,000-line components, designing distributed systems from scratch, and solving bugs that stumped entire teams. Processing can take 45-180 seconds but delivers breakthrough solutions.
Common misconceptions and pitfalls
Several misunderstandings about thinking keywords come up repeatedly:
- “Ultrathink works everywhere” myth: Users create elaborate prompts with “ultrathink” in Claude.ai, expecting enhanced performance. These keywords are meaningless outside Claude Code.
- Token counting confusion: Some believe thinking keywords add tokens to their prompt. In reality, they set a budget for the tokens the model may spend on reasoning, which is separate from the prompt’s input tokens.
- Stacking attempts: Users try combining keywords like “ultrathink megathink think hard,” believing it compounds effects. The system uses only the highest allocation detected.
- API parameter mixing: Developers attempt thinking={"type": "ultrathink"} in API calls. The API requires numeric token budgets, not keywords.
Optimal usage strategies
Experienced developers have discovered patterns for maximizing thinking keywords:
Progressive escalation: Start with basic “think” for initial exploration. If the response seems incomplete or shallow, escalate to “megathink.” Reserve “ultrathink” for persistent challenges or critical decisions.
Context preparation: Before invoking ultrathink, have Claude Code read the relevant files, for example by asking it to or by referencing them with @ file mentions. The thinking system performs best with comprehensive context.
Task matching: Use basic think for syntax fixes and simple refactoring. Deploy megathink for architectural decisions and complex debugging. Unleash ultrathink for system design, performance optimization, and seemingly intractable problems.
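The escalation pattern described above can be sketched as a small ladder. Both helpers are hypothetical conveniences for scripting prompts, not part of Claude Code. The only assumption about Claude Code itself is that the keyword must appear somewhere in the prompt text.

```javascript
// Hypothetical helpers illustrating progressive escalation; not part of Claude Code.
const LADDER = ["think", "megathink", "ultrathink"];

// Returns the next keyword up the ladder, or the top keyword if already there.
function escalate(keyword) {
  const index = LADDER.indexOf(keyword);
  if (index === -1) return LADDER[0]; // no keyword yet: start at the bottom
  return LADDER[Math.min(index + 1, LADDER.length - 1)];
}

// Appends the keyword to a prompt so the keyword check can find it.
function withKeyword(prompt, keyword) {
  return `${prompt.trim()} ${keyword}`;
}

let keyword = escalate(null); // "think"
console.log(withKeyword("Why does this test flake?", keyword));
keyword = escalate(keyword);  // "megathink"
console.log(withKeyword("Why does this test flake?", keyword));
```

Starting at the bottom and stepping up only when an answer falls short keeps the larger budgets for the problems that need them.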
The logarithmic improvement curve
Performance improvements appear to follow a roughly logarithmic curve: the 2.5x increase from 4,000 to 10,000 tokens yields significant gains, but the further 3.2x jump from 10,000 to 31,999 provides diminishing returns except for genuinely complex tasks. This explains why megathink often provides the optimal balance between performance and processing time.
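To make the arithmetic concrete, the sketch below works through the two steps under one simple assumption: that benefit grows with the natural logarithm of the budget. That model is an illustration only, not a measured property of Claude, and stepGains is a name invented here.

```javascript
// Assumed model for illustration only: benefit grows with ln(budget).
const budgets = [4000, 10000, 31999];

// For each step up, compute the size ratio, the modeled gain,
// and the modeled gain per token added.
function stepGains(values) {
  const steps = [];
  for (let i = 1; i < values.length; i++) {
    const from = values[i - 1];
    const to = values[i];
    const gain = Math.log(to / from);
    steps.push({ from, to, ratio: to / from, gain, gainPerToken: gain / (to - from) });
  }
  return steps;
}

for (const step of stepGains(budgets)) {
  console.log(
    `${step.from} -> ${step.to}: x${step.ratio.toFixed(2)}, ` +
    `gain ${step.gain.toFixed(3)}, ` +
    `per 1,000 added tokens ${(step.gainPerToken * 1000).toFixed(3)}`
  );
}
```

Under this assumption the second step adds 21,999 tokens against 6,000 for the first, so the modeled gain per added token falls to roughly a third (about 0.05 versus 0.15 per 1,000 tokens), even though the step itself is larger.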
Conclusion: Mastering the thinking hierarchy
Claude Code’s thinking keyword system represents a unique innovation in AI-assisted development, providing unprecedented control over computational resources through natural language. Understanding that these keywords function exclusively in Claude Code—not in web interfaces or APIs—proves crucial for effective usage. By mastering the progression from basic “think” through “megathink” to “ultrathink,” developers can match computational resources to task complexity, achieving optimal results while avoiding the frustration of attempting these commands where they simply don’t work.
The elegance lies in the simplicity: type a keyword, get more thinking power. But the power lies in understanding when and where these keywords actually function, transforming Claude Code from a capable assistant into an architectural powerhouse capable of tackling your most challenging development problems.