Claude Opus 4.6 Released: Anthropic's Smartest Model Yet Scores 65.4% on Terminal-Bench
Anthropic releases Claude Opus 4.6 with major upgrades to agentic coding, 1M token context window, and state-of-the-art performance on Terminal-Bench 2.0 and Humanity's Last Exam.

Claude Opus 4.6 Released: Anthropic's Smartest Model Yet Scores 65.4% on Terminal-Bench
Anthropic has released Claude Opus 4.6, its most capable model to date, featuring significant improvements in agentic coding, a 1M token context window, and state-of-the-art performance across multiple benchmarks.
Key Improvements
The new model builds on its predecessor's coding capabilities with several notable upgrades:
- •Improved agentic planning: Opus 4.6 breaks complex tasks into independent subtasks, runs tools and subagents in parallel, and identifies blockers with "real precision," according to Anthropic's release notes.
- •1M token context window: For the first time in an Opus-class model, users can access a 1M token context window in beta.
- •Sustained agentic tasks: The model can operate reliably in larger codebases for longer periods without degradation.
- •Better self-debugging: Opus 4.6 has improved code review and debugging skills to catch its own mistakes.
Benchmark Results
Opus 4.6 achieves top scores across multiple evaluations:
- •Terminal-Bench 2.0: 65.4% - highest score among all models on this agentic coding evaluation
- •Humanity's Last Exam: Leads all frontier models on this complex multidisciplinary reasoning test
- •GDPval-AA: Outperforms the industry's next-best model (OpenAI's GPT-5.2) by approximately 144 Elo points on economically valuable knowledge work tasks
- •BrowseComp: Best performance on locating hard-to-find information online
"Opus 4.6 is the strongest model Anthropic has shipped," the company stated. "It takes complicated requests and actually follows through, breaking them into concrete steps, executing, and producing polished work even when the task is ambitious."
New Developer Features
Anthropic also announced several new capabilities for developers:
- •Claude Code agent teams: Developers can now assemble multiple Claude agents to work on tasks collaboratively
- •Adaptive thinking: The model can pick up on contextual clues about how much extended thinking to apply
- •Effort controls: New parameter lets developers dial effort between high, medium, and low to balance intelligence, speed, and cost
- •Context compaction: API users can summarize context mid-task to perform longer-running operations
New Product Integrations
The release also includes expanded productivity integrations:
- •Claude in Excel: Substantial upgrades to working with spreadsheets
- •Claude in PowerPoint: New research preview for presentation creation
- •Cowork: Claude can now multitask autonomously on your behalf
Pricing and Availability
Claude Opus 4.6 is available today on claude.ai, the Anthropic API, and all major cloud platforms. Pricing remains unchanged at $5 input / $25 output per million tokens.
Sources
Share this article
About AI News Team
Expert researcher and writer at NeuralStackly, dedicated to finding the best AI tools to boost productivity and business growth.
View all posts