Introducing Claude 4 (Opus 4 & Sonnet 4)
blog
Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Anthropic
Official Anthropic announcement of Claude 4 model family (May 2025); relevant for tracking frontier model capabilities, agentic AI deployment trends, and benchmarking progress in coding and reasoning tasks.
Summary
Anthropic announces Claude Opus 4 and Claude Sonnet 4, its next-generation models, with state-of-the-art coding performance, extended thinking with tool use (beta), and stronger agentic capabilities. Claude Opus 4 leads on SWE-bench (72.5%) and Terminal-bench (43.2%). Both models support parallel tool use, follow instructions more precisely, and show improved memory when developers give them access to local files. Alongside the models, Anthropic makes Claude Code generally available and releases four new API capabilities for building AI agents.
Key Points
- Claude Opus 4 is Anthropic's most capable model, leading on SWE-bench (72.5%) and Terminal-bench (43.2%) and built for long-running agentic tasks that can run continuously for several hours.
- Both Opus 4 and Sonnet 4 are hybrid models with two modes: near-instant responses and extended thinking. In beta, extended thinking can now alternate with tool use such as web search.
- When developers give the models access to local files, they can extract and save key facts to maintain continuity and build tacit knowledge over long-horizon tasks.
- Claude Code is now generally available, with background tasks via GitHub Actions and native VS Code and JetBrains integrations that display edits directly in your files for pair programming.
- Four new API capabilities are available: the code execution tool, MCP connector, Files API, and prompt caching for up to one hour, all aimed at building more powerful AI agents.
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Reasoning and Planning | Capability | 65.0 |
| Anthropic | Organization | 74.0 |
Cached Content Preview
Introducing Claude 4
May 22, 2025
Today, we’re introducing the next generation of Claude models: Claude Opus 4 and Claude Sonnet 4, setting new standards for coding, advanced reasoning, and AI agents.
Claude Opus 4 is the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows. Claude Sonnet 4 is a significant upgrade to Claude Sonnet 3.7, delivering superior coding and reasoning while responding more precisely to your instructions.
Alongside the models, we're also announcing:
- Extended thinking with tool use (beta): Both models can use tools, like web search, during extended thinking, allowing Claude to alternate between reasoning and tool use to improve responses.
- New model capabilities: Both models can use tools in parallel, follow instructions more precisely, and, when given access to local files by developers, demonstrate significantly improved memory capabilities, extracting and saving key facts to maintain continuity and build tacit knowledge over time.
- Claude Code is now generally available: After receiving extensive positive feedback during our research preview, we’re expanding how developers can collaborate with Claude. Claude Code now supports background tasks via GitHub Actions and native integrations with VS Code and JetBrains, displaying edits directly in your files for seamless pair programming.
- New API capabilities: We’re releasing four new capabilities on our API that enable developers to build more powerful AI agents: the code execution tool, MCP connector, Files API, and the ability to cache prompts for up to one hour.
Claude Opus 4 and Sonnet 4 are hybrid models offering two modes: near-instant responses and extended thinking for deeper reasoning. The Pro, Max, Team, and Enterprise Claude plans include both models and extended thinking, with Sonnet 4 also available to free users. Both models are available on our API, Amazon Bedrock, and Google Cloud's Vertex AI. Pricing remains consistent with previous Opus and Sonnet models: Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.
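The per-million-token prices quoted above convert to a per-request cost by simple proportion. The following is a minimal sketch in Python, not an official calculator: the names `PRICES_USD_PER_MTOK` and `estimate_cost`, the dictionary keys, and the token counts are all hypothetical, and the calculation ignores any discounts (for example from prompt caching), which this announcement does not detail.

```python
from decimal import Decimal

# USD per million tokens as (input, output), taken from the announcement text.
# The keys are illustrative labels only, not official API model identifiers.
PRICES_USD_PER_MTOK = {
    "opus-4": (Decimal("15"), Decimal("75")),
    "sonnet-4": (Decimal("3"), Decimal("15")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at list price."""
    input_price, output_price = PRICES_USD_PER_MTOK[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / MILLION


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
    for model in PRICES_USD_PER_MTOK:
        print(model, estimate_cost(model, 200_000, 20_000))
```

For this assumed workload the estimate comes to $4.50 on Opus 4 and $0.90 on Sonnet 4. The fivefold gap follows directly from the price ratio, which is 5:1 for both input and output tokens.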
Claude 4
Claude Opus 4 is our most powerful model yet and the best coding model in the world, leading on SWE-bench (72.5%) and Terminal-bench (43.2%). It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, with the ability to work continuously for several hours—dramatically outperforming all Sonnet models and significantly expanding what AI agents can accomplish.
Claude Opus 4 excels at coding and complex problem-solving, powering frontier agent products. Cursor calls it state-of-the-art for coding and a leap forward in complex codebase understanding. Replit reports improved precision and dramatic advancements for complex changes across multiple files. Block calls it the first model to boost code quality during editing and debugging in its agent, codename goose, while maintaining full pe
... (truncated, 11 KB total)
4ec03078d3169fe5 | Stable ID: sid_iASmQ2UDQ1