Claude 3.7 Sonnet and Claude Code Announcement

web

Anthropic·anthropic.com/news/claude-3-7-sonnet

Credibility Rating

4/5

High(4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Anthropic

Anthropic's announcement of Claude 3.7 Sonnet introduces a hybrid reasoning model with extended thinking capabilities, relevant to AI safety as it demonstrates frontier capability advances and Anthropic's approach to integrating reasoning into general-purpose models.

Metadata

Importance: 55/100press releasenews

Summary

Anthropic announces Claude 3.7 Sonnet, their most capable model to date and the first hybrid reasoning model, which can operate in both standard and extended thinking modes. The model shows strong improvements in coding, front-end development, and real-world task performance. Alongside it, Anthropic introduces Claude Code, an agentic command-line coding tool in limited research preview.

Key Points

•Claude 3.7 Sonnet is a hybrid model combining standard LLM responses with optional extended step-by-step reasoning in a single model.
•API users can set a token budget for thinking (up to 128K tokens), allowing trade-offs between speed, cost, and answer quality.
•Achieves state-of-the-art on SWE-bench Verified and TAU-bench, with strong endorsements from Cursor, Cognition, Vercel, Replit, and Canva.
•Claude Code is a new agentic CLI tool enabling developers to delegate substantial engineering tasks directly from their terminal.
•Priced at $3/million input tokens and $15/million output tokens, including thinking tokens, same as predecessors.

Cited by 1 page

Page	Type	Quality
Reasoning and Planning	Capability	65.0

Cached Content Preview

HTTP 200Fetched Apr 24, 202611 KB

Announcements Claude 3.7 Sonnet and Claude Code

 Feb 24, 2025 Today, we’re announcing Claude 3.7 Sonnet 1 , our most intelligent model to date and the first hybrid reasoning model on the market. Claude 3.7 Sonnet can produce near-instant responses or extended, step-by-step thinking that is made visible to the user . API users also have fine-grained control over how long the model can think for.

 Claude 3.7 Sonnet shows particularly strong improvements in coding and front-end web development. Along with the model, we’re also introducing a command line tool for agentic coding, Claude Code. Claude Code is available as a limited research preview, and enables developers to delegate substantial engineering tasks to Claude directly from their terminal.

 Claude 3.7 Sonnet is now available on all Claude plans—including Free, Pro, Team, and Enterprise—as well as the Claude Developer Platform , Amazon Bedrock , and Google Cloud’s Vertex AI . Extended thinking mode is available on all surfaces except the free Claude tier.

 In both standard and extended thinking modes, Claude 3.7 Sonnet has the same price as its predecessors: $3 per million input tokens and $15 per million output tokens—which includes thinking tokens.

 Claude 3.7 Sonnet: Frontier reasoning made practical

 We’ve developed Claude 3.7 Sonnet with a different philosophy from other reasoning models on the market. Just as humans use a single brain for both quick responses and deep reflection, we believe reasoning should be an integrated capability of frontier models rather than a separate model entirely. This unified approach also creates a more seamless experience for users.

 Claude 3.7 Sonnet embodies this philosophy in several ways. First, Claude 3.7 Sonnet is both an ordinary LLM and a reasoning model in one: you can pick when you want the model to answer normally and when you want it to think longer before answering . In the standard mode, Claude 3.7 Sonnet represents an upgraded version of Claude 3.5 Sonnet. In extended thinking mode , it self-reflects before answering, which improves its performance on math, physics, instruction-following, coding, and many other tasks. We generally find that prompting for the model works similarly in both modes.

 Second, when using Claude 3.7 Sonnet through the API, users can also control the budget for thinking: you can tell Claude to think for no more than N tokens, for any value of N up to its output limit of 128K tokens. This allows you to trade off speed (and cost) for quality of answer.

 Third, in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs.

 Early testing demonstrated Claude’s leadership in coding capabilities across the board: Cursor noted Claude is once again best-in-class for real-world coding tasks, with significant improvements in areas ranging from handling comple

... (truncated, 11 KB total)

Resource ID: a6c92f94a0af49f7 | Stable ID: sid_8DcTwmhB9A