Longterm Wiki

Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Anthropic

Anthropic's RSP is a foundational industry document for responsible development commitments; it is frequently cited in AI governance discussions as a model for 'if-then' safety commitments from frontier labs.

Metadata

Importance: 85/100 | policy brief | primary source

Summary

Anthropic's Responsible Scaling Policy (RSP) is a formal commitment outlining how the company will evaluate AI systems for dangerous capabilities and adjust deployment and development practices accordingly. It introduces 'AI Safety Levels' (ASL), analogous to biosafety levels, establishing capability thresholds that trigger specific safety and security requirements before further training or deployment. The policy aims to prevent catastrophic misuse while allowing continued AI development.

Key Points

  • Defines AI Safety Levels (ASL-1 through ASL-4+) as capability thresholds that trigger increasingly stringent safety and security requirements
  • Commits Anthropic to halting or slowing deployment/development if safety measures cannot keep pace with identified capability levels
  • Requires regular evaluations ('evals') for dangerous capabilities such as CBRN weapons assistance and autonomous replication
  • Establishes mandatory security standards (e.g., model weights protection) and deployment safeguards tied to each ASL threshold
  • Represents one of the first public self-governance frameworks from a frontier AI lab that formally links capability growth to safety commitments

Cited by 9 pages

Cached Content Preview

HTTP 200 | Fetched Apr 9, 2026 | 21 KB
Anthropic's Responsible Scaling Policy

 Anticipating and securing against emerging threats that accompany increasingly powerful models

 Last updated Apr 2, 2026


 Read Anthropic's Responsible Scaling Policy

 As frontier AI models advance, we believe they will bring about transformative benefits for our society and economy. AI could accelerate scientific discoveries, revolutionize healthcare, enhance our education system, and create entirely new domains for human creativity and innovation. Frontier AI models also, however, present new challenges and risks that warrant careful study and effective safeguards.

 In September 2023, we released the first version of our Responsible Scaling Policy (RSP). We believe that risk governance in this rapidly evolving domain should be proportional, iterative, and exportable. In that spirit, we have continued to refine the RSP over time and will use this page to inform the public of any developments.

 Current and Prior Versions

 Read the current version of the RSP

 RSP Noncompliance Reporting and Anti-Retaliation Policy (PDF)

 Version 3.1 (effective April 2, 2026)
 Version 3.0 (effective February 24, 2026)
 Version 2.2 and redline (effective May 14, 2025)
 Version 2.1 (effective March 31, 2025)
 Version 2.0 (effective October 15, 2024)
 Version 1.0 (effective September 19, 2023)
 April 2, 2026 

 We are updating our Frontier Safety Roadmap, because we have achieved two of the goals we had set for our AI safety work. We have now launched the planned moonshot R&D projects, and we have replaced the goal of launching them with more detailed goals for ongoing projects. We have also completed the goal of creating a “comprehensive internal report to identify how our Safeguards could be improved by updating our data retention policies”.

 We have also made two minor updates to the text of the Responsible Scaling Policy, updating it to version 3.1. After we released the v3 update, readers noted two areas where our language could be clarified. We have therefore made the following updates, neither of which significantly changes the substance of the policy:

 We now include a clearer definition of the AI R&D capability threshold. For example, in v3, our language around AI doubling the rate of progress (“compress two years of 2018 – 2024 AI progress into a single year”) could have been read as AI “doubling the rate of progress in aggregate AI capabilities”, or “doubling the productivity of researchers”. In v3.1, we are clear that we mean the former and not the latter.
 We now clarify that, even if not required by the RSP, we remain free to take measures such as pausing the development of our AI systems in any circumstances in which we deem them appropriate. This was true of RSP v3, but it is stated more clearly in the v3.1 update.
 There are also a number of smaller edits for style and clarity elsewhere in the policy.

 As we noted in the post announcing RSP v3, the policy is 

... (truncated, 21 KB total)
Resource ID: afe1e125f3ba3f14 | Stable ID: sid_HGWmTtp8PI