Longterm Wiki

Anthropic pioneered the Responsible Scaling Policy


Credibility Rating: 4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Anthropic

Anthropic's RSP is one of the first formal industry commitments to conditional scaling, making it a key reference for AI governance discussions and a template other labs have since adapted.

Metadata

Importance: 78/100 · organizational report · primary source

Summary

This page documents Anthropic's Responsible Scaling Policy (RSP), a framework that ties AI development and deployment decisions to demonstrated capability thresholds and corresponding safety measures. It outlines commitments to pause or restrict scaling if AI systems reach certain dangerous capability levels without adequate safeguards, and tracks updates to the policy over time.

Key Points

  • RSP establishes 'AI Safety Levels' (ASL) that define capability thresholds requiring progressively stricter safety and security measures before deployment.
  • Anthropic commits to halting deployment or training if a model reaches a new ASL without corresponding safety standards being met.
  • The policy addresses risks from CBRN (chemical, biological, radiological, nuclear) weapons uplift and autonomous AI capabilities as key danger categories.
  • RSP updates reflect iterative refinement of commitments based on new research, red-teaming results, and evolving understanding of frontier risks.
  • Serves as a model for other frontier AI labs considering similar safety-gated scaling commitments and industry-wide norms.

Cited by 4 pages

Page                          | Type     | Quality
AI Evaluation                 | Approach | 72.0
Pause Advocacy                | Approach | 91.0
Responsible Scaling Policies  | Approach | 62.0
Structured Access / API-Only  | Approach | 91.0

Cached Content Preview

HTTP 200 · Fetched Apr 9, 2026 · 21 KB
Anthropic's Responsible Scaling Policy

 Anticipating and securing against emerging threats that accompany increasingly powerful models

 Last updated Apr 2, 2026


 Read Anthropic's Responsible Scaling Policy

 As frontier AI models advance, we believe they will bring about transformative benefits for our society and economy. AI could accelerate scientific discoveries, revolutionize healthcare, enhance our education system, and create entirely new domains for human creativity and innovation. Frontier AI models also, however, present new challenges and risks that warrant careful study and effective safeguards.

 In September 2023, we released the first version of our Responsible Scaling Policy (RSP). We believe that risk governance in this rapidly evolving domain should be proportional, iterative, and exportable. In that spirit, we have continued to refine the RSP over time and will use this page to inform the public of any developments.

 Current and Prior Versions

 Read the current version of the RSP

 See the PDF: RSP Noncompliance Reporting and Anti-Retaliation Policy

 Version 3.1 (effective April 2, 2026)
 Version 3.0 (effective February 24, 2026)
 Version 2.2 and redline (effective May 14, 2025)
 Version 2.1 (effective March 31, 2025)
 Version 2.0 (effective October 15, 2024)
 Version 1.0 (effective September 19, 2023)
 April 2, 2026 

 We are updating our Frontier Safety Roadmap, because we have achieved two of the goals we had set for our AI safety work. We have now launched the planned moonshot R&D projects, and have replaced the goal of launching them with more detailed goals for those ongoing projects. We have also completed the goal of creating a “comprehensive internal report to identify how our Safeguards could be improved by updating our data retention policies”.

 We have also made two minor updates to the text of the Responsible Scaling Policy, bringing it to version 3.1. After we released the v3 update, readers noted two areas where our language could be clearer. We have therefore made the following updates, neither of which significantly changes the substance of the policy:

 We now include a clearer definition of the AI R&D capability threshold. In v3, our language around AI doubling the rate of progress (“compress two years of 2018–2024 AI progress into a single year”) could have been read as either AI “doubling the rate of progress in aggregate AI capabilities” or AI “doubling the productivity of researchers”. In v3.1, we make clear that we mean the former and not the latter.
 We now clarify that, even when not required by the RSP, we remain free to take measures such as pausing the development of our AI systems whenever we deem it appropriate. This was true under RSP v3, but it is stated more clearly in the v3.1 update.
 There are also a number of smaller edits for style and clarity elsewhere in the policy.

 As we noted in the post announcing RSP v3, the policy is 

... (truncated, 21 KB total)
Resource ID: c6766d463560b923 | Stable ID: sid_2mcG4n9UV8