OpenAI: Preparedness Framework Version 2
Web Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: OpenAI
OpenAI's official internal safety governance framework, now in its second version; relevant for understanding how a leading frontier AI lab operationalizes risk thresholds and pre-deployment evaluation processes.
Metadata
Summary
OpenAI's Preparedness Framework v2 outlines the company's structured approach to evaluating and managing catastrophic risks from frontier AI models, including definitions of risk severity levels and thresholds that determine whether a model can be deployed or developed further. It establishes a systematic process for tracking, evaluating, and preparing for frontier model risks across domains such as CBRN threats, cyberattacks, and loss of human control. The framework represents OpenAI's operationalized safety commitments with concrete governance mechanisms.
Key Points
- Defines a four-tier risk severity scale (low, medium, high, critical) across key risk categories including CBRN, cybersecurity, persuasion/influence ops, and model autonomy.
- Sets deployment and development thresholds: models rated 'critical' risk cannot be deployed; models rated 'high' require additional mitigations before release.
- Establishes a Preparedness team responsible for producing 'scorecards' that evaluate frontier models against defined risk benchmarks before and after deployment.
- Includes a safety advisory group and board-level oversight mechanism to ensure accountability beyond standard product teams.
- Represents a living document intended to evolve as capabilities and understanding of risks develop over time.
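The deployment thresholds above amount to a simple gating rule. As a minimal sketch (the names `RiskLevel` and `deployment_decision` are illustrative, not OpenAI's terminology), the logic is:

```python
from enum import Enum

class RiskLevel(Enum):
    """Hypothetical encoding of the four-tier severity scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

def deployment_decision(level: RiskLevel, mitigations_in_place: bool) -> str:
    """Illustrative gating rule: 'critical' blocks deployment outright,
    'high' requires additional mitigations first, lower tiers may proceed."""
    if level is RiskLevel.CRITICAL:
        return "blocked"
    if level is RiskLevel.HIGH:
        return "deploy" if mitigations_in_place else "mitigations required"
    return "deploy"
```

This captures only the decision structure described in the summary; the actual framework ties each threshold to measured capability evaluations and safeguard reviews.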
Cited by 3 pages
| Page | Type | Quality |
|---|---|---|
| Corporate AI Safety Responses | Approach | 68.0 |
| Dangerous Capability Evaluations | Approach | 64.0 |
| Eval Saturation & The Evals Gap | Approach | 65.0 |
1 FactBase fact citing this source
| Entity | Property | Value | As Of |
|---|---|---|---|
| OpenAI | AI Safety Level | High/Critical capability thresholds (Preparedness Framework v2) | Apr 2025 |
Cached Content Preview
Preparedness Framework
Version 2. Last updated: 15th April, 2025
-- 1 of 22 --
OpenAI’s mission is to ensure that AGI (artificial general intelligence) benefits all of humanity. To pursue
that mission, we are committed to safely developing and deploying highly capable AI systems, which
create significant benefits and also bring new risks. We build for safety at every step and share our
learnings so that society can make well-informed choices to manage new risks from frontier AI.
The Preparedness Framework is OpenAI’s approach to tracking and preparing for frontier capabilities
that create new risks of severe harm.1 We currently focus this work on three areas of frontier capability,
which we call Tracked Categories:
• Biological and Chemical capabilities that, in addition to unlocking discoveries and cures, can also
reduce barriers to creating and using biological or chemical weapons.
• Cybersecurity capabilities that, in addition to helping protect vulnerable systems, can also create
new risks of scaled cyberattacks and vulnerability exploitation.
• AI Self-improvement capabilities that, in addition to unlocking helpful capabilities faster, could
also create new challenges for human control of AI systems.
In each area, we develop and maintain a threat model that identifies the risks of severe harm and sets
thresholds we can measure to tell us when the models get capable enough to meaningfully pose these
risks. We won’t deploy these very capable models until we’ve built safeguards to sufficiently minimize
the associated risks of severe harm. This Framework lays out the kinds of safeguards we expect to need,
and how we’ll confirm internally and show externally that the safeguards are sufficient.
In this updated version of the Framework we also introduce a set of Research Categories. These are
areas of capability that could pose risks of severe harm, that do not yet meet our criteria to be Tracked
Categories, and where we are investing now to further develop our threat models and capability elicitation
techniques.
We are constantly refining our practices and advancing the science, to unlock the benefits of these
technologies while addressing their risks. This revision of the Preparedness Framework focuses on the
safeguards we expect will be needed for future models more capable than those we have today.
1 By “severe harm” in this document, we mean the death or grave injury of thousands of people or hundreds of billions of
dollars of economic damage. Our safety stack addresses a broad spectrum of risks, including many with harms below this severity.
In choosing to set a high bar here, we aim to ensure that the most severe risks receive attention commensurate with their magnitude.
Contents
1 Introduction 3
1.1 Why we're updating the Preparedness Framework 3
2 Deciding where to focus 4
2.1 Holistic risk assessment and categorization
... (truncated, 66 KB total)
... (truncated, 66 KB total)ec5d8e7d6a1b2c7c | Stable ID: sid_StaXN818oH