OpenAI: Preparedness Framework Version 2
Web Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: OpenAI
OpenAI's official internal safety governance framework, now in its second version; relevant for understanding how a leading frontier AI lab operationalizes risk thresholds and pre-deployment evaluation processes.
Metadata
Summary
OpenAI's Preparedness Framework v2 outlines the company's structured approach to evaluating and managing catastrophic risks from frontier AI models, including definitions of risk severity levels and thresholds that determine whether a model can be deployed or developed further. It establishes a systematic process for tracking, evaluating, and preparing for frontier model risks across domains such as CBRN threats, cyberattacks, and loss of human control. The framework represents OpenAI's operationalized safety commitments with concrete governance mechanisms.
Key Points
- Defines a four-tier risk severity scale (low, medium, high, critical) across key risk categories including CBRN, cybersecurity, persuasion/influence ops, and model autonomy.
- Sets deployment and development thresholds: models rated 'critical' risk cannot be deployed; models rated 'high' require additional mitigations before release.
- Establishes a Preparedness team responsible for producing 'scorecards' that evaluate frontier models against defined risk benchmarks before and after deployment.
- Includes a safety advisory group and board-level oversight mechanism to ensure accountability beyond standard product teams.
- Represents a living document intended to evolve as capabilities and understanding of risks develop over time.
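The deployment thresholds above amount to a simple gating rule. As a minimal sketch (the names `RiskLevel` and `deployment_decision` are illustrative, not OpenAI's terminology), the logic is:

```python
from enum import Enum

class RiskLevel(Enum):
    """Hypothetical encoding of the four-tier severity scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

def deployment_decision(level: RiskLevel, mitigations_in_place: bool) -> str:
    """Illustrative gating rule: 'critical' blocks deployment outright,
    'high' requires additional mitigations first, lower tiers may proceed."""
    if level is RiskLevel.CRITICAL:
        return "blocked"
    if level is RiskLevel.HIGH:
        return "deploy" if mitigations_in_place else "mitigations required"
    return "deploy"
```

This captures only the decision structure described in the summary; the actual framework ties each threshold to measured capability evaluations and safeguard reviews.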
Cited by 3 pages
| Page | Type | Quality |
|---|---|---|
| Corporate AI Safety Responses | Approach | 68.0 |
| Dangerous Capability Evaluations | Approach | 64.0 |
| Eval Saturation & The Evals Gap | Approach | 65.0 |
1 FactBase fact citing this source
| Entity | Property | Value | As Of |
|---|---|---|---|
| OpenAI | AI Safety Level | High/Critical capability thresholds (Preparedness Framework v2) | Apr 2025 |
Cached Content Preview
Preparedness Framework
Version 2. Last updated: 15th April, 2025
-- 1 of 22 --
OpenAI’s mission is to ensure that AGI (artificial general intelligence) benefits all of humanity. To pursue
that mission, we are committed to safely developing and deploying highly capable AI systems, which
create significant benefits and also bring new risks. We build for safety at every step and share our
learnings so that society can make well-informed choices to manage new risks from frontier AI.
The Preparedness Framework is OpenAI’s approach to tracking and preparing for frontier capabilities
that create new risks of severe harm.1 We currently focus this work on three areas of frontier capability,
which we call Tracked Categories:
• Biological and Chemical capabilities that, in addition to unlocking discoveries and cures, can also
reduce barriers to creating and using biological or chemical weapons.
• Cybersecurity capabilities that, in addition to helping protect vulnerable systems, can also create
new risks of scaled cyberattacks and vulnerability exploitation.
• AI Self-improvement capabilities that, in addition to unlocking helpful capabilities faster, could
also create new challenges for human control of AI systems.
In each area, we develop and maintain a threat model that identifies the risks of severe harm and sets
thresholds we can measure to tell us when the models get capable enough to meaningfully pose these
risks. We won’t deploy these very capable models until we’ve built safeguards to sufficiently minimize
the associated risks of severe harm. This Framework lays out the kinds of safeguards we expect to need,
and how we’ll confirm internally and show externally that the safeguards are sufficient.
In this updated version of the Framework we also introduce a set of Research Categories. These are
areas of capability that could pose risks of severe harm, that do not yet meet our criteria to be Tracked
Categories, and where we are investing now to further develop our threat models and capability elicitation
techniques.
We are constantly refining our practices and advancing the science, to unlock the benefits of these
technologies while addressing their risks. This revision of the Preparedness Framework focuses on the
safeguards we expect will be needed for future models more capable than those we have today.
1 By “severe harm” in this document, we mean the death or grave injury of thousands of people or hundreds of billions of
dollars of economic damage. Our safety stack addresses a broad spectrum of risks, including many with harms below this severity.
In choosing to set a high bar here, we aim to ensure that the most severe risks receive attention commensurate with their magnitude.
Contents
1 Introduction 3
1.1 Why we're updating the Preparedness Framework 3
2 Deciding where to focus 4
2.1 Holistic risk assessment and categorization
... (truncated, 66 KB total)
... (truncated, 66 KB total)ec5d8e7d6a1b2c7c | Stable ID: sid_StaXN818oH