Longterm Wiki

Sora 2: OpenAI's Flagship Video and Audio Generation Model

web

Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: OpenAI

Relevant to AI safety discussions around increasingly capable generative models, world simulators, and deepfake/identity risks from voice and appearance cloning features; represents frontier capabilities progress rather than safety research.

Metadata

Importance: 35/100 · Tags: press release, news

Summary

OpenAI releases Sora 2, a significantly improved video and audio generation model featuring enhanced physical accuracy, controllability, synchronized dialogue, and sound effects. The model represents a major step toward world simulation, with better modeling of physical laws (including failure states), and supports injecting real-world elements, such as specific people, into generated scenes.

Key Points

  • Sora 2 improves physical accuracy over prior models, correctly simulating physics failures (e.g., basketball rebounds) rather than 'morphing reality' to fulfill prompts.
  • Described as a potential 'GPT-3.5 moment for video,' demonstrating complex physical dynamics like gymnastics and paddleboard backflips.
  • Features synchronized audio generation including speech, background soundscapes, and sound effects with high realism.
  • Supports injection of real-world people, animals, or objects into generated environments with accurate appearance and voice replication.
  • OpenAI frames advanced video generation as critical infrastructure for training AI systems that deeply understand the physical world.

Cited by 1 page

Page      Type           Quality
OpenAI    Organization   62.0

Cached Content Preview

HTTP 200 · Fetched Apr 9, 2026 · 19 KB
Sora 2 is here | OpenAI

The Wayback Machine - https://web.archive.org/web/20260405211721/https://openai.com/index/sora-2/

 


Table of contents

Deployment of Sora 2

Launching responsibly

Sora 2 availability and what’s next

September 30, 2025
Research · Release · Product

Sora 2 is here

Our latest video generation model is more physically accurate, realistic, and more controllable than prior systems. It also features synchronized dialogue and sound effects. Create with it in the new Sora app.


Today we’re releasing Sora 2, our flagship video and audio generation model.

The original Sora model from February 2024 was in many ways the GPT‑1 moment for video—the first time video generation started to seem like it was working, and simple behaviors like object permanence emerged from scaling up pre-training compute. Since then, the Sora team has been focused on training models with more advanced world simulation capabilities. We believe such systems will be critical for training AI models that deeply understand the physical world. A major milestone for this is mastering pre-training and post-training on large-scale video data, which are in their infancy compared to language.

Prompt: figure skater performs a triple axle with a cat on her head

With Sora 2, we are jumping straight to what we think may be the GPT‑3.5 moment for video. Sora 2 can do things that are exceptionally difficult—and in some instances outright impossible—for prior video generation models: Olympic gymnastics routines, backflips on a paddleboard that accurately model the dynamics of buoyancy and rigidity, and triple axels while a cat holds on for dear life.

Prompt: a guy does a backflip

Prior video models are overoptimistic—they will morph objects and deform reality to successfully execute upon a text prompt. For example, if a basketball player misses a shot, the ball may spontaneously teleport to the hoop. In Sora 2, if a basketball player misses a shot, it will rebound off the backboard. Interestingly, “mistakes” the model makes frequently appear to be mistakes of the internal agent that Sora 2 is implicitly modeling; though still imperfect, it is bette

... (truncated, 19 KB total)
Resource ID: edc1663b7d3b8ac2 | Stable ID: ZDQ1ZTA5NT