Epoch AI projections

web

Epoch AI·epoch.ai/blog/model-counts-compute-thresholds

Credibility Rating

4/5

High (4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Epoch AI

Useful reference for AI governance discussions about compute-based regulation; provides quantitative estimates of how many models fall above specific FLOP thresholds used in major regulatory proposals.

Metadata

Importance: 62/100 | blog post | analysis

Summary

Epoch AI analyzes how many AI models would fall above various compute thresholds (measured in FLOPs), providing empirical projections relevant to governance frameworks that use compute as a regulatory trigger. The analysis helps policymakers and researchers understand the practical scope and selectivity of compute-based oversight mechanisms.

Key Points

• Estimates how many frontier AI models exceed various compute thresholds, informing threshold-based regulatory frameworks like those in the EU AI Act and US Executive Order.
• Shows that higher compute thresholds (e.g., 10^26 FLOPs) capture only a small number of the most capable models, while lower thresholds capture many more.
• Provides empirical grounding for debates about where to set compute governance triggers to balance coverage and administrative burden.
• Demonstrates that compute thresholds are a blunt but tractable proxy for identifying potentially high-risk AI systems.
• Epoch AI's dataset of historical training runs underpins the projections, giving quantitative context to policy discussions.
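
As a minimal sketch of the counting behind these points, the snippet below tallies how many models meet or exceed each compute threshold. The model names and FLOP values are hypothetical placeholders, not entries from Epoch AI's dataset, and `count_above` is an illustrative helper rather than anything Epoch AI publishes.

```python
# Hypothetical training-compute estimates in FLOP (illustrative only,
# not taken from Epoch AI's dataset).
HYPOTHETICAL_MODELS = {
    "model-a": 3e22,
    "model-b": 4e23,
    "model-c": 2e24,
    "model-d": 8e24,
    "model-e": 5e25,
    "model-f": 2e26,
}

# Thresholds to compare, from low to high.
THRESHOLDS = [1e23, 1e25, 1e26]


def count_above(models, threshold):
    """Return how many models have training compute >= threshold."""
    return sum(1 for flop in models.values() if flop >= threshold)


counts = {t: count_above(HYPOTHETICAL_MODELS, t) for t in THRESHOLDS}
for threshold, n in counts.items():
    print(f"{threshold:.0e} FLOP: {n} model(s)")
```

With these placeholder values, raising the threshold shrinks the captured set, which is the selectivity trade-off the key points describe.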

Cited by 3 pages

Page	Type	Quality
AI Capability Threshold Model	Analysis	72.0
Epoch AI	Organization	51.0
Compute Monitoring	Approach	69.0

Cached Content Preview

HTTP 200 | Fetched Apr 7, 2026 | 56 KB

How many AI models will exceed compute thresholds? | Epoch AI

 Executive summary

 The compute used to train AI models has been a key driver of AI progress, informing many predictions of AI’s future capabilities. However, the number of AI models that will surpass different compute levels has received less attention. This is relevant to compute-based AI regulation, as well as AI development and deployment more broadly. We develop a projective model that relates key inputs such as investment and the distribution of compute to the number of notable AI models: models that are state of the art, highly cited, or otherwise historically notable. The projections can be explored in a new interactive tool.

 Figure 1: Median projection for future notable AI model releases with different levels of compute, by year. Note: these projections are likely to be smaller than total model counts as a compute threshold falls further behind the frontier, since lower-compute models are less likely to meet Epoch AI’s notability criteria or be publicly documented.

 Our modeling shows that the number of notable AI models above a given compute threshold rapidly accelerates over time. For example, the first model in our dataset estimated to use over 10^26 FLOP was Grok-3 from xAI, released in February 2025. Extrapolating current trends, there would be around 30 such models by the start of 2027, and over 200 models by the start of 2030. As the compute threshold is increased, the model count drops substantially, but growth remains rapid.
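
 For a rough sense of the growth these figures imply, the sketch below assumes constant exponential growth between the two quoted projections (around 30 models at the start of 2027 and 200 at the start of 2030, treating "over 200" as exactly 200). This is a simplifying assumption for illustration only, not Epoch AI's projective model, and the variable names are hypothetical.

```python
import math

# Figures quoted in the text; "over 200" is treated as exactly 200.
count_2027 = 30
count_2030 = 200
years = 2030 - 2027

# Assumption: constant exponential growth between the two points.
annual_factor = (count_2030 / count_2027) ** (1 / years)
doubling_time_years = math.log(2) / math.log(annual_factor)

print(f"Implied growth: {annual_factor:.2f}x per year")
print(f"Implied doubling time: {doubling_time_years:.2f} years")
```

 Under this assumption, the count of models above 10^26 FLOP grows by a little under 2x per year between 2027 and 2030.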

 These counts focus on notable models involving a new training run that exceed a given compute threshold. These are based on our dataset of publicly announced notable models where we can estimate training compute, a subset of all AI models. For thresholds well below the frontier, such as 10^23 FLOP today, the total number of published models could easily be 4x higher than our projections (see Dataset and inclusion criteria).

 To illustrate the range of plausible model counts through to 2030, we developed two alternative scenarios to the median, denoted as “conservative” and “aggressive”. These scenarios are defined by three inputs: investment in the largest training run, total number of models per year, and number of models near the largest training run. For each of these inputs, we model extremes in how the current trends could change, grounded in evidence from historical variation, present-day spending on AI, and potential bottlenecks to scaling (see Scenarios based on AI investment and model development). The conservative and aggressive scenarios predict about 10 and 80 notable models above 10^26 FLOP by 2027, respectively. This highlights our uncertainty, while affirming that model counts are likely to grow rapidly.

 Figure 2: The number of AI models above 10^26 FLOP, comparing the median extrapolation to two alternative scenarios. The “conservative” and “aggre

... (truncated, 56 KB total)

Resource ID: 080da6a9f43ad376 | Stable ID: ZmVmYmE5Nm