Longterm Wiki

Compute scaling will slow down due to increasing lead times | Epoch AI


Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Epoch AI

This Epoch AI analysis argues that compute scaling will decelerate due to increasing lead times and economic uncertainty, with implications for AI capabilities growth and safety timelines.

Metadata

Importance: 62/100 · blog post · analysis

Summary

Epoch AI argues that the massive compute scaling that has driven AI progress since 2020 will slow due to economic uncertainty and lengthening development lead times. Their key finding is that every 10× increase in compute scale adds roughly one year of lead time, pushing a trillion-dollar cluster from ~2030 to ~2035. This dynamic may slow AI progress broadly, though frontier labs could still scale training runs for 1-2 more years by reallocating existing compute toward training.

Key Points

  • Every additional 10× increase in compute scale is estimated to lengthen lead times by approximately one year, compounding delays at larger scales.
  • A trillion-dollar compute cluster, extrapolated from current trends (~2030), would be delayed to ~2035 when accounting for lead time effects.
  • Investors are unlikely to commit to massive 'YOLO' scale-ups due to uncertain returns; incremental 10× investments with evaluation checkpoints are more probable.
  • Even if overall compute stock scaling slows, frontier labs may still scale training compute at ~5× per year for 1-2 more years by allocating a larger fraction to training.
  • Slower compute scaling could reduce the pace of algorithmic experimentation and model capability improvements, potentially slowing AI progress overall.

Cited by 1 page

Page | Type | Quality
Novel / Unknown Approaches | Capability | 53.0

Cached Content Preview

HTTP 200 · Fetched Apr 9, 2026 · 17 KB
Compute scaling will slow down due to increasing lead times | Epoch AI

Gradient Updates shares more opinionated or informal takes on big questions in AI progress. These posts solely represent the views of the authors, and do not necessarily reflect the views of Epoch AI as a whole.
 The massive compute scaling that has driven AI progress since 2020 is likely to slow down soon, due to increasing economic uncertainty and longer development cycles.

While investors could theoretically scale compute by several orders of magnitude, the required hundreds of billions of dollars, combined with uncertain returns, will push them toward incremental scaling: investing, deploying products to gauge returns, then reevaluating further investment. Additionally, as the required compute grows larger, the time between project initiation and product deployment (i.e. “lead time”) lengthens significantly, creating a feedback loop that naturally slows the pace of compute scaling.

In particular, our current best guess is that every additional 10× increase in compute scale lengthens lead times by around a year. For example, OpenAI currently likely has over $15 billion worth of compute, and this compute stock has been growing by around 2.2× each year.[1] At that pace, current trends would predict a trillion-dollar cluster around 2030, but longer lead times would delay this to around 2035.

Figure: The “extrapolation with lead times” is determined by taking the direct extrapolation and adjusting it such that each additional 10× increase in compute stock increases lead times by a year. These delays accumulate, so the total delay is larger at greater compute scales.
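As a rough check on the arithmetic, here is a minimal Python sketch of the direct extrapolation plus a simple lead-time adjustment. The 2025 starting year and the linear delay rule (one year of delay per additional 10×) are assumptions for illustration; the post's own figure compounds the delays further, which is what pushes the arrival out to ~2035.

```python
import math

STOCK_NOW = 15e9   # ~$15B of compute today, per the post
GROWTH = 2.2       # compute stock growth per year, per the post
TARGET = 1e12      # a trillion-dollar cluster
YEAR_NOW = 2025    # assumed starting year for the extrapolation

ooms = math.log10(TARGET / STOCK_NOW)                           # ~1.8 additional 10x steps
direct_years = math.log(TARGET / STOCK_NOW) / math.log(GROWTH)  # ~5.3 years
direct_arrival = YEAR_NOW + direct_years                        # ~2030, matching the post

# Simplest linear reading of the caption: lead time grows by one year
# per additional 10x of compute stock, so reaching the target scale is
# delayed by roughly `ooms` years. The post's figure compounds these
# delays, pushing the trillion-dollar cluster out further, to ~2035.
delayed_arrival = direct_arrival + ooms

print(f"direct extrapolation:     ~{direct_arrival:.0f}")
print(f"with lead times (linear): ~{delayed_arrival:.0f}")
```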

 
 Importantly, this dynamic applies most directly to the overall stock of compute at a lab, not the size of training runs. Even if lead times slow scaling of compute investment, frontier AI labs may still scale training compute at 5× per year for another 1-2 years by allocating a larger fraction of compute to training.
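A quick sanity check on the 1-2 year figure, as a sketch (the 25% starting training fraction is an assumption, not from the post): if training compute grows 5× per year while the overall stock grows 2.2× per year, the fraction of the stock devoted to training must grow about 2.3× per year, and it cannot exceed 100%.

```python
import math

STOCK_GROWTH = 2.2     # overall compute stock growth per year, per the post
TRAIN_GROWTH = 5.0     # training compute growth per year, per the post
TRAIN_FRACTION = 0.25  # assumed fraction of the stock used for training today

# To keep training compute on a 5x/year trend while the stock grows
# only 2.2x/year, the training fraction must grow by 5/2.2 ~= 2.3x/year.
fraction_growth = TRAIN_GROWTH / STOCK_GROWTH

# Years until the fraction reaches 100% and training can no longer
# outpace the overall stock.
runway = math.log(1.0 / TRAIN_FRACTION) / math.log(fraction_growth)
print(f"runway before the fraction saturates: ~{runway:.1f} years")  # ~1.7
```

Starting fractions between roughly 20% and 35% give runways between about 1.3 and 2 years, consistent with the post's 1-2 year estimate.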

 The implications could be huge. It may become harder to obtain the compute needed to run experiments and find novel algorithmic improvements, and at some point training compute scaling will also have to decelerate. Since compute is strongly correlated with model capabilities, we expect this dynamic to slow AI progress.

 Uncertainties about investment returns prevent a “YOLO scaleup”

If investors believe the Bitter Lesson, one might expect them to throw everything into expanding the available compute. For example, we see this sort of dynamic in economic models like the GATE model, which (dubiously) predicts that it is optimal to invest $25 trillion in AI today.

But in practice, such massive investment is highly unlikely. This is partly because the returns to scaling are deeply uncertain, while the costs are enormous. For example, the second phase of xAI’s Colossus cluster already required nearly $10 billion in hardware acquisition costs. Sca

... (truncated, 17 KB total)
Resource ID: 121f6161e455aa94 | Stable ID: sid_ThZX7E64jR