Epoch AI
- Claim: High-quality training text data (~10^13 tokens) may be exhausted by the mid-2020s, creating a fundamental bottleneck that could force AI development toward synthetic data generation and multimodal approaches. (S: 4.5, I: 4.0, A: 4.5)
- Quantitative: Training compute for frontier AI models is doubling every 6 months (compared to Moore's Law's 2-year doubling), creating a 10,000x increase from 2012-2022 and driving training costs to $100M+ with projections of billions by 2030. (S: 4.0, I: 4.5, A: 4.0)
- Counterintuitive: Algorithmic efficiency in AI is improving by 2x every 6-12 months, which could undermine compute governance strategies by reducing the effectiveness of hardware-based controls. (S: 4.0, I: 4.0, A: 3.5)
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Research Output | Very High | 36 Data Insights, 37 Gradient Updates, 3,200+ models tracked in 2025 |
| Policy Influence | High | Cited in US Executive Order 14110; 10^26 FLOP threshold based on Epoch data |
| Data Infrastructure | Exceptional | Largest public ML model database; 3,200+ models from 1950-present |
| Funding Stability | Strong | $1.1M Open Philanthropy grant (2025), additional $1.9M general support |
| Team Expertise | Strong | 34 staff; director holds PhD in AI from University of Aberdeen |
| Methodological Rigor | High | Multiple forecasting methods with explicit uncertainty bounds |
| Independence | Medium-High | Nonprofit with EA/Open Philanthropy funding; no industry capture |
| Benchmark Innovation | High | FrontierMath benchmark with 60+ mathematicians including Fields Medalist |
Overview
Epoch AI is a research organization founded in 2022 that provides rigorous, data-driven empirical analysis and forecasting of AI progress. Based in San Francisco with 34 employees, their work serves as critical infrastructure for AI governance and AGI timeline forecasting. Their public database tracks over 3,200 machine learning models from 1950 to present—the largest resource of its kind.
Their research documents three key scaling dynamics: training compute growing at 4.4x per year since 2010, high-quality training data stock of approximately 300 trillion tokens with exhaustion projected between 2026-2032, and algorithmic efficiency doubling every 6-12 months. As of June 2025, they have identified over 30 publicly announced AI models exceeding the 10^25 FLOP training compute threshold.
Unlike organizations developing AI capabilities or safety techniques directly, Epoch provides the empirical foundation that informs strategic decisions across the AI ecosystem. Their databases and forecasts are cited by policymakers designing compute governance frameworks, safety researchers planning research timelines, and AI labs benchmarking their progress against industry trends. In 2024, they were featured on The New York Times’ “2024 Good Tech Awards” list.
Their most influential finding is the exponential growth in training compute for frontier models—with training costs growing 2-3x annually and projected to exceed $1 billion per model by 2027. This analysis has become foundational for understanding AI progress and informing governance approaches focused on compute as a key chokepoint.
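As a rough consistency check on that projection, here is a minimal extrapolation sketch in Python. The ~$100M 2024 starting cost and the 2-3x annual growth factors are round-number assumptions taken from the figures quoted on this page, not Epoch's exact model:

```python
# Rough extrapolation of frontier training costs under exponential growth.
# Assumptions (illustrative, not Epoch's exact inputs): ~$100M frontier cost in
# 2024, with annual cost growth somewhere in the reported 2-3x range.

base_year, base_cost_usd = 2024, 100e6

for growth in (2.0, 2.5, 3.0):
    for year in range(2025, 2031):
        cost = base_cost_usd * growth ** (year - base_year)
        if cost >= 1e9:
            print(f"{growth}x/yr: crosses $1B around {year} (~${cost / 1e9:.1f}B)")
            break
```

At the upper end of the reported growth range the $1B mark falls in 2027, matching the projection above; at 2x per year it slips to 2028.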
2025 Research Impact
| Output Category | 2025 Metrics | Key Products | Policy Relevance |
|---|---|---|---|
| Data Insights | 36 published | Training compute decomposition, API cost trends | Informs compute governance thresholds |
| Gradient Updates | 37 published | Weekly trend analyses | Real-time capability tracking |
| Model Database | 3,200+ entries | Largest public ML model database | Foundation for policy research |
| Benchmarks | 2 major releases | FrontierMath, Epoch Capabilities Index | Standardized capability measurement |
| AI Chip Tracking | New data explorer | 15M+ H100-equivalents tracked globally | Export control effectiveness analysis |
| Cost Analysis | Multiple reports | 10x API price drop (Apr 2023-Mar 2025) | Market concentration analysis |
Key Findings Summary
| Metric | Current Value | Trend | Source |
|---|---|---|---|
| Training compute growth | 4.4x/year (2010-2024) | Accelerating to 5x/year post-2020 | Epoch Trends |
| Frontier model training costs | Tens of millions USD | Projected to exceed $1B by 2027 | Cost Analysis |
| Total AI compute capacity | 15M+ H100-equivalents | Growing rapidly | Epoch AI Chip Sales |
| Models above 10^25 FLOP | 30+ (as of June 2025) | ≈2 announced monthly in 2024 | Model Database |
| High-quality text stock | ≈300 trillion tokens | Exhaustion projected 2026-2032 | Epoch Data Analysis |
| API cost reduction | 10x (Apr 2023-Mar 2025) | Continuing decline | 2025 Impact Report |
Organizational Risk Assessment
| Risk Category | Assessment | Evidence | Timeline | Trajectory |
|---|---|---|---|---|
| Data Bottleneck | High | High-quality text ≈300T tokens, current usage accelerating | 2026-2032 | Worsening |
| Compute Scaling | Medium | 4.4x annual growth potentially unsustainable; energy constraints emerging | 2030s | Stable |
| Governance Lag | High | Policy development slower than tech progress; 1,000+ US state AI bills in 2025 | Ongoing | Improving |
| Forecasting Accuracy | Medium | Wide uncertainty bounds; Epoch Capabilities Index improving precision | Continuous | Improving |
Key Research Areas
Compute Trends Analysis
Epoch’s flagship research tracks computational resources used to train AI models, revealing exponential scaling patterns. Their Machine Learning Trends dashboard provides real-time tracking of these dynamics.
| Metric | Current Trend | Key Finding | Policy Implication |
|---|---|---|---|
| Training Compute | 4.4x/year (2010-2024) | 5x/year for frontier models since 2020 | Compute governance thresholds need updating |
| Training Costs | $50-100M for frontier models | Projected to exceed $1B by 2027 | Market concentration accelerating |
| Hardware Capacity | 15M+ H100-equivalents globally | Companies own 80% of AI supercomputers | Export controls increasingly important |
| Total AI Compute | Doubling every 7 months | 3.3x annual growth since 2022 | Governance frameworks struggling to keep pace |
Critical findings from Epoch’s compute database:
- Exponential growth faster than Moore’s Law: While chip performance doubles every ~2 years, AI training compute grows 4.4x annually—driven primarily by larger training clusters, longer training runs, and improved hardware (a doubling-time conversion is sketched after this list)
- Economic scaling: Training costs growing 2-3x annually, reaching $50-100M+ for frontier models; the largest AI supercomputers (like Anthropic’s 750MW Indiana facility) cost billions to build
- Concentration effects: Only 12 organizations have trained models above 10^25 FLOP; companies now own 80% of all AI supercomputers while government share has declined
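To make the Moore's Law comparison concrete, the sketch below converts annual growth factors into doubling times. The growth factors are the ones reported above; the conversion is just logarithm arithmetic:

```python
import math

def doubling_time_months(annual_growth_factor: float) -> float:
    """Months for a quantity to double, given its year-over-year growth factor."""
    return 12 * math.log(2) / math.log(annual_growth_factor)

trends = {
    "Moore's Law (2x per 2 years)": 2 ** 0.5,   # ~1.41x per year
    "AI training compute (4.4x/yr)": 4.4,
    "Frontier models (5x/yr)": 5.0,
    "Total AI compute (3.3x/yr)": 3.3,
}

for name, growth in trends.items():
    print(f"{name}: doubles every ~{doubling_time_months(growth):.1f} months")
```

A 4.4x annual growth rate corresponds to a doubling time of roughly 5-6 months versus about 24 months for Moore's Law, which is where the "doubling every 6 months" framing comes from; the 3.3x figure for total compute gives the roughly 7-month doubling shown in the table.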
Training Data Constraints
Epoch’s “Will We Run Out of Data?” research revealed potential bottlenecks for continued AI scaling. Their analysis estimates the total stock of human-generated public text at approximately 300 trillion tokens.
| Data Type | Estimated Stock | Current Usage Rate | Exhaustion Timeline |
|---|---|---|---|
| High-quality text | ≈300 trillion tokens | Accelerating | 2026-2032 |
| All web text | ≈10^15 tokens | Increasing | Early 2030s |
| Image data | Larger but finite | Growing rapidly | 2030s+ |
| Video data | Massive but hard to use | Early stages | Unknown |
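A minimal extrapolation sketch for the high-quality text row above. The ~300T-token stock is Epoch's estimate; the ~15T-token 2024 training-set size and the 2.5x annual growth in tokens used per frontier run are illustrative assumptions, so treat the result as a rough consistency check rather than Epoch's method:

```python
import math

# Epoch's estimated stock of high-quality public text (tokens).
stock_tokens = 300e12

# Illustrative assumptions (not Epoch's figures): the largest 2024 training runs
# use ~15T tokens, and tokens used per frontier run grow ~2.5x per year.
tokens_used_2024 = 15e12
annual_growth = 2.5

years_to_exhaustion = math.log(stock_tokens / tokens_used_2024) / math.log(annual_growth)
print(f"Single-run demand reaches the full stock around {2024 + years_to_exhaustion:.0f}")
```

Under these assumptions a single frontier run would consume the entire stock around 2027, inside the 2026-2032 window; slower growth in data usage, heavier filtering, or data reuse pushes the date later, which is part of why Epoch reports a range rather than a point estimate.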
Key implications:
- Pressure for efficiency: Data constraints may force more efficient training methods; Epoch research shows algorithmic efficiency doubling every 6-12 months
- Synthetic data boom: Investment in AI-generated training data accelerating to extend runway
- Multimodal shift: Language models increasingly incorporating image/video data
- Overtraining risk: If models are “intensely overtrained,” high-quality text could be exhausted even earlier than 2026
Timeline Forecasting Methodology
Epoch employs multiple complementary approaches to estimate transformative AI timelines:
| Method | Current Estimate Range | Key Variables | Confidence Level |
|---|---|---|---|
| Trend Extrapolation | 2030s-2040s | Compute, data, algorithms | Medium |
| Biological Anchors | 2040s-2050s | Brain computation estimates | Low |
| Benchmark Analysis | 2030s-2050s | Task performance rates | Medium |
| Economic Modeling | 2035-2060s | Investment trends, ROI | Low |
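As an illustration of how the biological-anchors row is typically operationalized, the sketch below computes a "lifetime anchor". The ~10^15 FLOP/s brain-equivalent compute figure and the 30-year lifetime are common assumptions in the biological-anchors literature, not Epoch estimates:

```python
# Illustrative "lifetime anchor" in the style of biological-anchors forecasting.
# Both inputs are assumptions from the bio-anchors literature, not Epoch figures.

brain_flop_per_s = 1e15                    # assumed brain-equivalent compute
lifetime_seconds = 30 * 365 * 24 * 3600    # ~30 years of experience

lifetime_anchor_flop = brain_flop_per_s * lifetime_seconds
print(f"Lifetime anchor: ~{lifetime_anchor_flop:.1e} FLOP")   # ~1e24 FLOP
```

Anchors like this (~10^24 FLOP) are then compared against the compute trends above, with large adjustment factors for how much less sample-efficient artificial training is expected to be than biology; those adjustments and the brain-compute figure itself each span orders of magnitude, which is why the table marks this method's confidence as low.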
Impact on AI Safety and Governance
Policy Integration
Epoch’s data directly informs major governance initiatives. Their research has been particularly influential in establishing compute thresholds for regulatory frameworks.
| Policy Area | Epoch Contribution | Real-World Impact |
|---|---|---|
| US AI Executive Order 14110 | 10^26 FLOPs threshold analysis | Training run reporting requirements for frontier models |
| Export controls | H100/A100 performance data, chip sales tracking | Chip restriction implementation and effectiveness monitoring |
| UK AI Safety Institute | Capability benchmarking, FrontierMath | Model evaluation frameworks |
| EU AI Act | Compute-based GPAI thresholds | Classification of general-purpose AI systems |
| Compute governance research | Database infrastructure, threshold analysis | Academic and policy research foundation |
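To give a sense of scale for the 10^26 FLOP reporting threshold above, here is a back-of-the-envelope sketch. The per-GPU throughput (~10^15 FLOP/s for H100-class hardware) and the 40% sustained utilization are rough assumptions, not values taken from the cited documents:

```python
# Back-of-the-envelope: hardware implied by a 1e26 FLOP training run.
# Assumptions (rough, not from the cited policy documents): ~1e15 FLOP/s peak
# per H100-class GPU at the relevant precision, ~40% sustained utilization.

threshold_flop = 1e26
flop_per_gpu_s = 1e15 * 0.40      # sustained throughput per GPU
seconds_per_day = 86_400

gpu_days = threshold_flop / (flop_per_gpu_s * seconds_per_day)
print(f"~{gpu_days:,.0f} GPU-days")                      # ~2.9 million GPU-days
print(f"~{gpu_days / 90:,.0f} GPUs for a 90-day run")    # ~32,000 GPUs
```

On these assumptions the threshold corresponds to roughly 30,000 H100-class GPUs running for about three months, which is why compute-based reporting thresholds are generally understood to capture only the largest training clusters.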
FrontierMath Benchmark
Epoch’s FrontierMath benchmark represents a significant contribution to AI evaluation infrastructure:
| Aspect | Details | Significance |
|---|---|---|
| Problem Count | 350 original problems (300 in Tiers 1-3, 50 in Tier 4) | Covers major branches of modern mathematics |
| Expert Collaboration | 60+ mathematicians, including 14 IMO gold medalists, 1 Fields Medalist | Highest-quality benchmark construction |
| AI Performance | Less than 2% of problems solved by leading models | Reveals substantial gap between AI and human mathematical capability |
| Tier 4 Commission | 50 research-level problems commissioned by OpenAI | Testing frontier reasoning capabilities |
| Version Updates | Token budget increased 10x in November 2025 | Adapting to improved model inference |
Research Community Influence
| Metric | Evidence | Source |
|---|---|---|
| Academic citations | 1,000+ citations across safety research | Google Scholar |
| Policy references | 50+ government documents cite Epoch | Government databases |
| Database usage | 10,000+ downloads of compute database | Epoch analytics |
| Media coverage | NYT “2024 Good Tech Awards” recognition | New York Times |
| EA Forum engagement | Active community discussion and feedback | EA Forum posts |
Current State and Trajectory
2024 Developments
Database expansion:
- Added 200+ new model entries to Parameter Database
- Enhanced tracking of Chinese and European models
- Improved cost estimation methodologies
- Real-time updates for new releases
Research breakthroughs:
- Refined algorithmic efficiency measurement showing 6-12 month doubling times
- Updated data exhaustion projections with synthetic data considerations
- New economic modeling of AI investment trends
- Bioweapons AI uplift analysis
2025-2026 Projections
| Area | Expected Development | Impact | Source |
|---|---|---|---|
| Model scaling | 10+ models above 10^26 FLOP by 2026 | Over 200 projected by 2030 | Epoch projections |
| Data bottleneck | High-quality text exhaustion begins 2026-2032 | Synthetic data scaling accelerates | Epoch data analysis |
| Compute governance | Standardized international monitoring needed | Enhanced export controls | Epoch policy research |
| Benchmark development | 2 new benchmarks in development | Improved capability measurement | 2025 Impact Report |
| Capability acceleration | 15 points/year on ECI (up from 8 points pre-April 2024) | Faster than historical trend | Epoch Capabilities Index |
| Open-source threshold | Frontier open models may exceed 10^26 FLOP before 2026 | Challenges compute governance approach | Epoch data insights |
Key Uncertainties and Debates
Forecasting Limitations
| Uncertainty | Impact on Estimates | Mitigation Strategy |
|---|---|---|
| Algorithmic breakthroughs | Could accelerate timelines by years | Multiple forecasting methods |
| Data efficiency improvements | May extend scaling runway | Conservative assumptions |
| Geopolitical disruption | Could fragment or accelerate development | Scenario planning |
| Hardware bottlenecks | May slow progress unexpectedly | Supply chain analysis |
Methodological Debates
Trend extrapolation reliability:
- Optimists: Historical trends provide best available evidence for forecasting
- Pessimists: Sharp left turns and discontinuities make extrapolation unreliable
- Epoch position: Multiple methods with explicit uncertainty bounds
Information hazards:
- Security concern: Publishing compute data aids adversaries in capability assessment
- Racing dynamics: Timeline estimates may encourage competitive behavior
- Transparency advocates: Public data essential for democratic governance
Value of Empirical AI Forecasting (4 perspectives)
- Epoch's data provides a crucial foundation for rational planning. Timeline estimates inform urgency decisions. Compute tracking enables governance. Superior to pure speculation.
- Valuable for trend identification, but shouldn't drive strategy alone. High uncertainty requires robust planning across scenarios rather than point estimates.
- AI development is too discontinuous to forecast meaningfully. Unknown unknowns dominate. Resources are better spent on robustness than prediction.
- Timeline publication creates racing dynamics. Compute data aids adversaries. False precision is worse than acknowledged uncertainty. Focus on safety regardless of timelines.
Leadership and Organization
Key Personnel
Organizational Structure
| Function | Team Size | Key Responsibilities | 2025 Focus |
|---|---|---|---|
| Research | 15-18 people | Forecasting, analysis, publications | FrontierMath, Epoch Capabilities Index |
| Data & Engineering | 8-10 people | Database infrastructure, automation | AI Chip Sales explorer, model tracking |
| Operations | 4-6 people | Funding, administration, communications | Grant management, public engagement |
| Advisory | External | Policy guidance, technical review | Academic partnerships |
| Total | 34 employees | Headquartered in San Francisco | Growing capacity |
Funding Profile
| Funder | Amount | Period | Purpose |
|---|---|---|---|
| Open Philanthropy (2025) | $1,132,488 | 2 years | General support |
| Open Philanthropy (additional) | $1,922,565 | Multi-year | General support |
| Open Philanthropy (FrontierMath) | $10,000 | 2025 | Benchmark improvements |
| Open Philanthropy (2022) | $1,960,000 | Initial | Organization founding |
| Total raised | ≈$13M+ | 2022-2025 | Research & operations |
Additional funding sources:
- Future of Humanity Institute (historical support)
- Government contracts for specific projects
- Research grants from academic institutions
Comparative Analysis
vs. Other Forecasting Organizations
| Organization | Focus | Methodology | Update Frequency | Policy Impact | Database Size |
|---|---|---|---|---|---|
| Epoch AI | AI-specific empirical data | Multiple quantitative methods | Continuous | High | 3,200+ models |
| Metaculus | Crowdsourced forecasting | Prediction aggregation | Real-time | Medium | N/A |
| AI Impacts | Historical AI analysis | Case studies, trend analysis | Irregular | Medium | Limited |
| FHI | Existential risk research | Academic research | Project-based | High | N/A |
| METR | Model evaluations | Technical testing | Per-model | Growing | N/A |
| Our World in Data | Data visualization | Data aggregation (uses Epoch data) | Regular | High | Derivative |
Relationship to Safety Organizations
| Organization Type | Relationship to Epoch | Information Flow |
|---|---|---|
| Safety research orgs | Data consumers | Epoch → Safety orgs |
| AI labs | Data subjects | Labs → Epoch (reluctantly) |
| Government bodies | Policy clients | Epoch ↔ Government |
| Think tanks | Research partners | Collaborative |
Future Directions and Challenges
Research Roadmap (2025-2027)
2026 Plans (from Impact Report):
- Development of 2 new benchmarks beyond FrontierMath
- Continued expansion of the Epoch Capabilities Index
- Deeper analysis of decentralized training feasibility (10 GW training runs across thousands of kilometers)
- Enhanced AI Chip Sales data explorer across Nvidia, Google, Amazon, AMD, and Huawei
Expanding scope:
- Multimodal training data analysis beyond text
- Energy consumption and environmental impact tracking (largest data centers now approaching gigawatt scale)
- International AI development monitoring (enhanced coverage of Chinese and European models)
- Risk assessment frameworks for different development pathways
Methodological improvements:
- Better algorithmic progress measurement via Epoch Capabilities Index
- Synthetic data quality and scaling analysis
- Economic impact modeling of AI deployment
- Scenario analysis for different development paths
Scaling Challenges
| Challenge | Current Limitation | Planned Solution | Priority |
|---|---|---|---|
| Data collection | Manual curation, limited sources | Automated scraping, industry partnerships | High |
| International coverage | US/UK bias in data | Partnerships with Chinese and European researchers | High |
| Real-time tracking | Lag in proprietary model information | Industry reporting standards advocacy | Medium |
| Resource constraints | 34 person team | Gradual expansion with $1.1M/year budget | Medium |
| Compute governance gaps | Threshold accuracy uncertain | Better compute-capability correlation research | High |
| Open-source proliferation | Frontier open models approaching 10^26 FLOP | Policy recommendations for dual governance | High |
Key Questions (6)
- How accurate are extrapolation-based AI timeline forecasts given potential discontinuities?
- Will synthetic data generation solve the training data bottleneck or create new limitations?
- How should compute governance adapt as algorithmic efficiency reduces compute as a chokepoint?
- What level of transparency in AI development is optimal for governance without security risks?
- How can empirical forecasting organizations maintain independence while engaging with policymakers?
- What leading indicators best predict dangerous capability emergence beyond compute scaling?
Sources & Resources
Primary Resources
| Resource Type | Description | Link |
|---|---|---|
| AI Models Database | 3,200+ models tracked with training compute, parameters, cost | epoch.ai/data/ai-models |
| Machine Learning Trends | Real-time visualization of AI progress metrics | epoch.ai/trends |
| FrontierMath Benchmark | 350 expert-crafted math problems | epoch.ai/frontiermath |
| Epoch Capabilities Index | Unified capability measurement across benchmarks | epoch.ai/benchmarks |
| Research Blog | 36+ Data Insights, 37+ Gradient Updates in 2025 | epoch.ai/blog |
| AI Chip Sales | Global compute capacity tracking (15M+ H100-equivalents) | epoch.ai/data |
Key Publications
| Title | Year | Impact | Citation |
|---|---|---|---|
| “Compute Trends Across Three Eras of Machine Learning” | 2022 | Foundational for compute governance | Sevilla et al. (2022) |
| “Will We Run Out of Data?” | 2022 | Sparked synthetic data research boom | Villalobos et al. (2022) |
| “Algorithmic Progress in Computer Vision” | 2023 | Quantified efficiency improvements | Erdil & Besiroglu |
| “FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning” | 2024 | arXiv paper on benchmark methodology | Epoch AI et al. |
| Top 10 Data Insights of 2025 | 2025 | Annual synthesis of key findings | Epoch AI |
| 2025 Impact Report | 2025 | Comprehensive organizational review | Epoch AI |
External Coverage
| Source Type | Description | Example Links |
|---|---|---|
| Policy Documents | Government citations of Epoch work | US NAIRR, UK AI White Paper |
| Academic Citations | Research building on Epoch data | Google Scholar search |
| Data Visualization | Our World in Data uses Epoch datasets | Training compute charts |
| Media Coverage | NYT “2024 Good Tech Awards”, InfoQ FrontierMath coverage | MIT Technology Review |
| Industry Analysis | Business intelligence using Epoch metrics | CB Insights, McKinsey AI reports |
| EA Community | EA Forum discussions, 80,000 Hours listings | Active engagement |