
Epoch AI


| Dimension | Assessment | Evidence |
|---|---|---|
| Research Impact | Very High | Cited in US AI Executive Order 14110, EU AI Act 10^25 FLOP threshold, Congressional testimony |
| Data Quality | Exceptional | 3,200+ ML models tracked from 1950-present; most comprehensive public dataset |
| Methodology Rigor | High | Peer-reviewed publications (arXiv:2202.05924); transparent compute estimation methodology |
| Policy Influence | Strong | UK DSIT collaboration; JRC European Commission consultations; House of Lords evidence submission |
| Industry Usage | Widespread | OpenAI commissioned FrontierMath; Google DeepMind collaborated on ECI methodology |
| Funding | Stable | ≈$7M through 2025 via Coefficient Giving grants |
| Team Size | ≈34 employees | Founded by 7 researchers; now includes ML, economics, statistics, policy backgrounds |
| Key Metrics | Quantified | 15M+ H100-equivalents tracked; 30+ models above 10^25 FLOP; 7-month doubling time for AI compute |

| Attribute | Details |
|---|---|
| Full Name | Epoch AI |
| Founded | April 2022 |
| Location | San Francisco, CA (headquarters); remote-first operations |
| Status | Independent 501(c)(3) nonprofit (since early 2025; previously fiscally sponsored by Rethink Priorities) |
| Website | epoch.ai |
| Director | Jaime Sevilla (Mathematics and Computer Science background) |
| Key Outputs | ML Trends Database, Epoch Capabilities Index, FrontierMath Benchmark, GATE Economic Model, AI Chip Sales Tracker |
| Primary Funders | Coefficient Giving ($6.3M+ in grants), Carl Shulman ($100K), individual donors |
| GitHub | epoch-research |

Epoch AI is a research institute dedicated to tracking and forecasting AI development through rigorous empirical analysis. Founded in April 2022 by Jaime Sevilla and six co-founders, Epoch has become the authoritative source for data on AI training compute, model parameters, hardware capabilities, and development timelines. Their research directly informs policy discussions, corporate planning, and academic research on AI trajectories. The New York Times praised their work for bringing “much-needed rigor and empiricism to an industry that often runs on hype and vibes,” and featured them in their 2024 Good Tech Awards.

The organization’s core contribution is maintaining comprehensive databases that enable quantitative analysis of AI progress. Their public database tracks over 3,200 machine learning models from 1950 to present, documenting the training compute, parameters, and capabilities of each system. By cataloging these metrics, Epoch provides the empirical foundation for discussions about AI timelines, resource constraints, and capability trajectories. Their work bridges the gap between speculative AI forecasting and evidence-based analysis.

Epoch’s research has been directly cited in major policy documents including the EU AI Act (which adopted their 10^25 FLOP compute threshold) and US Executive Order 14110. Their data informs Congressional hearings, and leading AI labs use their metrics for planning. As director Jaime Sevilla stated, “We want to do something similar for artificial intelligence to what William Nordhaus, the Nobel laureate, did for climate change. He set the basis for rigorous study and thoughtful action guided by evidence.” The organization represents a critical piece of epistemic infrastructure for understanding where AI development is headed and what constraints may shape its trajectory.

Epoch AI emerged from a collaborative research effort that began when Jaime Sevilla, a Spanish researcher, put his Ph.D. on pause and issued a call for volunteers to systematically document the critical inputs of every significant AI model ever created. The initial team that responded went on to become Epoch’s founding members: Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Pablo Villalobos, Eduardo Infante-Roldan, Marius Hobbhahn, and Anson Ho. Collectively, they brought backgrounds in Machine Learning, Statistics, Economics, Forecasting, Physics, Computer Engineering, and Software Engineering.

During Epoch’s first retreat in April 2022, the members decided to formalize as an organization and chose the name “Epoch” through a Twitter poll. Their paper “Compute Trends Across Three Eras of Machine Learning,” published in February 2022, had already drawn an overwhelmingly positive reaction, going viral in AI research communities. The paper documented the 10-billion-fold increase in training compute since 2010 and identified three distinct eras of ML development with different scaling dynamics.

From its founding, Epoch was fiscally sponsored and operationally supported by Rethink Priorities, whose Special Projects team provided critical infrastructure for the growing organization. At founding, Epoch had a staff of 13 people (9 FTEs). Coefficient Giving (then Open Philanthropy) provided early support with a $1.96M grant for general support, followed by additional grants totaling over $6M through 2025.

Over the following years, Epoch expanded its database from the initial 123 models documented in their founding paper to over 3,200 models. They launched new data products including the AI Chip Sales tracker, Parameter Counts database, and AI Supercomputer Tracker. The team grew to approximately 34 employees, with headquarters established in San Francisco.

In early 2025, Epoch spun out from its fiscal sponsor and began operating as an independent 501(c)(3) nonprofit organization. This transition marked their maturation from a research project to a fully independent institution. Key 2025 developments included:

  • Launch of the Epoch Capabilities Index (ECI) in October 2025, a unified metric combining scores from 37 benchmarks
  • Completion of FrontierMath Tier 4, commissioned by OpenAI, featuring 50 research-level mathematics problems
  • Publication of more plots and visualizations in 2025 than in all previous years combined
  • Development of the GATE model for forecasting AI’s economic impact

| Name | Role | Background | Key Contributions |
|---|---|---|---|
| Jaime Sevilla | Director | Mathematics, Computer Science | Founded Epoch; leads research on AI forecasting and trends |
| Tamay Besiroglu | Research Advisor (former co-founder) | Economics of computing | Co-authored founding paper; now leads Mechanize startup |
| Lennart Heim | Co-founder | Computer Engineering | Compute governance research; hardware tracking |
| Pablo Villalobos | Researcher | Statistics | Data constraints research; “Will We Run Out of Data?” paper |
| Marius Hobbhahn | Co-founder | ML, Physics | Compute trends analysis |
| Anson Ho | Co-founder | ML, Software Engineering | Database development; trend analysis |
| Eduardo Infante-Roldan | Co-founder | Economics | Economic modeling |

The current team of approximately 34 employees includes researchers with diverse backgrounds:

  • Machine Learning researchers: Core technical expertise in model architectures and training
  • Economists: Analyze AI’s economic implications and build forecasting models
  • Statisticians: Develop rigorous methodologies for trend analysis
  • Policy analysts: Translate research findings for governance contexts
  • Data engineers: Maintain and expand the organization’s databases

A critical contribution of Epoch AI is their rigorous methodology for estimating the compute used to train machine learning models. This methodology enables accurate comparisons across models and time periods, forming the foundation for their scaling analysis.

Epoch uses two complementary approaches to estimate training compute:

| Method | Inputs Required | When Used | Precision |
|---|---|---|---|
| Architecture-based | Model architecture, training data size, parameter count | When architecture details are published | High |
| Hardware-based | GPU type, training time, utilization rate | When hardware details are available | Moderate |

Architecture-based estimation: Epoch maintains detailed tables of common neural network layers, estimating parameters and FLOP per forward pass. For many layers, the forward pass costs roughly 2 FLOP per parameter per token; the backward pass adds roughly twice the forward-pass FLOP, yielding the common heuristic: total training FLOP ≈ 6 × parameters × training tokens.

Hardware-based estimation: When architecture details are unavailable, Epoch calculates compute from GPU training time multiplied by peak GPU performance, adjusted by utilization rate. Their empirical analysis found utilization rates typically range from 0.3 to 0.75 depending on architecture and batch size.

  • The backward/forward FLOP ratio is “very likely 2:1” after correcting for common counting errors
  • The “Theory method” multiplies forward pass FLOP by 3.0 to account for backward pass
  • Larger batch sizes yield more consistent utilization rates
  • Parameter sharing (as in CNNs) and word embeddings require special handling
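
These two estimation routes can be sketched in a few lines of Python. The function names and example figures below are illustrative, not Epoch's actual code:

```python
def training_flop_architecture(params: float, tokens: float) -> float:
    """Architecture-based estimate: C ≈ 6 * N * D.

    ~2 FLOP per parameter per token for the forward pass, plus roughly
    twice that for the backward pass (3x forward in total).
    """
    return 6 * params * tokens


def training_flop_hardware(num_chips: int, peak_flops: float,
                           seconds: float, utilization: float = 0.4) -> float:
    """Hardware-based estimate: chips * peak FLOP/s * time * utilization.

    Empirical utilization rates typically fall in the 0.3-0.75 range.
    """
    return num_chips * peak_flops * seconds * utilization


# Illustrative GPT-3-scale inputs: 175B parameters, 300B tokens.
flop = training_flop_architecture(175e9, 300e9)  # ≈3e23 FLOP, close to GPT-3's reported compute
```

When both architecture and hardware details are public, the two methods can cross-check each other.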

| Metric | Finding | Time Period | Source |
|---|---|---|---|
| Compute Growth | 4.4x per year | 2010-2025 | Epoch Trends |
| Doubling Time | ≈5-6 months | Deep Learning era (2012+) | arXiv:2202.05924 |
| Pre-Deep Learning Doubling | ≈20 months | Before 2010 | Moore’s Law trajectory |
| Models above 10^25 FLOP | 30+ models from 12 developers | As of June 2025 | Epoch Data |
| Global AI Chip Capacity | 15M+ H100-equivalents | 2025 | AI Chip Sales |

Epoch’s foundational 2022 paper identified three distinct eras with different compute scaling dynamics:

| Era | Period | Doubling Time | Characteristics |
|---|---|---|---|
| Pre-Deep Learning | Before 2010 | ≈20 months | Followed Moore’s Law; academic-dominated |
| Deep Learning | 2010-2015 | ≈5-6 months | Rapid scaling; breakthrough architectures |
| Large-Scale | 2015-present | ≈5-6 months | 2-3 OOM more compute than previous trend; industry-dominated |

This analysis corrected earlier estimates (Amodei and Hernandez 2018) that suggested 3.4-month doubling, finding the actual rate closer to 5-6 months with approximately 10x more data points.
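
The growth multiples and doubling times quoted throughout are two views of the same quantity: doubling time in months equals 12 · ln 2 / ln(annual growth). A quick consistency check, using an illustrative helper rather than anything from Epoch's codebase:

```python
import math


def doubling_time_months(annual_growth: float) -> float:
    """Convert an annual growth multiple into a doubling time in months."""
    return 12 * math.log(2) / math.log(annual_growth)


print(doubling_time_months(4.4))  # ≈5.6 months: the "5-6 month" training-compute doubling
print(doubling_time_months(3.3))  # ≈7.0 months: the global chip-capacity doubling
```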

Epoch’s database reveals the dramatic scaling of AI training compute:

| Year | Representative Model | Training Compute (FLOP) | Approximate Cost |
|---|---|---|---|
| 2012 | AlexNet | 10^17 | Thousands |
| 2017 | Transformer (original) | 10^18 | Tens of thousands |
| 2020 | GPT-3 | 10^23 | Millions |
| 2023 | GPT-4 | 10^25 | Tens of millions |
| 2024 | Frontier models | 10^26 | ≈$100 million |
| 2027 (proj.) | Next-gen frontier | 10^27+ | Greater than $1 billion |

The first model trained at the 10^25 FLOP scale was GPT-4, released in March 2023. As of June 2025, Epoch has identified over 30 publicly announced AI models from 12 different developers that exceed this threshold.
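
The jump from GPT-3 to today's frontier follows directly from compounding the 4.4x annual growth rate. A hypothetical helper makes the check explicit:

```python
def projected_compute(base_flop: float, years: float, growth: float = 4.4) -> float:
    """Extrapolate frontier training compute at `growth`x per year."""
    return base_flop * growth ** years


# GPT-3 (2020, ~1e23 FLOP) extrapolated 5 years at 4.4x/yr:
print(f"{projected_compute(1e23, 5):.1e}")  # ~1.6e26, consistent with 2024-25 frontier runs
```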

| Metric | Annual Growth | Current Status |
|---|---|---|
| GPU FLOP/s (FP32) | 1.35x | Continuing Moore’s Law trajectory |
| GPU FLOP/s (FP16) | Similar to FP32 | Optimized for ML workloads |
| NVIDIA Total Compute | 2.3x since 2019 | Hopper generation: 77% of total |
| Global AI Compute Capacity | 3.3x per year (7-month doubling) | 15M+ H100-equivalents total |

Epoch’s AI Chip Sales data explorer is the most comprehensive public dataset tracking global AI compute capacity across vendors:

| Vendor | Coverage | Key Metrics Tracked |
|---|---|---|
| NVIDIA | Primary | GPU sales, FLOP capacity, power consumption |
| Google (TPU) | Included | Custom silicon production |
| Amazon (Trainium) | Included | Cloud AI accelerators |
| AMD | Included | MI series GPUs |
| Huawei | Included | Ascend chips (domestic China) |

Key finding: Global computing capacity has been growing by 3.3x per year, equivalent to a doubling time of approximately 7 months.

Epoch’s analysis of AI supercomputer trends projects significant scaling challenges:

| Year | Projected Chips | Estimated Cost | Power Required |
|---|---|---|---|
| 2024 | ≈100,000 | ≈$10 billion | ≈1 GW |
| 2027 | ≈500,000 | ≈$50 billion | ≈4 GW |
| 2030 | ≈2 million | ≈$200 billion | ≈9 GW |

The 9 GW power requirement for 2030 frontier training represents the equivalent of 9 nuclear reactors—a scale beyond any existing industrial facility. This represents a potential binding constraint on AI scaling.

| Region | Share of AI Supercomputer Capacity | Trend |
|---|---|---|
| United States | ≈75% | Dominant and growing |
| China | ≈15% | Second place, facing chip restrictions |
| Europe | ≈5% | Limited domestic capacity |
| Other | ≈5% | Emerging efforts |

The shift from academic to industry dominance has been dramatic:

| Year | Industry Share | Academic/Government Share |
|---|---|---|
| 2019 | ≈40% | ≈60% |
| 2022 | ≈65% | ≈35% |
| 2025 | ≈80% | ≈20% |

Epoch’s influential research on training data constraints (the “data wall”) has become central to discussions of AI scaling limits. Their paper “Will We Run Out of ML Data?” projects when AI development may exhaust human-generated training data.

| Data Source | Current Status | Exhaustion Projection (80% CI) |
|---|---|---|
| Public web text | Heavily utilized | 2026-2028 |
| Books and academic papers | Largely incorporated | 2027-2030 |
| All human-generated text | Approaching limits | 2026-2032 |

The exact date depends on scaling assumptions. According to researcher Tamay Besiroglu: “There is a serious bottleneck here. If you start hitting those constraints about how much data you have, then you can’t really scale up your models efficiently anymore. And scaling up models has been probably the most important way of expanding their capabilities.”

| Overtraining Factor | Data Exhaustion Year | Example |
|---|---|---|
| Compute-optimal (1x) | ≈2028 | Enough for 5x10^28 FLOP model |
| 5x overtrained | ≈2027 | Common practice |
| 10x overtrained | ≈2026-2027 | Llama 3-70B level |
| 100x overtrained | ≈2025 | Extreme efficiency |

Epoch’s analysis has evolved based on new evidence:

  • The effectiveness of carefully filtered web data and multi-epoch training has substantially increased estimates of available high-quality data
  • After accounting for data quality, availability, multiple epochs, and multimodal tokenizer efficiency, Epoch estimates 400 trillion to 20 quadrillion tokens available for training by 2030
  • This allows for training runs from 6x10^28 to 2x10^32 FLOP
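
Those FLOP figures follow from a Chinchilla-style rule of thumb: at roughly 20 tokens per parameter, a token budget D supports a compute-optimal run of about C = 6ND = 0.3·D². This back-of-envelope sketch (not Epoch's exact calculation) reproduces the order of magnitude:

```python
def compute_optimal_flop(tokens: float, tokens_per_param: float = 20.0) -> float:
    """FLOP for a compute-optimal run on `tokens` tokens.

    C = 6*N*D with the Chinchilla heuristic D = 20*N, so C = 0.3*D^2.
    """
    params = tokens / tokens_per_param
    return 6 * params * tokens


print(f"{compute_optimal_flop(400e12):.0e}")  # ~5e28 FLOP (low end: 400 trillion tokens)
print(f"{compute_optimal_flop(20e15):.0e}")   # ~1e32 FLOP (high end: 20 quadrillion tokens)
```

Both results land within a small factor of the 6x10^28 to 2x10^32 range above; the residual gap reflects differing tokens-per-parameter assumptions.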

Epoch identifies three categories of innovation that could extend the scaling runway:

| Mitigation | Mechanism | Status |
|---|---|---|
| Synthetic data | AI-generated training data | Active research; quality concerns remain |
| Multimodal data | Images, video, audio expand data pool | Increasingly used |
| Data efficiency | Better algorithms require less data | Ongoing improvements |

While Sam Altman noted OpenAI experiments with “generating lots of synthetic data,” he expressed reservations: “There’d be something very strange if the best way to train a model was to just generate, like, a quadrillion tokens of synthetic data and feed that back in.” Research shows training on AI-generated data can produce “model collapse” with degraded outputs.

The Epoch Capabilities Index (ECI), launched in October 2025, represents a major methodological advance in measuring AI progress. As individual benchmarks saturate, ECI provides a unified scale for comparing models across time.

ECI combines scores from 37 distinct benchmarks into a single “general capability” scale, similar to how IQ tests capture broad underlying capability:

| Aspect | Details |
|---|---|
| Benchmarks included | 37 distinct benchmarks |
| Evaluations used | 1,123 distinct evaluations |
| Models covered | 147 models |
| Time span | December 2021 - December 2025 |
| Methodology basis | “A Rosetta Stone for AI Benchmarks” (collaboration with Google DeepMind AGI Safety team) |

ECI scores function like Elo ratings: absolute values are less meaningful than relative comparisons. The scale is linear, so a 10-point jump should be equally significant whether moving from 100 to 110 or from 140 to 150.

Epoch’s analysis reveals a significant acceleration in AI capabilities progress:

| Period | Annual ECI Growth | Key Driver |
|---|---|---|
| December 2021 - April 2024 | ≈8 points/year | Scaling laws, architecture improvements |
| April 2024 - December 2025 | ≈15 points/year | Reasoning models, reinforcement learning |
| Acceleration | ≈90% | Coincides with rise of o1-style reasoning models |

This acceleration is corroborated by METR’s Time Horizon benchmark, which found a ~40% acceleration in task completion capabilities starting around the same period.

FrontierMath is Epoch’s benchmark of original, expert-crafted mathematics problems designed to evaluate advanced reasoning capabilities—problems that typically require hours or days for expert mathematicians to solve.

| Aspect | Details |
|---|---|
| Total problems | 350 (300 base + 50 Tier 4 expansion) |
| Collaborating mathematicians | 60+ from leading institutions |
| Notable contributors | 14 International Mathematical Olympiad gold medalists, 1 Fields Medal recipient |
| Problem domains | Computational number theory, abstract algebraic geometry, and other advanced fields |
| Commissioning | OpenAI commissioned the core 300 problems |

| Tier | Problem Count | Difficulty | Typical Human Solve Time |
|---|---|---|---|
| Tier 1 | ≈100 | Advanced undergraduate | Hours |
| Tier 2 | ≈100 | Graduate level | Hours to days |
| Tier 3 | ≈100 | Research level | Days |
| Tier 4 | 50 | Short research projects | Days to weeks |

While leading AI models achieve near-perfect scores on traditional math benchmarks (GSM-8k, MATH), FrontierMath reveals substantial gaps:

| Model | FrontierMath Score | Traditional Benchmarks | Notes |
|---|---|---|---|
| GPT-4o, Claude 3.5 | Less than 2% | Greater than 90% on MATH | Baseline frontier models |
| o3 (December 2024) | ≈25% (announced) | Near-perfect on MATH | Pre-release version |
| o3 (April 2025 release) | ≈10% | Near-perfect on MATH | Official Epoch evaluation |

The discrepancy between o3’s announced 25% and measured 10% reflects differences in model versions and benchmark composition over time. Both the model and benchmark changed between December 2024 and April 2025.

FrontierMath addresses two critical challenges:

  1. Benchmark saturation: Traditional math benchmarks no longer differentiate frontier models
  2. Data contamination: Using entirely new, unpublished problems with automated verification

The GATE model (Growth and AI Transition Endogenous) is Epoch’s integrated assessment model of AI’s economic impact, published in 2025 (arXiv:2503.04941).

GATE models an automation feedback loop: investments drive increases in compute for training and deploying AI, which leads to gradual task automation, which generates returns enabling further investment.
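
That loop can be caricatured in a few lines of Python. Every functional form and constant below is invented purely for illustration; GATE is a calibrated integrated assessment model, not this toy:

```python
import math


def toy_automation_loop(years: int = 10, invest_share: float = 0.05):
    """Minimal caricature of an investment -> compute -> automation -> growth loop.

    All functional forms and constants are invented for illustration only.
    """
    gwp, compute = 100.0, 1.0  # arbitrary units
    automation = 0.0
    for _ in range(years):
        compute += invest_share * gwp               # investment buys compute
        automation = 1 - math.exp(-0.01 * compute)  # diminishing returns to compute
        gwp *= 1 + 0.02 + 0.3 * automation          # baseline growth plus automation boost
    return gwp, automation


final_gwp, final_automation = toy_automation_loop()
```

Even this crude version shows the qualitative dynamic: growth compounds faster as the automated share rises, which is why GATE's investment and growth projections are so sensitive to the automation schedule.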

| Metric | GATE Projection | Context |
|---|---|---|
| AI Investment Peak | Greater than 10% of world GDP | ≈50x increase over current levels |
| Growth at 30% automation | Greater than 20% annual GWP growth | Comparable to industrial revolution peaks |
| Growth at 40% automation | ≈12% annual GWP growth | Comparable to East Asian miracle economies |

Epoch explicitly cautions against treating GATE outputs as precise quantitative predictions. The model illustrates key dynamics rather than providing forecasts. According to their analysis: “These findings suggest that those who are confidently either extremely skeptical or extremely bullish about an unprecedented growth acceleration due to AI are likely miscalibrated.”

A public GATE playground allows users to modify parameters and explore scenarios.

Epoch’s research has directly influenced major AI governance frameworks. Their compute trend data provides the empirical foundation for regulatory thresholds.

| Policy | Threshold | Epoch’s Role |
|---|---|---|
| EU AI Act | 10^25 FLOP for systemic risk models | Epoch data cited in JRC technical documents |
| US Executive Order 14110 | 10^26 FLOP for reporting requirements | Threshold informed by Epoch trend analysis |
| UK Frontier AI Safety | Uses compute as capability proxy | Methodology collaboration with UK DSIT |

The EU AI Act explicitly references the statistical relationship between training compute and model capabilities documented by Epoch, noting that “performance of 231 language models (measured in log-perplexity) against scale (measured in FLOP)” shows clear trends.

| Government Body | Engagement Type | Year |
|---|---|---|
| UK DSIT | Consultation on “Frontier AI: capabilities and risks” | 2023 |
| JRC European Commission | Collaboration on AI Act technical documentation | 2023-2024 |
| House of Lords | Evidence submission on language models | 2023 |
| NIST | Input on AI Risk Management Framework | 2023-2024 |
| US OSTP | Briefings on compute trends | 2023-2024 |

Epoch’s analysis of how many models will exceed compute thresholds directly informs regulatory planning:

| Threshold | Models (June 2025) | Developers |
|---|---|---|
| 10^23 FLOP | Hundreds | Dozens |
| 10^25 FLOP | 30+ | 12 |
| 10^26 FLOP | Several | Major labs |

| Publication | Year | Key Finding | Citation |
|---|---|---|---|
| “Compute Trends Across Three Eras of Machine Learning” | 2022 | 4.4x annual growth; 5-6 month doubling; three distinct eras | arXiv:2202.05924 |
| “Will We Run Out of ML Data?” | 2022 | Data exhaustion projected 2026-2032 | Epoch Blog |
| “Estimating Training Compute of Deep Learning Models” | 2022 | Methodology for FLOP estimation | Epoch Blog |
| “Can AI Scaling Continue Through 2030?” | 2024 | Analysis of compute, data, energy constraints | Epoch Blog |
| “AI Capabilities Progress Has Sped Up” | 2024 | ≈90% acceleration since April 2024 | Epoch Data Insights |
| “FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI” | 2024 | Frontier models solve less than 2% | arXiv:2411.04872 |
| “GATE: An Integrated Assessment Model for AI Automation” | 2025 | Economic modeling of AI transition | arXiv:2503.04941 |
| “How Well Did Forecasters Predict 2025 AI Progress?” | 2025 | Metacognitive evaluation of forecasting | Epoch Blog |
| “Global AI Computing Capacity is Doubling Every 7 Months” | 2025 | 15M+ H100-equivalents; 3.3x annual growth | Epoch Data Insights |

Epoch AI has raised approximately $7 million through September 2025, primarily from Coefficient Giving grants.

| Grant | Amount | Purpose | Date |
|---|---|---|---|
| General Support (2022) | $1,960,000 | Initial organizational support | 2022 |
| General Support (2023) | $4,132,488 | Two-year general support | 2023 |
| Worldview Investigations | $188,558 | AI-related worldview research | 2023 |
| General Support (2025) | Undisclosed | Independent operations | 2025 |

  • Carl Shulman: $100,000 individual donation
  • Various individual donors
  • Contract revenue from clients including AI labs and government offices

Coefficient Giving has cited Epoch as producing “world-class work that is widely read, used, and shared.”

Comparison with Other Forecasting Organizations


| Organization | Focus | Methodology | Key Strength | Compute Expertise |
|---|---|---|---|---|
| Epoch AI | Empirical AI trends | Database analysis, benchmark development | Hardware/compute tracking; 3,200+ models | Primary focus |
| Metaculus | Crowd forecasting | Prediction aggregation | Diverse questions; large forecaster base | Questions only |
| Our World in Data | Data visualization | Curates authoritative sources | Broad topic coverage; accessibility | Uses Epoch data |
| AI Impacts | AI forecasting | Expert surveys, trend extrapolation | Timeline estimates | Moderate |
| QURI | Epistemic tools | Software development | Probabilistic modeling | Limited |

Our World in Data directly incorporates Epoch’s compute trend data in their AI visualizations, extending Epoch’s reach to a broader audience.

| Strength | Evidence |
|---|---|
| Comprehensive data | Most complete public database: 3,200+ ML models from 1950-present |
| Transparent methodology | Open documentation of compute estimation methods; peer-reviewed publications |
| Policy relevance | Directly cited in EU AI Act, US EO 14110; collaborations with UK DSIT, JRC |
| Regular updates | Databases continuously maintained; published more plots in 2025 than all previous years |
| Methodological innovation | ECI provides unified capability measurement; FrontierMath addresses benchmark saturation |
| Industry recognition | New York Times “Good Tech Awards” 2024; praised for “rigor and empiricism” |

| Limitation | Implication |
|---|---|
| Historical focus | Primarily backward-looking; projections carry significant uncertainty |
| Compute-centric | Algorithmic efficiency improvements harder to quantify than hardware scaling |
| Industry opacity | Labs don’t disclose training details; estimates rely on public information |
| Threshold arbitrariness | 10^25 FLOP thresholds are useful proxies but don’t directly measure capability |
| US-centric | Limited visibility into Chinese AI development due to information barriers |
| Funding concentration | Heavy reliance on Coefficient Giving creates potential dependency |

Epoch fills a crucial gap in the AI ecosystem by providing rigorous empirical grounding for discussions that previously relied on intuition and speculation. Before Epoch, claims about AI progress rates were often based on anecdotes or marketing materials. Epoch’s systematic data collection enables evidence-based analysis of trends that matter for policy and planning.

Their methodology for compute estimation has become the de facto standard, cited by academics, policymakers, and industry alike. The FrontierMath benchmark addresses a genuine problem—traditional benchmarks saturating—with a thoughtful approach using novel problems from credentialed experts.

Key Questions (6)
  • How reliable are compute estimates for models where labs don't disclose training details?
  • Will the historical relationship between compute and capabilities continue, or are we approaching diminishing returns?
  • Can policy thresholds based on compute (10^25 FLOP) remain meaningful as algorithmic efficiency improves?
  • How should Epoch's projections be weighted against insider knowledge from labs?
  • Will data constraints prove as binding as projected, or will synthetic data and efficiency gains extend the runway?
  • Can a small organization maintain comprehensive coverage as AI development accelerates globally?

Views on Epoch AI's Contribution: Value and Limitations of Empirical AI Tracking (3 perspectives)

Essential Infrastructure (dominant view)

Epoch provides irreplaceable empirical grounding for AI policy and research. Their data and analysis have elevated discourse from speculation to evidence-based discussion. Expansion of their work should be a priority.

Held by: AI governance researchers · Policymakers · Safety-focused AI researchers

Valuable but Incomplete (moderate support)

Epoch's compute tracking is useful but overemphasizes hardware relative to algorithms and data quality. Capability improvements from techniques like RLHF and chain-of-thought are harder to quantify but equally important.

Held by: ML researchers · Industry practitioners

Projection Skepticism (minority view)

Historical trends are useful for backward-looking analysis but extrapolation to future capabilities is unreliable. Past growth rates may not predict discontinuities or saturation.

Held by: Forecasting skeptics · Some AI researchers

| Date | Event |
|---|---|
| February 2022 | “Compute Trends Across Three Eras” published; paper goes viral in the AI research community |
| April 2022 | Epoch AI founded; team of 7 researchers; fiscally sponsored by Rethink Priorities |
| 2022 | First Coefficient Giving (then Open Philanthropy) grant ($1.96M) |
| 2022 | “Will We Run Out of ML Data?” published; establishes data wall projections |
| 2023 | Database grows to 800+ models; UK DSIT and JRC collaborations begin |
| 2023 | Additional Coefficient Giving (then Open Philanthropy) grants ($4.3M); House of Lords evidence submission |
| 2024 | FrontierMath benchmark developed with 60+ mathematicians; OpenAI commissions 300 problems |
| 2024 | Database exceeds 3,200 models; AI Chip Sales tracker launched |
| 2024 | New York Times “Good Tech Awards” recognition |
| December 2024 | o3 announced with 25% FrontierMath score (later measured at ≈10%) |
| Early 2025 | Spin-out from fiscal sponsor to independent 501(c)(3) |
| 2025 | GATE economic model published; 15M+ H100-equivalents tracked |
| October 2025 | Epoch Capabilities Index (ECI) launched with 37 benchmarks |
| 2025 | Published more visualizations than all previous years combined |