Compute & Hardware
- Algorithmic efficiency improvements are substantially outpacing Moore's Law: the compute needed to reach a given performance level halves every 8 months (95% CI: 5-14 months), versus Moore's Law's 2-year doubling time.
- Training compute for frontier AI models has grown 4-5x annually since 2010, with over 30 models trained at GPT-4 scale (10²⁵ FLOP) as of mid-2025, suggesting regulatory thresholds may need frequent updates.
- Data center electricity consumption is projected to grow from 415 TWh in 2024 to 945 TWh by 2030 (nearly 3% of global electricity), a roughly 15% annual growth rate, four times faster than total electricity growth; AI-specific consumption (40 TWh in 2024) is the fastest-growing component.
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Data Completeness | High for public metrics | Epoch AI, IEA reports, company filings |
| Training Compute Growth | 4-5x per year since 2010 | 30+ models at GPT-4 scale (10²⁵ FLOP) as of mid-2025 |
| Algorithmic Efficiency | Doubles every 8 months (95% CI: 5-14) | Epoch AI research on language models |
| Market Concentration | NVIDIA holds 80-90% share | Data center GPU revenue, CUDA ecosystem lock-in |
| Energy Trajectory | 15% annual growth to 2030 | IEA projects 945 TWh by 2030 (3% of global electricity) |
| Key Constraint | Packaging (CoWoS) more than wafers | HBM supply and advanced packaging limit GPU production |
| China Gap | 1-2 node generations behind | SMIC 7nm vs. TSMC 3nm/2nm; Huawei yields at 20-50% |
Overview
Compute and hardware metrics are fundamental to understanding AI progress. The availability of specialized AI chips (especially GPUs), total compute used for training, and efficiency improvements determine what models can be built and how quickly capabilities advance. These metrics also inform regulatory thresholds and help forecast future AI development trajectories.
AI Hardware Supply Chain
1. GPU Manufacturing & Distribution
Annual GPU Production (2023-2025)
| Year | H100/H100-Equivalent | Total Data Center GPUs | Key Notes |
|---|---|---|---|
| 2022 | ~0 (A100 era) | 2.64M | Pre-H100, primarily A100s |
| 2023 | ~0.5M | 3.76M | H100 ramp-up begins |
| 2024 | ~2.0M | ~3.0M H100-equiv | Primarily H100 and early Hopper |
| 2025 (proj) | 2M Hopper + 5M Blackwell | 6.5-7M | Shift to Blackwell architecture |
Customer Orders (2024): Microsoft purchased 485,000 Hopper AI chips—twice the amount bought by Meta (approximately 240,000), according to Statista market data.
Data Quality: Medium-High. Based on Epoch AI estimates, industry reports, and TSMC capacity analysis.
Sources: Epoch AI GPU production tracking, Tom’s Hardware H100 projections
Cumulative Installed Base
As of mid-2024, Epoch AI estimates approximately 4 million H100-equivalent GPUs (4e21 FLOP/s) deployed globally. This represents cumulative sales of roughly 3 million H100s between 2022-2024, accounting for depreciation.
The stock of computing power from NVIDIA chips has grown about 2.3x per year since 2019, equivalent to doubling roughly every 10 months.
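These figures can be cross-checked with a few lines of arithmetic (an illustrative sketch; the per-GPU throughput is simply implied by dividing the quoted totals):

```python
# Cross-checking the installed-base figures above.
total_flops = 4e21   # FLOP/s, global installed base (mid-2024 estimate)
gpu_count = 4e6      # H100-equivalents
print(f"Implied throughput per H100-equivalent: {total_flops / gpu_count:.0e} FLOP/s")  # ~1e15

# A 10-month doubling time corresponds to the quoted ~2.3x annual growth:
print(f"Annual growth from 10-month doubling: {2 ** (12 / 10):.2f}x")  # ~2.30x
```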
Major Lab Holdings (End of 2024 estimates):
- OpenAI: ~250k average, ramping to 460k H100-equivalents by year-end (5% of global supply)
- Anthropic: ~360k H100-equivalents (4% of global supply), including 400k Amazon Trainium2
- Google: Largest holder with proprietary TPUs plus GPUs (21% of global AI compute)
- Meta: 13% of global AI compute share
Data Quality: Medium. Based on cost reports, capacity estimates, and informed analysis from industry observers.
Sources: LessWrong GPU estimates (CharlesD, 2024), Epoch AI computing capacity
2. AI Training Compute (FLOP)
Cumulative Global Training Compute
Training compute for frontier AI models has grown 4-5x per year since 2010, accelerating to 5x per year since 2020. According to Epoch AI, this growth rate has been consistent across frontier models, large language models, and models from leading companies.
Notable Training Runs:
| Model | Year | Training Compute | Cost Estimate | Notes |
|---|---|---|---|---|
| GPT-3 | 2020 | ~3×10²³ FLOP | ~$5M | Foundation of modern LLMs |
| GPT-4 | 2023 | ~1×10²⁵ FLOP | $40-100M | First model at 10²⁵ scale |
| GPT-4o | 2024 | ~3.8×10²⁵ FLOP | $100M+ | Largest documented 2024 model |
| Gemini 1.0 Ultra | 2024 | ~2×10²⁵ FLOP | $192M | Most expensive confirmed training |
| Llama 3.1 405B | 2024 | ~1×10²⁵ FLOP | ~$50M+ | Trained on 15T tokens |
| Projected 2027 frontier | 2027 | ~2×10²⁸ FLOP | $1B+ | ~2,000x GPT-4 scale |
Growth in Large-Scale Models (Epoch AI data insights):
- 2020: Only 2 models trained with greater than 10²³ FLOP
- 2023: Over 40 models at this scale
- Mid-2025: Over 30 models trained at greater than 10²⁵ FLOP (GPT-4 scale)
- By 2028: Projected 165 models at greater than 10²⁵ FLOP; 81 models at greater than 10²⁶ FLOP
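Taken at face value, the jump from 30 models in mid-2025 to a projected 165 by 2028 implies roughly 1.8x annual growth in the number of models at this scale (a back-of-envelope sketch, not Epoch's actual forecasting method):

```python
# Implied annual growth in the count of models above 10^25 FLOP,
# from 30 (mid-2025) to a projected 165 (2028), treated as ~3 years apart.
count_2025, count_2028, years = 30, 165, 3
growth = (count_2028 / count_2025) ** (1 / years)
print(f"Implied growth in model count: {growth:.2f}x per year")  # ~1.8x/yr
```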
Regulatory Thresholds:
- EU AI Act: 10²⁵ FLOP reporting requirement
- US Executive Order 14110: 10²⁶ FLOP reporting requirement
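A minimal sketch of how these thresholds classify the training runs tabulated above (FLOP figures are the estimates quoted in this section):

```python
# Which documented training runs cross each regulatory reporting threshold?
EU_AI_ACT = 1e25     # FLOP, EU AI Act reporting requirement
US_EO_14110 = 1e26   # FLOP, US Executive Order 14110 reporting requirement

models = {
    "GPT-3 (2020)": 3e23,
    "GPT-4 (2023)": 1e25,
    "GPT-4o (2024)": 3.8e25,
    "Gemini 1.0 Ultra (2024)": 2e25,
    "Llama 3.1 405B (2024)": 1e25,
    "Projected 2027 frontier": 2e28,
}
for name, flop in models.items():
    flags = [label for label, t in (("EU", EU_AI_ACT), ("US", US_EO_14110)) if flop >= t]
    print(f"{name:<26} {flop:.1e} FLOP  {flags or ['below both']}")
```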
Cost Trajectory: The cost of training frontier AI models has grown 2-3x per year for the past eight years, suggesting that the largest models will cost over a billion dollars by 2027 (arXiv analysis).
Data Quality: High for published models, Medium-Low for unreleased/future models.
Sources: Epoch AI model database, Our World in Data AI training, Epoch AI tracking
3. Cost per FLOP (Declining Curve)
Hardware Price-Performance Trends
The cost of compute has declined dramatically, outpacing Moore’s Law by ~50x in recent years.
Key Metrics:
- Overall decline (2019-2025): FP32 FLOP cost decreased ~74% (2025 price = 26% of 2019 price)
- AI training cost decline: ~10x per year (50x faster than Moore’s Law)
- GPU price-performance: Doubling every 16 months on frontier chips
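Annualizing the figures above makes the comparison explicit (illustrative arithmetic only):

```python
# Annualizing the cost-per-FLOP figures above.
# 74% total decline over 2019-2025 (6 years): 2025 price = 26% of 2019 price.
annual_price_factor = 0.26 ** (1 / 6)
print(f"FP32 $/FLOP annual factor: {annual_price_factor:.2f}")  # ~0.80, i.e. ~20%/yr decline

# Frontier-GPU price-performance doubling every 16 months:
print(f"Price-performance growth: {2 ** (12 / 16):.2f}x per year")  # ~1.68x/yr

# Both are far slower than the ~10x/yr decline quoted for AI training cost,
# which additionally bundles algorithmic and systems-level efficiency gains.
```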
Historical Training Cost Examples:
- ResNet-50 image recognition: $1,000 (2017) → $10 (2019)
- ImageNet 93% accuracy: training cost halving every 9 months (2012-2022)
- GPT-4 equivalent model: $100M (2023) → ≈$20M (Q3 2023) → ≈$3M (efficiency optimized, 01.ai claim)
GPU Generation Improvements:
- A100 → H100: 2x price-performance in 16 months
- Expected trend: ~1.4x per year improvement for frontier chips
- Google TPU v5p (2025): 30% throughput improvement, 25% lower energy vs v4
Data Quality: High for historical data, Medium for projections.
Sources: Epoch AI training costs, ARK Invest AI training analysis, Our World in Data GPU performance
4. Training Efficiency (Algorithmic Progress)
Algorithmic improvements contribute roughly as much to AI progress as increased compute. According to Epoch AI research, the compute needed to achieve a given performance level has halved roughly every 8 months (95% CI: 5-14 months), far faster than Moore’s Law’s 2-year doubling time.
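Converting the halving time into an annual efficiency multiplier (a back-of-envelope sketch of the figures just quoted):

```python
# Converting the 8-month halving time into an annual efficiency multiplier.
central = 2 ** (12 / 8)
print(f"Central estimate: {central:.2f}x per year")  # ~2.83x

# The 95% CI of 5-14 months brackets the annual gain:
for months in (5, 14):
    print(f"{months}-month halving -> {2 ** (12 / months):.2f}x per year")  # ~5.28x / ~1.81x
```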
Algorithmic Progress Estimates
| Study | Annual Efficiency Gain | Methodology |
|---|---|---|
| Ho et al. 2024 | 2.7x (95% CI: 1.8-6.3x) | Language model benchmarks |
| Ho et al. 2025 | 6x per year | Updated methodology |
| OpenAI 2020 | ~4x per year | ImageNet classification |
| Epoch AI 2024 | 3x per year average | Cross-benchmark analysis |
Key Findings:
- Doubling time: Algorithms double effective compute every 8 months (95% CI: 5-14 months)
- Annual improvement rate: 2.7-6x per year in FLOP efficiency depending on methodology
- Contribution to progress: 35% from algorithmic improvements, 65% from scale (since 2014)
Major Sources of Efficiency Gains (arXiv research): Between 2017 and 2025, 91% of algorithmic progress at frontier scale comes from two innovations:
- Switch from LSTM to Transformer architecture
- Rebalancing to Chinchilla-optimal scaling
Specific Benchmarks:
- ImageNet classification: 44x less compute for AlexNet-level performance (2012-2024)
- Language modeling: algorithms account for a 22,000x improvement on paper (2012-2023), but directly measured innovations account for less than 100x; the gap is attributed to scale-dependent efficiency improvements
Inference Cost Reduction Example:
- GPT-3.5-equivalent model cost: $20 per million tokens (Nov 2022) to $0.07 per million tokens (Oct 2024)
- Total reduction: roughly 285x in under two years (annualized in the sketch below)
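The annualized rate implied by those two price points (a sketch; dates and prices are as quoted above):

```python
import math

# Annualized decline implied by the GPT-3.5-class inference prices above.
start_price, end_price = 20.0, 0.07   # $ per million tokens
months = 23                            # Nov 2022 -> Oct 2024
total = start_price / end_price
print(f"Total reduction: {total:.0f}x")                       # ~286x
print(f"Annualized: {total ** (12 / months):.0f}x per year")  # ~19x/yr
print(f"Price halving time: {months * math.log(2) / math.log(total):.1f} months")  # ~2.8
```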
Recent Efficiency Breakthroughs:
- DeepSeek V3: GPT-4o-level performance with fraction of training compute
- AlphaEvolve (Google DeepMind): 32.5% speedup for the FlashAttention kernel in Transformers
Data Quality: High. Based on rigorous academic research and reproducible benchmarks.
Sources: Epoch AI algorithmic progress, OpenAI efficiency research, arXiv algorithmic progress paper (Gundlach, Fogelson, Lynch et al., 2025)
5. Data Center Power Consumption for AI
Section titled “5. Data Center Power Consumption for AI”Current State (2024)
According to the IEA Energy and AI Report, data center electricity consumption has grown at 12% per year over the last five years.
Global Data Centers:
- Total electricity consumption: 415 TWh (1.5% of global electricity)
- AI-specific consumption: 40 TWh (up from 2 TWh in 2017)
- AI share of data center power: 5-15% currently, projected to reach 35-50% by 2030
Regional Breakdown (2024) per IEA analysis:
| Region | Data Center Consumption | Share of Global Total |
|---|---|---|
| United States | 183 TWh | 45% |
| China | 104 TWh | 25% |
| Europe | 62 TWh | 15% |
| Rest of World | 66 TWh | 15% |
United States (Pew Research):
- Data center consumption: 183 TWh (over 4% of US total, equivalent to Pakistan’s annual consumption)
- Growth: 58 TWh (2014) to 183 TWh (2024)
Future Projections (2025-2030)
Global (IEA projections):
- 2030 projection: 945 TWh (nearly 3% of global electricity)
- Annual growth rate: 15% per year (2024-2030)—4x faster than total electricity growth
- AI-optimized data centers: more than 4x growth by 2030
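The headline numbers are internally consistent, as a quick check shows (illustrative only):

```python
# Checking the IEA projection arithmetic quoted above.
base_2024, projected_2030 = 415, 945   # TWh, all data centers
implied = (projected_2030 / base_2024) ** (1 / 6) - 1
print(f"Implied annual growth 2024-2030: {implied:.1%}")  # ~14.7%, matching ~15%/yr
```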
Regional Growth to 2030 (IEA Base Case):
| Region | 2024 | 2030 Projection | Increase |
|---|---|---|---|
| United States | 183 TWh | 423 TWh | +130% |
| China | 104 TWh | 279 TWh | +170% |
| Europe | 62 TWh | 107+ TWh | +70% |
Server Type Breakdown:
- Accelerated servers (AI): 30% annual growth
- Conventional servers: 9% annual growth
Data Quality: High. Based on IEA, DOE, and industry analyses.
Sources: IEA Energy and AI Report, Pew Research data center energy, DOE data center report
6. Chip Fab Capacity for AI Accelerators
Section titled “6. Chip Fab Capacity for AI Accelerators”TSMC (Market Leader)
TSMC has committed 28% of its total wafer capacity to AI chip manufacturing. Advanced 3nm and 5nm nodes contribute approximately 74% of overall wafer revenue, and the AI/HPC segment accounts for 59% of total revenue (Spark analysis).
3nm Capacity Ramp (WCCFtech):
- Q3 2025: 3nm at 23% of total revenue (surpassing 5nm)
- Current production: 100,000-110,000 wafers/month
- End of 2025 target: 160,000 wafers/month
- NVIDIA adding 35,000 wafers/month in 3nm alone
2nm Node (N2) Roadmap (WCCFtech):
- Mass production: Q4 2025
- End of 2025: 45,000-50,000 wafers/month
- End of 2026: 100,000 wafers/month
- 2028: 200,000 wafers/month (including Arizona)
- Major customers: Apple (50% reserved), Qualcomm; NVIDIA starting 2027
US Expansion (Tom’s Hardware):
- Arizona Fab 1: 4nm production online (late 2024)
- Arizona Fab 2: 3nm production starting 2027 (ahead of schedule)
- Total US investment: $165 billion for three fabs, packaging, and R&D
TSMC Capacity Allocation
| Node | 2024 Status | 2025 Projection | 2026 Projection |
|---|---|---|---|
| 3nm | 100-110k wpm | 160k wpm | Fully booked |
| 2nm | Risk production | 45-50k wpm | 100k wpm |
| CoWoS packaging | Doubled 2024 | Doubling again | Critical constraint |
Samsung
Current/Near-term:
- 3nm SF3 (GAA): Available 2025
- 2nm SF2: Late 2025 start
- Monthly capacity target: 21k wpm by end of 2026 (163% increase from 2024)
Long-term:
- Sub-2nm target: 50-100k wpm by 2028
- Taylor, Texas fab: 93.6% complete (Q3 2024), full completion July 2026
Market Position:
- Gaining from TSMC capacity constraints
- Major wins: Tesla AI chips, AMD/Google considering 2nm production
Global Foundry Market
- 2024 growth: 11% capacity increase
- 2025 growth: 10% capacity increase (17% for leading-edge with 2nm ramp)
- 2026 capacity: 12.7M wafers per month
- Main constraint: Chip packaging (CoWoS) and HBM, not wafer production
Data Quality: High. Based on company reports, industry analysis, and fab construction tracking.
Sources: SEMI fab capacity report, TrendForce Samsung 2nm
7. GPU Utilization Rate at Major Labs
Current Understanding (2024):
- Training vs. Inference split: Currently ~80% training, ~20% inference
- Projected 2030 split: ~30% training, ~70% inference (reversal)
Lab-Specific Data:
OpenAI (2024):
- Training compute: $3B amortized cost
- Inference compute: $1.8B (likely an underestimate for a single year)
- Research compute: $1B
- Over a model’s lifetime, inference can cost 15-118x more than training
Historical Inference Ratios:
- Google (2019-2021): Inference = 60% of total ML compute (three-week snapshots)
- Inference costs grow continuously after deployment while training is one-time
Utilization Challenges:
- Packaging bottlenecks (CoWoS)
- HBM supply constraints
- Infrastructure development lag
Data Quality: Medium-Low. Most labs don’t publish utilization rates; estimates based on cost reports.
Sources: Epoch AI inference allocation, A&M training demand analysis
8. Inference vs. Training Compute Ratio
Current State:
- Industry split: 80% training, 20% inference (2024)
- OpenAI token generation: ~100B tokens/day = 36T tokens/year
- Training tokens for modern LLMs: ~10T tokens
- Token cost ratio: Training tokens ~3x more expensive than inference
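These figures imply that annual inference and training token budgets are already of similar magnitude, which is worth making explicit (a sketch; the 3x multiplier is the quoted per-token cost ratio):

```python
# Annual inference vs. training token budgets, in inference-token cost units.
inference_tokens_per_year = 100e9 * 365   # ~3.65e13 tokens served per year
training_tokens = 10e12                   # ~10T tokens for a modern LLM
training_cost_multiplier = 3              # a training token costs ~3x an inference token

print(f"Inference budget: {inference_tokens_per_year:.2e} token-units/yr")          # ~3.65e13
print(f"Training budget:  {training_tokens * training_cost_multiplier:.2e} units")  # ~3.00e13
# Similar magnitudes -- consistent with the near-balanced allocation (alpha ~ 1) below.
```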
Evolution:
- 2019-2021 (Google): 60% inference, 40% training (based on 3-week snapshots)
- 2024 (Industry): 80% training, 20% inference (during training surge)
- 2030 (Projected): 70% inference, 30% training (post-surge equilibrium)
Theoretical Optimal Allocation:
- For roughly equal value per compute in training vs. inference, the tradeoff parameter (α) must be near 1
- For significantly different allocations (10x difference), α must be below 0.1 or above 10
- Current industry behavior suggests α close to 1, hence similar magnitudes
Inference Growth Drivers:
- Deployment at scale requires continuous inference compute
- One-time training cost vs. ongoing serving costs
- By 2030, ~70% of data center AI demand projected to be inference
Data Quality: Medium. Based on partial disclosures and theoretical models.
Sources: Epoch AI compute allocation theory, Epoch AI OpenAI compute spend
9. GPT-4 Level Training Costs Projection
Section titled “9. GPT-4 Level Training Costs Projection”Current GPT-4 Training Costs
Initial Training (2023):
- Official estimate: “More than $100M” (Sam Altman)
- Epoch AI hardware/energy only: $40M
- Full cost estimates: $78-192M depending on methodology
GPT-4-Equivalent Training Costs (Optimized):
- Q3 2023: ≈$20M (with efficiency improvements)
- 01.ai claim: ≈$3M using 2,000 GPUs and optimization
Cost Trend Analysis
Training Cost Growth (Frontier Models):
- Historical trend: Tripling per year (4x compute growth, 1.3x efficiency gain)
- If trend continues: $1B+ training runs by 2027
- Dario Amodei (Aug 2024): “$1B models this year, $10B models by 2025”
Cost Decline (Equivalent Performance):
- Algorithmic efficiency: 2x every 9 months
- Hardware efficiency: 1.4x per year
- Combined: ~3.5x per year from these two factors alone; broader estimates (see Section 3 above) put the decline for equivalent capability at up to ~10x per year
When Will GPT-4-Level Training Cost Fall Under $1M?
Optimistic Scenario (efficiency improvements continue):
- 2023: $20M (optimized)
- 2024: $2M (10x reduction)
- 2025: $200k (crosses below the $1M threshold)
- 2026: ~$20k if the trend holds
Conservative Scenario (slower efficiency gains):
- Assume 3x annual reduction instead of 10x
- 2023: $20M
- 2025: $2.2M
- 2026: ~$740k (crosses below the $1M threshold)
- 2027: ~$250k (see the sketch below)
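Both scenarios reduce to a constant-factor geometric decay, sketched below (assumed constant annual reduction factors; purely illustrative):

```python
# Geometric-decay projection for GPT-4-level training cost under both scenarios.
def project(start_cost: float, start_year: int, annual_reduction: float, years: int):
    cost = start_cost
    for year in range(start_year, start_year + years + 1):
        marker = "  <- below $1M" if cost < 1e6 else ""
        print(f"{year}: ${cost:,.0f}{marker}")
        cost /= annual_reduction

print("Optimistic (10x/yr):")
project(20e6, 2023, 10, 4)   # crosses $1M in 2025
print("Conservative (3x/yr):")
project(20e6, 2023, 3, 4)    # crosses $1M in 2026
```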
Important Notes:
- These projections are for achieving GPT-4-level performance, not frontier capabilities
- Frontier models will continue to cost $100M-$1B+ as labs push boundaries
- The trend is divergent: equivalent performance gets cheaper while cutting-edge gets more expensive
Data Quality: Medium. Based on historical trends and partial cost disclosures.
Sources: Juma GPT-4 cost breakdown, Fortune AI training costs, arXiv training costs (Cottier, Rahman, Fattorini et al., 2024)
10. Nvidia’s AI Accelerator Market Share
Current Market Position (2024-2025) (Statista, Fortune Business Insights):
- Dominant share: 80-95% of AI accelerator market
- Conservative estimates: 70-86%
- Most commonly cited: 80-90%
Market Size (Grand View Research):
- 2024: $14.48B data center GPU market
- 2032 projected: $295B (implying a CAGR of roughly 45%)
- Alternative estimate (Precedence Research): $192B by 2034
Nvidia Revenue (Statista):
- FY 2024 data center revenue: $47.5B (216% YoY increase)
- Q3 2025 data center revenue: $30.8B (112% YoY)
- Data center share: 87% of Nvidia’s total revenue
Competitive Landscape:
| Company | 2025 Market Share | Key Products | Notes |
|---|---|---|---|
| Nvidia | 80-90% | H100, H200, Blackwell | CUDA lock-in, dominant position |
| AMD | ~8-10% | MI300 series | $5.6B projected (2025), doubling data center footprint |
| Intel | ~8% | Gaudi 3 | 8.7% of training accelerators by end 2025 |
| Google | Internal use | TPU v5p | $3.1B value (2025), custom deployment |
Nvidia’s Competitive Advantages:
- CUDA ecosystem: Deep software integration, high switching costs
- Performance leadership: H100/H200 industry standard
- Supply relationships: Preferential TSMC access
- First-mover advantage: Established during AI boom
Emerging Threats:
- Custom silicon (Google TPU, Amazon Trainium)
- Meta considering shift from CUDA to TPU (billions in spending)
- JAX job postings grew 340% vs. CUDA 12% (Jan 2025)
- Inference workloads bleeding to ASICs
Data Quality: High. Based on market research firms and financial disclosures.
Sources: PatentPC AI chip market stats, TechInsights Q1 2024, CNBC Nvidia market analysis
11. China’s Domestic AI Chip Production Capacity
Section titled “11. China’s Domestic AI Chip Production Capacity”Current Production Capacity (2024-2025)
SMIC (Semiconductor Manufacturing International Corporation) (Tom’s Hardware):
- Current 7nm capacity: approximately 30k wafers per month (wpm)
- 2025 target: 45-50k wpm advanced nodes
- 2026 projection: 60k wpm
- 2027 projection: 80k wpm (with yields potentially reaching 70%)
- Plans to double 7nm capacity in 2025 (most advanced process in mass production in China)
Huawei Ascend AI Chips (SemiAnalysis, Bloomberg):
| Metric | 2024 | 2025 (Projected) | 2026 (Projected) |
|---|---|---|---|
| Dies produced | 507k (mostly 910B) | 805k-1.5M | 1.2M+ (Q4 alone) |
| Packaged chips shipped | ~200k | 600-700k | ~600k (910C) |
| Yield rate (910C) | — | ~20-30% | Improving toward 70% target |
| Technology node | SMIC 7nm (DUV) | SMIC N+2 | Continued DUV |
Production Bottlenecks (SemiAnalysis):
1. HBM (High-Bandwidth Memory) - the critical constraint:
   - Huawei’s stockpile: 11.7M HBM stacks (7M from Samsung pre-restrictions)
   - Stockpile depletion: expected end of 2025
   - CXMT domestic production: ~2M stacks in 2026 (supports only 250-400k chips; see the sketch below)
2. Yield challenges (TrendForce):
   - Ascend 910C yield: ~20-30% (on older stockpiled equipment)
   - Ascend 910B yield: ~50%
   - Low yields force production cuts and order delays
   - Without EUV, advanced packaging, and unrestricted HBM access, chips remain constrained
3. TSMC die bank:
   - Huawei received 2.9M+ Ascend dies from TSMC (pre-sanctions)
   - This stockpile enables 2024-2025 production
   - Without the die bank, production would be much lower
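The CXMT figure can be sanity-checked against chips-per-stack assumptions (the stacks-per-chip values below are assumptions chosen to bracket the quoted 250-400k range, not numbers from the source):

```python
# How many Ascend-class chips can ~2M domestic HBM stacks support?
cxmt_stacks_2026 = 2_000_000
for stacks_per_chip in (5, 8):   # assumed range, chosen to bracket the quoted figures
    print(f"{stacks_per_chip} stacks/chip -> {cxmt_stacks_2026 // stacks_per_chip:,} chips")
# 8 stacks/chip -> 250,000 chips; 5 stacks/chip -> 400,000 chips (the quoted 250-400k range)
```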
Future Plans
Huawei Fab Buildout:
- Dedicated AI chip facility: End of 2025
- Additional sites: 2 more in 2026
- WFE (wafer fab equipment) spending: $7.3B (2024, up 27% YoY)
- Global ranking: 4th largest WFE customer (from zero in 2022)
Production Ramp Timeline:
- Q3 2024: Ascend 910B production ramp begins
- Q1 2025: Ascend 910C mass production starts (on SMIC N+2 process)
- 2025-2026: Continued ramp, constrained by HBM
Performance Gap
Huawei vs. Nvidia (Tom’s Hardware analysis):
- Huawei ecosystem scaling up but lags significantly on efficiency and performance
- Technology node: 7nm (Huawei/SMIC) vs. 4nm/3nm (Nvidia/TSMC)
- Memory bottleneck: Ascend chips cannot match NVIDIA’s HBM subsystem
- Export controls successfully limiting China’s access to cutting-edge AI chips
- Gap expected to persist due to continued US restrictions
Data Quality: Medium. Based on industry analysis, supply chain reports, and informed estimates.
Sources: Tom’s Hardware China AI chip production, SemiAnalysis Huawei production, WCCFtech Huawei capacity
12. Semiconductor Equipment Lead Times
Section titled “12. Semiconductor Equipment Lead Times”ASML Lithography Equipment
Historical Peak Lead Times (2022), at the height of the chip shortage:
- ArF immersion equipment: 24 months
- EUV equipment: 18 months
- I-line equipment: 18 months
- Industry average (all equipment): 14 months (up from 3-6 months pre-shortage)
Current State (2024-2025):
- Lead times have moderated from 2022 peak but remain “incredibly long”
- Foundries must plan capacity expansions well in advance
- Exact current lead times not publicly disclosed
ASML Production Capacity Targets:
| Equipment Type | 2025 Target | Medium-term Target |
|---|---|---|
| EUV 0.33 NA | 90 systems/year | Maintained |
| DUV (immersion + dry) | 600 systems/year | Maintained |
| EUV High-NA (0.55 NA) | - | ≈20 systems/year |
2024 Shipments (Actual):
- Total lithography: 418 systems
- EUV: 44 systems
- DUV: 374 systems
- Metrology/inspection: 165 systems
High-NA EUV Systems:
- Cost: $400M+ per system (vs. $200M for low-NA)
- First commercial deployment: Intel TWINSCAN EXE:5200B
- Status: Transition from low-NA to high-NA beginning 2024-2025
Market Concentration
ASML Market Dominance:
- Lithography equipment market share: ~94% (2024)
- Remaining 6%: Canon and Nikon
- Monopoly on EUV lithography (only supplier globally)
Geopolitical Constraints
China Export Restrictions:
- ASML expects China customer demand to decline significantly in 2026 vs. 2024-2025
- However, total 2026 net sales not expected to fall below 2025 levels (non-China growth compensates)
China’s EUV Development:
- Reports of prototype EUV lithography machine development
- Target: AI chip output by 2028 using domestic EUV
- Status: Early prototype, far from production capability
Lead Time Implications:
- Long lead times favor incumbents with existing allocations
- New entrants (especially geopolitically restricted) face multi-year delays
- Supply constraints on advanced packaging (CoWoS) now more critical than lithography
Data Quality: Medium-High. Based on ASML reports and industry analysis.
Sources: SMM ASML lead times, TrendForce ASML EUV analysis, Tom’s Hardware ASML capacity
Data Quality Summary
| Metric | Data Quality | Update Frequency | Key Gaps |
|---|---|---|---|
| GPU Production | Medium-High | Quarterly | Exact production numbers proprietary |
| Training Compute | High (public models) | Ongoing | Unreleased model estimates uncertain |
| Cost per FLOP | High | Annual | Future projections uncertain |
| Training Efficiency | High | Annual | Contribution breakdown debated |
| Data Center Power | High | Annual | AI-specific breakdown incomplete |
| Fab Capacity | High | Quarterly | Packaging/HBM constraints harder to track |
| GPU Utilization | Low | Rare | Most labs don’t disclose |
| Inference/Training Ratio | Medium | Rare | Industry-wide data sparse |
| Cost Projections | Medium | N/A | Depends on uncertain trends |
| Nvidia Market Share | High | Quarterly | Custom silicon market opaque |
| China Production | Medium | Quarterly | True yields/capacity uncertain |
| Equipment Lead Times | Medium | Annual | Real-time data proprietary |
Key Uncertainties & Debate
Section titled “Key Uncertainties & Debate”Algorithmic Progress Measurement
The actual contribution of algorithmic improvements vs. scale-dependent effects remains debated. Measured innovations account for less than 100x of the claimed 22,000x improvement, with the gap attributed to scaling effects that are harder to isolate.
Inference Compute Growth
Whether inference will truly dominate by 2030 depends on:
- Rate of model deployment at scale
- Efficiency improvements in inference
- Whether training runs continue to grow exponentially
China’s Production Reality
Estimates of China’s domestic chip production vary widely (200k to 1.5M dies) due to:
- Yield rate uncertainty
- HBM supply constraints
- Stockpile utilization vs. new production
- Lack of independent verification
GPU Utilization
Major labs don’t disclose actual utilization rates, training efficiency, or infrastructure bottlenecks. The 80/20 training/inference split is an industry estimate, not measured data.
Sources
This page synthesizes data from:
Primary Sources:
- Epoch AI - GPU production, training compute, model database
- IEA Energy and AI Report - Data center power consumption
- SEMI - Fab capacity and equipment
- Our World in Data - Long-term trends
- Stanford AI Index - Comprehensive annual metrics
Industry Analysis:
- TrendForce - Semiconductor production forecasts
- SemiAnalysis - Deep-dive industry analysis
- Tom’s Hardware - Hardware specifications and roadmaps
- Financial disclosures from Nvidia, TSMC, ASML
Research:
- Epoch AI algorithmic progress - Language model efficiency trends
- arXiv training costs (Cottier, Rahman, Fattorini et al., 2024) - Rising costs of frontier models
- Regulatory filings and government reports (DOE, EU AI Act)
Market Research:
- Statista AI statistics - Market size and revenue data
- Grand View Research - Market projections
- Pew Research - US data center energy
Last updated: December 2025