AGI Bottlenecks

The AGI Bottlenecks table maps the supply chain of frontier capability progress: what the next generation of capability gains requires more of, how constrained each input is today, and who controls it. Rows are inputs (compute, fabs, energy, data, talent, capital, regulation); columns are the dimensions along which an input can bind: present tightness, direction of change, time-to-relieve, cost-to-expand, controllers, and geographic concentration.
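
One way to hold a row of this table in code is a small record type. The sketch below is illustrative only: the names `Bottleneck`, `Tightness`, `Trajectory`, and every field name are assumptions introduced here, not part of the source table, and the badge vocabularies are limited to values that actually appear in the rows.

```python
from dataclasses import dataclass
from enum import Enum


class Tightness(Enum):
    """Present-tightness badges used in the table."""
    MODERATE = "Moderate"
    TIGHT = "Tight"
    BINDING = "Binding"


class Trajectory(Enum):
    """Direction-of-change badges used in the table."""
    LOOSENING = "Loosening"
    STABLE = "Stable"
    MIXED = "Mixed"
    TIGHTENING = "Tightening"


@dataclass(frozen=True)
class Bottleneck:
    """One table row (hypothetical schema; field names are illustrative)."""
    name: str
    tightness: Tightness
    trajectory: Trajectory
    time_to_relieve: str   # "Short", "Medium", "Long", or "Very Long"
    cost_to_expand: str    # "Low", "Medium", "High", "Very High", or "N/A"
    controllers: tuple[str, ...]
    concentration: str     # "Single Point", "Concentrated", "Distributed", "Global", "Regional"
    notes: str = ""


# Badges copied from the "Fab capacity" row.
fab = Bottleneck(
    name="Fab capacity",
    tightness=Tightness.BINDING,
    trajectory=Trajectory.STABLE,
    time_to_relieve="Very Long",
    cost_to_expand="Very High",
    controllers=("TSMC", "Samsung Foundry", "Intel Foundry", "ASML (EUV supply)"),
    concentration="Single Point",
)
```

Keeping the free-text inline note separate from the badge (here, in `notes`) matches the table's own layout, where each badge is paired with the facts behind it.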

The table is intended as a strategic map. Bottlenecks define where governance interventions have leverage: a binding constraint controlled by a single firm in a single country (leading-edge fabrication, EUV lithography) is a different policy target from a moderately tight constraint distributed across many actors (deployment infrastructure, capital). The companion Actor Power Scorecard scores who exerts power over the AI development trajectory; this table scores the inputs they have power over.
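
The distinction between policy targets can be sketched as a filter over (tightness, concentration) badge pairs. This is a minimal illustration under assumptions: the function name `chokepoints` and the selection rule are hypothetical choices made here, not a scoring method used by the table. The four sample rows copy their badges from the table.

```python
# (tightness, concentration) badges copied from four rows of the table.
ROWS = {
    "Fab capacity": ("Binding", "Single Point"),
    "HBM / memory supply": ("Binding", "Concentrated"),
    "Capital": ("Moderate", "Concentrated"),
    "Deployment infrastructure": ("Moderate", "Distributed"),
}


def chokepoints(rows, concentrations=("Single Point",)):
    """Return sorted names of rows that are Binding and whose
    geographic-concentration badge is in `concentrations`."""
    return sorted(
        name
        for name, (tightness, concentration) in rows.items()
        if tightness == "Binding" and concentration in concentrations
    )


print(chokepoints(ROWS))
print(chokepoints(ROWS, concentrations=("Single Point", "Concentrated")))
```

Widening `concentrations` is the design lever: the strict default isolates single-point chokepoints, while the looser call also picks up binding inputs whose supply is merely concentrated.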

Bottleneck	Tightness	Trajectory	Time to relieve	Cost to expand	Controllers	Geographic concentration	Strategic notes
Fab capacity: Leading-edge semiconductor fabrication (3nm and below), the upstream constraint behind nearly all frontier AI accelerators.	Binding: TSMC sub-3nm sold out 12+ months ahead; one credible supplier for frontier nodes	Stable: TSMC Arizona, Samsung Texas, Intel Foundry expanding; multi-year ramps	Very Long: Fab buildout 3-5 yr; equipment lead times 1-2 yr	Very High: $20-40B per leading-edge fab; only 3 firms can plausibly attempt	TSMC; Samsung Foundry; Intel Foundry; ASML (EUV supply)	Single Point: TSMC Taiwan dominant; ASML EUV monopoly. Geopolitical concentration risk.	The upstream chokepoint. US CHIPS Act and Taiwan tensions make this the bottleneck with the highest strategic salience.
HBM / memory supply: High-bandwidth memory (HBM3/HBM3E/HBM4), paired with logic dies in every frontier accelerator. Memory bandwidth, not FLOPS, often gates throughput.	Binding: SK Hynix HBM3E sold out into 2026; memory bandwidth gates accelerator throughput	Tightening: Each chip generation increases HBM stack count; demand outpaces DRAM fab additions	Medium: DRAM fab lead times shorter than logic; HBM packaging (CoWoS) is the bottleneck within the bottleneck	High: Multi-billion DRAM and packaging capex; CoWoS capacity at TSMC is gating	SK Hynix; Samsung; Micron; TSMC (CoWoS packaging)	Concentrated: South Korea dominant for HBM; Taiwan for advanced packaging	Often overlooked. The reason H100 and B200 supply is constrained is as much HBM and CoWoS as logic-die wafers.
Compute (training): Frontier training compute (H100/B200-class clusters at 100K+ accelerator scale) used for foundation-model training runs.	Tight: Frontier training runs use 100K+ H100-equivalents; allocations contested across hyperscalers	Tightening: Frontier training compute growing ~4-10x/yr; supply ramp lags demand	Long: New fab capacity 2-4 yr; new GPU generations on 1-2 yr cycle	Very High: Single frontier cluster $5-50B in capex	NVIDIA; TSMC; Hyperscalers (MSFT/AMZN/GOOG); Frontier labs	Concentrated: Most frontier training compute in US data centers; UAE/Singapore growing	The headline bottleneck. NVIDIA-led GPU supply, hyperscaler cluster build-out, and TSMC fab capacity are the three sub-constraints.
Compute (inference): Serving capacity for deployed models (accelerator-hours per token across cloud and edge deployment).	Tight: Major releases regularly hit capacity limits; reasoning models multiply inference cost 10-100x	Tightening: Demand growth (reasoning chains, agents) outpacing efficiency gains and capacity adds	Medium: Inference is more parallelizable; custom silicon (Trainium, TPU, Groq) helping	High: Substantial but lower per-FLOP than training; commoditizing	Hyperscalers (AWS/Azure/GCP); NVIDIA; Custom silicon (Trainium, TPU, Groq, Cerebras)	Distributed: Many regions, but US/EU dominant; latency forces geographic distribution	Less single-point-of-failure than training. Inference is where most users feel capacity constraints (rate limits, queues).
Energy / power: Grid interconnect capacity, generation, and transmission; increasingly the rate-limiter for new datacenter siting.	Tight: Multi-GW interconnects waitlisted 5-10 yr in major US grids; UK, Ireland, Netherlands at moratorium	Tightening: Gigawatt-scale clusters proposed; grid build-out 5-10x slower than datacenter demand	Very Long: Transmission line permitting 5-10 yr; new generation (nuclear, gas) similar	High: Interconnect upgrades $100M-$1B per site; new generation in $B	Utilities; FERC / state PUCs; Independent power producers; Local zoning authorities	Distributed: Many viable regions globally, but per-site permitting is the constraint	Replacing compute as the headline bottleneck for next-generation clusters. Stargate, Microsoft-Constellation, and Meta nuclear deals all reflect this.
High-quality training data: Pretraining corpora (text, code, and multimodal data) of sufficient quality to drive frontier model improvements.	Tight: Public web text approaching exhaustion at frontier scale; quality filtering aggressive	Mixed: Tightening for novel human-generated data; loosening via synthetic data and licensing deals	Medium: Synthetic generation pipelines, real-world telemetry, and licensed corpora maturing	Medium: Licensing deals (NYT, Reddit, Shutterstock) in $10s-$100s of millions; labeling at scale	Frontier labs (proprietary pipelines); Major publishers (NYT, Reddit, Stack Exchange); Data brokers and labeling firms; Synthetic-data toolchains	Global: Data sources are global; legal regimes (GDPR, copyright) regional	Synthetic-data viability is the open question. If synthetic scales, this bottleneck loosens; if not, real-data licensing becomes a moat.
Evaluation data: Benchmarks, held-out test sets, and dangerous-capability evals used to measure frontier-model progress and safety.	Tight: Public benchmarks saturating; contamination widespread; few credible held-out evals	Mixed: Public benchmarks tightening (saturation); private holdout sets from AISIs and labs emerging	Medium: Eval suite development 6-18 months; held-out test creation iterative	Low: Eval data cheap relative to training data; expert annotation is the cost driver	METR; US/UK AISI; Labs (private holdouts); Academic groups (BIG-Bench, HELM)	Global: Eval development globally distributed; AISIs concentrated in US/UK/EU	A governance-critical bottleneck. Bad evals → labs can't credibly self-report risk → safety commitments lose teeth.
Talent supply: Frontier ML researchers, alignment researchers, and senior engineering staff capable of operating at the capability frontier.	Tight: Top frontier-ML researchers ~1000s globally; alignment researchers ~100s; compensation reflects scarcity	Loosening: PhD pipelines growing, but slowly; lab hiring frenzy bids up the senior tier	Very Long: PhD pipelines 5-10 yr; senior engineering experience harder to manufacture	High: Senior researcher comp packages reach $1M-$10M; aggressive cross-lab poaching	Top frontier labs (compete for talent); Universities and PhD programs; Government immigration policy	Concentrated: US Bay Area, UK, China dominate; visa policy is a material lever	A slow-moving bottleneck. Less acute than compute today, but the long pipeline makes shocks (e.g. visa restrictions) hard to recover from.
Cooling / siting: Water rights, land, climate suitability, and zoning for hyperscale datacenters.	Moderate: Water-cooled designs constrained in arid regions; direct-liquid cooling reduces water demand	Stable: Air and direct-liquid cooling retrofits expanding; sites still available outside drought regions	Medium: Datacenter construction 1-2 yr once site and power are secured	Medium: Site acquisition and construction in $100M-$1B range	Hyperscalers; Local zoning boards; Water authorities	Distributed: Many viable global sites; concentration follows power and fiber availability	Less acute than power but interacts with it: water-cooled efficiency is what makes some 100MW+ sites viable.
Algorithmic insights: Research breakthroughs in architectures, training procedures, post-training methods, and reasoning techniques.	Moderate: Key insights diffuse fast via papers, code, and lab transitions; some methods stay private 6-18 months	Loosening: Open-weight releases (DeepSeek, Llama, Mistral) and academic publications keep diffusion fast	Short: Papers and code propagate in weeks once published	Low: Research is talent-intensive, not capex-intensive	Frontier labs (closed research); Academic community; Open-weight releasers (Meta, DeepSeek, Mistral, AI2)	Global: Research community is global; lab concentration in US/UK/China	The least supply-constrained bottleneck. The question is whether closed labs can sustain a multi-month research lead.
Capital: Investment capital for frontier training runs, infrastructure buildout, and operating losses during scaling.	Moderate: Frontier labs have access to $10s-$100s of billions; trailing labs more constrained	Loosening: Sovereign wealth (UAE, Saudi), hyperscaler commitments, Stargate-scale infrastructure deals	Short: Capital can be deployed in months when investor appetite exists	Low: Capital is fungible; the question is access, not creation	Hyperscalers (MSFT, AMZN, GOOG); Sovereign wealth (UAE, Saudi, Singapore); Mega-cap VC and strategic investors	Concentrated: US capital markets dominant; sovereign-wealth tier is the swing player	Capital is the least constrained input for top-3 labs. For everyone else it is the binding constraint.
Deployment infrastructure: APIs, enterprise integrations, agent frameworks, and tooling needed to put model capabilities in front of users.	Moderate: Major API providers mature; enterprise integration and agent tooling still nascent	Loosening: API ecosystems, agent frameworks (MCP, agent SDKs), and enterprise pipelines maturing	Short: Engineering effort scales with capital; not capacity-bound	Low: Engineering cost, not capex; standard SaaS economics	Major API providers (OpenAI, Anthropic, Google); Cloud platforms (AWS Bedrock, Azure OpenAI, Vertex); Enterprise integrators	Distributed: API access global where not legally restricted	Less a supply bottleneck than a maturity bottleneck. Affects deployment timelines more than capability frontier.
Regulatory friction: Legal, regulatory, and compliance constraints (EU AI Act, US export controls, sectoral rules, liability frameworks).	Moderate: EU AI Act and US export controls binding for some uses; not capability-gating for frontier training in the US	Tightening: EU enforcement ramping; US export controls expanding; California SB-53 and sectoral rules emerging	Long: Legislative cycles slow; rule-making takes years	N/A: This is a constraint, not a resource to expand	EU Commission; US BIS / White House; China State Council / CAC; California, NY, and state legislatures	Regional: Brussels Effect plus US-China bifurcation creating regional regulatory regimes	The bottleneck most amenable to deliberate governance intervention. Tightening it (more rules) reduces risk but slows deployment; loosening it does the opposite.

13 bottlenecks across 5 categories. Snapshot as of 2026-05; bottleneck dynamics change on quarter-to-year timescales.

What the table is not

This is a snapshot, not a forecast. The cells reflect mid-2026 conditions. Several of the rows have plausible scenarios in which their tightness flips within twelve months — HBM packaging capacity, training-compute interconnect availability, and US export-control scope are all moving fast. The footer in the table notes the snapshot date and update cadence.

The table also flattens several distinctions worth noting:

The compute rows do not distinguish accelerator generations. H100/H200/B200-class supply is binding today, while next-generation parts reflect scaling needs that have not yet reached the frontier. Cells reflect the present-generation gating constraint.
Algorithmic insights are scored as relatively unconstrained because public-research diffusion is fast, but specific frontier-lab insights (reasoning-model training recipes, post-training procedures) stay private for 6–18 months and confer real capability advantage during that window.
Regulatory friction is the only row where "cost to expand" does not apply — regulation is a constraint to bind, not a resource to expand. The cell is marked N/A.
Geographic concentration is scored by where supply is produced, not where it is deployed. Compute deployment is global; compute production is concentrated.

Methodology

Tightness, trajectory, time-to-relieve, cost-to-expand, and geographic-concentration cells are subjective expert judgments synthesized from public reporting through mid-2026. They are not derived from a single quantitative model. For each cell, the inline note captures the specific facts driving the rating; readers should weight the note more heavily than the badge.

The notes column is intended for strategic framing — the single most important thing to know about each bottleneck that the structured columns do not capture.

Related

Actor Power Scorecard: scores the actors exerting power over AI development; this table scores the inputs they have power over.
AI-Driven Concentration of Power: analysis of how control over bottleneck inputs concentrates strategic leverage.
Concentration of Power Systems Model: the causal model behind why bottleneck control matters.

Changelog

Date	Change
2026-05-11	Initial table created; 13 bottlenecks scored across 5 categories
