Longterm Wiki
Updated 2026-03-13
Expected Value of AI Safety Research

Analysis

AI Safety Research Value Model

Economic model analyzing AI safety research returns, recommending 3-10x funding increases from current ~$500M/year to $2-5B, with highest marginal returns (5-10x) in alignment theory and governance research, which currently receive only 10% of funding each. Provides specific allocation recommendations across philanthropic ($600M-1B), industry ($600M), and government ($1B) sources, with concrete investment priorities and timelines.

Model Type: Cost-Effectiveness Analysis
Scope: Safety Research ROI
Key Insight: Safety research value depends critically on timing relative to capability progress

Overview

This economic model quantifies the expected value of marginal investments in AI safety research. Current global spending of ≈$500M annually on safety research appears significantly below optimal levels, with the analysis suggesting 2-5x returns available in neglected areas.

Key findings: Safety research could reduce AI catastrophic risk by 20-40% over the next decade, with particularly high returns in alignment theory and governance research. Current 100:1 ratio of capabilities to safety spending creates systematic underinvestment in risk mitigation.

The model incorporates deep uncertainty about AI risk probabilities (1-20% existential risk this century), tractability of safety problems, and optimal resource allocation across different research approaches.

Risk/Impact Assessment

| Factor | Assessment | Evidence | Source |
|---|---|---|---|
| Current underinvestment | High | 100:1 capabilities vs. safety ratio | Epoch AI (2024) |
| Marginal returns | Medium-High | 2-5x potential in neglected areas | Coefficient Giving |
| Timeline sensitivity | High | Value drops 50%+ if timelines <5 years | AI Impacts survey |
| Research direction risk | Medium | 10-100x variance between approaches | Analysis based on expert interviews |

Strategic Framework

Core Expected Value Equation

EV = P(AI catastrophe) × R(research impact) × V(prevented harm) - C(research costs)

Where:
- P ∈ [0.01, 0.20]: probability of a catastrophic AI outcome
- R ∈ [0.05, 0.40]: fractional risk reduction from research
- V ≈ $10¹⁵-$10¹⁷: value of prevented catastrophic harm
- C ≈ $10⁹: annual research investment
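A minimal numeric sketch of the equation above, using the document's own parameter bounds (the two corner cases are illustrative extremes, not forecasts):

```python
# Expected value of safety research: EV = P * R * V - C
# Parameter ranges are the model's stated bounds.

def expected_value(p_catastrophe, risk_reduction, prevented_harm, cost):
    """EV of a research portfolio: avoided expected harm minus cost."""
    return p_catastrophe * risk_reduction * prevented_harm - cost

# Pessimistic corner: P = 1%, R = 5%, V = $10^15, C = $10^9/year
low = expected_value(0.01, 0.05, 1e15, 1e9)
# Optimistic corner: P = 20%, R = 40%, V = $10^17, same cost
high = expected_value(0.20, 0.40, 1e17, 1e9)

print(f"EV range: ${low:.3g} to ${high:.3g}")
# Even the pessimistic corner (~$5e11) dwarfs the ~$1e9 annual cost,
# which is the model's core underinvestment claim.
```

Note that the conclusion is driven almost entirely by the P × R × V term; the cost term matters only at far smaller values of V.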

Investment Priority Matrix

| Research Area | Current Annual Funding | Marginal Returns | Evidence Quality |
|---|---|---|---|
| Alignment theory | $50M | High (5-10x) | Low |
| Interpretability | $175M | Medium (2-3x) | Medium |
| Evaluations | $100M | High (3-5x) | High |
| Governance research | $50M | High (4-8x) | Medium |
| RLHF/fine-tuning | $125M | Low (1-2x) | High |

Source: Author estimates based on Anthropic, OpenAI, DeepMind public reporting

Resource Allocation Analysis

Current vs. Optimal Distribution

| Area | Current Share | Recommended | Change | Rationale |
|---|---|---|---|---|
| Alignment theory | 10% | 20% | +$50M | High theoretical returns, underinvested |
| Governance research | 10% | 15% | +$25M | Policy leverage, regulatory preparation |
| Evaluations | 20% | 25% | +$25M | Near-term safety, measurable progress |
| Interpretability | 35% | 30% | -$25M | Well-funded, diminishing returns |
| RLHF/fine-tuning | 25% | 10% | -$75M | May accelerate capabilities |
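The dollar changes in the table follow from applying each share shift to the ~$500M current funding base (base figure from the 2024 funding section); a quick sanity check:

```python
# Recompute the allocation table's dollar changes from share shifts.
# The ~$500M base is the document's current-funding estimate.
BASE = 500e6  # total annual safety funding, USD

shifts = {  # area: (current share, recommended share)
    "Alignment Theory":    (0.10, 0.20),
    "Governance Research": (0.10, 0.15),
    "Evaluations":         (0.20, 0.25),
    "Interpretability":    (0.35, 0.30),
    "RLHF/Fine-tuning":    (0.25, 0.10),
}

for area, (cur, rec) in shifts.items():
    delta = (rec - cur) * BASE
    print(f"{area:20s} {delta/1e6:+.0f}M")

# Shifts net to zero: this is reallocation, not new money.
assert abs(sum((r - c) * BASE for c, r in shifts.values())) < 1.0
```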

Actor-Specific Investment Strategies

Philanthropic Funders ($200M/year current)

Recommended increase: 3-5x to $600M-1B/year

| Priority | Investment | Expected Return | Timeline |
|---|---|---|---|
| Talent pipeline | $100M/year | 3-10x over 5 years | Long-term |
| Exploratory research | $200M/year | High variance | Medium-term |
| Policy research | $100M/year | High if timelines short | Near-term |
| Field building | $50M/year | Network effects | Long-term |

Key organizations: Coefficient Giving, Future of Humanity Institute, Long-Term Future Fund

AI Labs ($300M/year current)

Recommended increase: 2x to $600M/year

  • Internal safety teams: Expand from 5-10% to 15-20% of research staff
  • External collaboration: Fund academic partnerships, open source safety tools
  • Evaluation infrastructure: Invest in red-teaming, safety benchmarks

Source: Analysis of Anthropic, OpenAI, and DeepMind public commitments

Government Funding ($100M/year current)

Recommended increase: 10x to $1B/year

| Agency | Current | Recommended | Focus Area |
|---|---|---|---|
| NSF | $20M | $200M | Basic research, academic capacity |
| NIST | $30M | $300M | Standards, evaluation frameworks |
| DARPA | $50M | $500M | High-risk research, novel approaches |

Comparative Investment Analysis

Returns vs. Other Interventions

| Intervention | Cost per QALY | Probability Adjustment | Adjusted Cost |
|---|---|---|---|
| AI safety (optimistic) | $0.01 | P(success) = 0.3 | $0.03 |
| AI safety (pessimistic) | $1,000 | P(success) = 0.1 | $10,000 |
| Global health (GiveWell) | $100 | P(success) = 0.9 | $111 |
| Climate change mitigation | $50-500 | P(success) = 0.7 | $71-714 |

QALY = Quality-Adjusted Life Year. Analysis based on GiveWell methodology
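The adjusted-cost column is simply the raw cost per QALY divided by the probability of success; a minimal sketch using the table's figures:

```python
# Probability-adjusted cost per QALY: raw cost / P(success).
# Input figures are the document's table entries.

def adjusted_cost(cost_per_qaly, p_success):
    """Expected cost per QALY once the chance of failure is priced in."""
    return cost_per_qaly / p_success

print(adjusted_cost(100, 0.9))   # global health (GiveWell): ~$111
print(adjusted_cost(1000, 0.1))  # pessimistic AI safety: ~$10,000
print(adjusted_cost(0.01, 0.3))  # optimistic AI safety: ~$0.03
```

Dividing by P(success) is the reason the pessimistic AI safety case ends up 100x worse than global health despite a raw cost only 10x higher.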

Risk-Adjusted Portfolio

| Risk Tolerance | AI Safety Allocation | Other Cause Areas | Rationale |
|---|---|---|---|
| Risk-neutral | 80-90% | 10-20% | Expected value dominance |
| Risk-averse | 40-60% | 40-60% | Hedge against model uncertainty |
| Very risk-averse | 20-30% | 70-80% | Prefer proven interventions |

Current State & Trajectory

2024 Funding Landscape

Total AI safety funding: ≈$500-700M globally

| Source | Amount | Growth Rate | Key Players |
|---|---|---|---|
| Tech companies | $300M | +50%/year | Anthropic, OpenAI, DeepMind |
| Philanthropy | $200M | +30%/year | Coefficient Giving, FTX regrants |
| Government | $100M | +100%/year | NIST, UK AISI, EU |
| Academia | $50M | +20%/year | Stanford HAI, MIT, Berkeley |

2025-2030 Projections

Scenario: Moderate scaling

  • Total funding grows to $2-5B by 2030
  • Government share increases from 15% to 40%
  • Industry maintains 50-60% share
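Compounding the 2024 growth rates forward shows why $2-5B counts as "moderate": sustained current rates would overshoot that range, so the scenario implicitly assumes growth slows. A quick projection (illustrative arithmetic on the funding-landscape table):

```python
# Project 2024 funding forward at the table's stated growth rates.
# Illustrative only: it assumes each rate holds for six straight years.

sources = {  # name: (2024 funding in USD, annual growth rate)
    "Tech companies": (300e6, 0.50),
    "Philanthropy":   (200e6, 0.30),
    "Government":     (100e6, 1.00),
    "Academia":       (50e6,  0.20),
}

years = 2030 - 2024
total_2030 = sum(amount * (1 + rate) ** years
                 for amount, rate in sources.values())
print(f"2030 total at current rates: ${total_2030 / 1e9:.1f}B")
# Well above the $2-5B moderate scenario, driven mostly by the
# government line doubling every year.
```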

Bottlenecks limiting growth:

  1. Talent pipeline: ~1,000 qualified researchers globally
  2. Research direction clarity: Uncertainty about most valuable approaches
  3. Access to frontier models: Safety research requires cutting-edge systems

Source: Future of Humanity Institute talent survey, author projections

Key Uncertainties & Research Cruxes

Fundamental Disagreements

| Dimension | Optimistic View | Pessimistic View | Current Evidence |
|---|---|---|---|
| AI risk level | 2-5% x-risk probability | 15-20% x-risk probability | Expert surveys show 5-10% median |
| Alignment tractability | Solvable with sufficient research | Fundamentally intractable | Mixed signals from early work |
| Timeline sensitivity | Decades to solve problems | Need solutions in 3-7 years | Capability acceleration suggests shorter timelines |
| Research transferability | Insights transfer across architectures | Approach-specific solutions | Limited evidence either way |

Critical Research Questions

Empirical questions that would change investment priorities:

  1. Interpretability scaling: Do current techniques work on 100B+ parameter models?
  2. Alignment tax: What performance cost do safety measures impose?
  3. Adversarial robustness: Can safety measures withstand optimization pressure?
  4. Governance effectiveness: Do AI safety standards actually get implemented?

Information Value Estimates

Value of resolving key uncertainties:

| Question | Value of Information | Timeline to Resolution |
|---|---|---|
| Alignment difficulty | $1-10B | 3-7 years |
| Interpretability scaling | $500M-5B | 2-5 years |
| Governance effectiveness | $100M-1B | 5-10 years |
| Risk probability | $10-100B | Uncertain |
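Figures like these can be framed as expected value of perfect information (EVPI): how much better the funding decision gets if the uncertainty is resolved before committing. A toy sketch with illustrative payoffs (these inputs are hypothetical, not the table's actual derivation):

```python
# Toy EVPI for one binary uncertainty: "is alignment tractable?"
# All payoff numbers below are illustrative assumptions.

p_tractable = 0.5          # prior that a large research push pays off
value_if_tractable = 20e9  # net value if it works
value_if_not = -2e9        # wasted spend if it does not

# Best action under uncertainty: invest iff prior expected value > 0.
ev_invest = p_tractable * value_if_tractable + (1 - p_tractable) * value_if_not
ev_no_info = max(ev_invest, 0.0)

# With perfect information, invest only in the tractable world.
ev_perfect = p_tractable * value_if_tractable

evpi = ev_perfect - ev_no_info
print(f"EVPI = ${evpi / 1e9:.1f}B")  # value of resolving the question first
```

With these inputs the EVPI is $1B, the money saved by not funding the push in the intractable world, which is the same order of magnitude as the table's alignment-difficulty entry.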

Implementation Roadmap

2025-2026: Foundation Building

Year 1 Priorities ($1B investment)

  • Talent: 50% increase in safety researchers through fellowships, PhD programs
  • Infrastructure: Safety evaluation platforms, model access protocols
  • Research: Focus on near-term measurable progress

2027-2029: Scaling Phase

Years 2-4 Priorities ($2-3B/year)

  • International coordination on safety research standards
  • Large-scale alignment experiments on frontier models
  • Policy research integration with regulatory development

2030+: Deployment Phase

Long-term integration

  • Safety research embedded in all major AI development
  • International safety research collaboration infrastructure
  • Automated safety evaluation and monitoring systems

See Also

  • Pre-TAI Capital Deployment — How $100-300B+ gets allocated across the AI industry before transformative AI
  • Safety Spending at Scale — Analysis of safety budgets as AI labs scale to billions in annual spending
  • Frontier Lab Cost Structure — Breakdown of where frontier lab budgets go (compute, talent, safety, overhead)
  • AI Talent Market Dynamics — Competition for scarce AI researchers and its effect on safety capacity

Sources & Resources

Academic Literature

| Paper | Key Finding | Relevance |
|---|---|---|
| Ord (2020) | 10% x-risk this century | Risk probability estimates |
| Amodei et al. (2016) | Safety research agenda | Research direction framework |
| Russell (2019) | Control problem formulation | Alignment problem definition |
| Christiano (2018) | IDA proposal | Specific alignment approach |

Research Organizations

| Organization | Focus | Annual Budget | Key Publications |
|---|---|---|---|
| Anthropic | Constitutional AI, interpretability | $100M+ | Constitutional AI paper |
| MIRI | Agent foundations | $5M | Logical induction |
| CHAI | Human-compatible AI | $10M | CIRL framework |
| ARC | Alignment research | $15M | Eliciting latent knowledge |

Policy Resources

| Source | Type | Key Insights |
|---|---|---|
| NIST AI Risk Management Framework | Standards | Risk assessment methodology |
| UK AI Safety Institute | Government research | Evaluation frameworks |
| EU AI Act | Regulation | Compliance requirements |
| RAND AI strategy | Analysis | Military AI implications |

Funding Sources

| Funder | Focus Area | Annual AI Safety | Application Process |
|---|---|---|---|
| Coefficient Giving | Technical research, policy | $100M+ | LOI system |
| Future Fund | Longtermism, x-risk | $50M+ | Grant applications |
| NSF | Academic research | $20M | Standard grants |
| Survival and Flourishing Fund | Existential risk | $10M | Quarterly rounds |

References

1. Epoch AI (2024) — Epoch AI
6. DeepMind — Google DeepMind
7. Open Philanthropy grants database — openphilanthropy.org. Grants across multiple domains including global health, catastrophic risks, and scientific progress.
8. Future of Humanity Institute
10. Anthropic — research across AI alignment, interpretability, and societal impacts, aimed at understanding and mitigating risks from increasingly capable AI systems.
11. OpenAI
12. Google DeepMind
13. NSF — nsf.gov (government)
14. Guidelines and standards — NIST (government)
15. DARPA — darpa.mil
16. GiveWell — givewell.org
17. AI Impacts 2023 — AI Impacts
19. Concrete Problems in AI Safety — Dario Amodei et al., arXiv, 2016 (paper)
— Center for Human-Compatible AI (CHAI) — research on reorienting AI toward systems that are fundamentally beneficial and aligned with human values.
23. AI Safety Institute — UK AI Safety Institute (government)
24. EU AI Office — European Union
26. Future Fund — ftxfuturefund.org
27. NSF — nsf.gov (government)
28. Survival and Flourishing Fund — survivalandflourishing.fund. A virtual fund supporting existential-risk and AI safety organizations through its S-Process; over $152 million distributed since 2019.