Growing the AI safety research community through funding, training, and outreach
International summits convening governments and AI labs to address AI safety
AI Safety via Debate uses adversarial AI systems arguing opposing positions to enable human oversight of superhuman AI. Recent empirical work shows...
Comprehensive guide to AI safety training programs including MATS (78% alumni in alignment work, 100+ scholars annually), Anthropic Fellows (\$2,10...
Comprehensive analysis of AI safety field-building showing growth from 400 to 1,100 FTEs (2022-2025) at 21-30% annual growth rates, with training p...
A well-organized reference overview of ~20 AI safety organizations categorized by function (alignment research, policy, field-building), with a com...
A comprehensive structured mapping of AI safety solution uncertainties across technical, alignment, governance, and agentic domains, using probabil...
Provides a strategic framework for AI safety resource allocation by mapping 13+ interventions against 4 risk categories, evaluating each on ITN dim...
Quantifies AI safety talent shortage: current 300-800 unfilled positions (30-50% gap) with training pipelines producing only 220-450 researchers an...
Safety cases are structured arguments adapted from nuclear/aviation to justify AI system safety, with UK AISI publishing templates in 2024 and 3 of...
Analysis of government AI Safety Institutes finding they've achieved rapid institutional growth (UK: 0→100+ staff in 18 months) and secured pre-dep...
Overview of national AI Safety Institutes (UK, US, and 11+ countries as of 2026) and intergovernmental bodies, covering budgets, mandates, and key ...
The May 2024 Seoul AI Safety Summit achieved voluntary commitments from 16 frontier AI companies (80% of development capacity) and established an 1...
The UK AI Safety Institute (renamed AI Security Institute in Feb 2025) operates with ~30 technical staff and a £50M annual budget, conducting fron...
Technical AI safety research encompasses six major agendas (mechanistic interpretability, scalable oversight, AI control, evaluations, agent founda...
Three international AI safety summits (2023-2025) achieved first formal recognition of catastrophic AI risks from 28+ countries, established 10+ AI...
The US AI Safety Institute (AISI), established in November 2023 within NIST with a \$10M budget (FY2025 request \$82.7M), conducted pre-deployment evalu...
This article synthesizes the relationship between political stability and AI safety across military, governance, and public trust dimensions, ident...
CAIS (Center for AI Safety) is a nonprofit research organization founded by Dan Hendrycks that has distributed compute grants to researchers, published technical AI safet...
NIST plays a central coordinating role in U.S. AI governance through voluntary standards and risk management frameworks, but faces criticism for te...
Economic model analyzing AI safety research returns, recommending 3-10x funding increases from current ~\$500M/year to \$2-5B, with highest margina...
The Singapore Consensus on Global AI Safety Research Priorities (arXiv:2506.20702) is a consensus document produced by the April 2025 SCAI conferen...
Major AI labs invest \$300-500M annually in safety (5-10% of R&D) through responsible scaling policies and dedicated teams, but face 30-40% safety ...
Comprehensive analysis showing open-source AI poses irreversible safety risks (fine-tuning removes safeguards with just 200 examples) while providi...
Formal verification seeks mathematical proofs of AI safety properties but faces a ~100,000x scale gap between verified systems (~10k parameters) an...
The Bletchley Declaration represents a significant diplomatic achievement in establishing international consensus on AI safety cooperation among 28...
Vitalik Buterin's 2021 donation of \$665.8M in cryptocurrency to FLI (the Future of Life Institute) was one of the largest single donations to AI safety in history. Beyond this l...
Comprehensive timeline of AI safety's transition from niche to mainstream (2020-present), documenting ChatGPT's unprecedented growth (100M users in...
Coefficient Giving (formerly Open Philanthropy) has directed \$4B+ in grants since 2014, including \$336M to AI safety (~60% of external funding). ...
Timeline of AI safety as a field
Interventions and approaches to address AI safety risks
Dustin Moskovitz and Cari Tuna have given \$4B+ since 2011, with ~\$336M (12% of total) directed to AI safety through Coefficient Giving (formerly ...
CNAS (Center for a New American Security) is a moderately important national security think tank with substantive but secondary relevance to AI safety, approaching AI risks through a g...
Arb Research is a small AI safety consulting firm that produces methodologically rigorous research and evaluations, particularly known for its AI...
Lighthaven is a Berkeley conference venue operated by Lightcone Infrastructure that serves as physical infrastructure for AI safety, rationality, a...
Biographical profile of Eli Lifland, a top-ranked forecaster and AI safety researcher who co-authored the AI 2027 scenario forecast and co-founded ...
An interactive sortable table summarizing which AI safety approaches are likely to generalize to future architectures. Shows generalization level, ...
Lionheart Ventures is a small venture capital firm (\$25M inaugural fund) focused on AI safety and mental health investments, notable for its inves...
Timelines Wiki is a specialized MediaWiki project documenting chronological histories of AI safety and EA organizations, created by Issa Rice with ...
The FTX Future Fund was a major longtermist philanthropic initiative that distributed \$132M in grants (including ~\$32M to AI safety) before c...
Profiles of influential researchers and leaders in AI safety
Holden Karnofsky directed \$300M+ in AI safety funding through Coefficient Giving (formerly Open Philanthropy), growing the field from ~20 to 400+ ...
FTX was a major crypto exchange that collapsed in November 2022 due to fraud, with its AI safety relevance stemming from FTX Future Fund grants to ...
Comprehensive analysis of coordination mechanisms for AI safety showing racing dynamics could compress safety timelines by 2-5 years, with \$500M+ ...
An interactive sortable table comparing 42 AI safety approaches on dimensions including safety uplift, capability uplift, net world safety, scalabi...
AI Watch is a tracking database by Issa Rice that monitors AI safety organizations, people, funding, and publications as part of his broader knowle...
Uncertainties that drive disagreement and prioritization in AI safety
Structured analysis of major disagreements in AI safety
Trackable signals for AI safety and risk dynamics
Comprehensive assessment of AI lab safety culture showing systematic failures: no company scored above C+ overall (FLI Winter 2025), all received D...