Longterm Wiki

Center for AI Safety (CAIS)

Safety Organization

Also known as: CAIS

Founded: 2022 (4 years old) · HQ: San Francisco · Website: safe.ai

The Center for AI Safety (CAIS) is a nonprofit organization that works to reduce societal-scale risks from AI. CAIS combines research, field-building, and public communication to advance AI safety. It was co-founded in 2022 by Dan Hendrycks (Executive Director) and Oliver Zhang (Managing Director).

Revenue
$10.2 million
as of 2024
Total Funding Raised
$33 million
as of 2025

Key Metrics

Revenue (ARR)

Revenue (ARR) chart: annual run rate grew from $6.7M in 2022 to $10M in 2024.
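As a quick sanity check on the chart figures above, the implied compound annual growth rate can be computed directly (a sketch; the $6.7M and $10M run-rate figures come from this page):

```python
# Implied annual growth of CAIS's revenue run rate,
# from $6.7M (2022) to $10M (2024), i.e. over 2 years.
start, end, years = 6.7, 10.0, 2
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # → 22.2%
```

That is, the reported run rate grew roughly 22% per year over the period shown.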

Facts

15 entries
Financial
Grant Received: $1.1 million
Total Funding Raised: $33 million
Net Assets: $11.6 million
Annual Expenses: $7.2 million
Revenue: $10.2 million
General
Website: https://www.safe.ai/
Organization
Headquarters: San Francisco
Founded Date: 2022
Other
Key Person: Dan Hendrycks
Compensation: Dan Hendrycks takes a $1 annual salary as Executive Director
Publication: The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning — benchmark for evaluating dual-use AI capabilities in biosecurity, cybersecurity, and chemical weapons
Infrastructure: Compute cluster with 256 NVIDIA A100 GPUs available for AI safety researchers
Program: ML Safety Scholars — educational program training hundreds of students in AI safety fundamentals. Includes online course, reading groups, and mentorship.
Board Member: Jaan Tallinn
Campaign: Statement on AI Risk (May 2023): one-sentence statement 'Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.' Signed by 350+ AI leaders including Geoffrey Hinton, Demis Hassabis, Sam Altman, and Dario Amodei.

Other Data

Publications
12 entries
Title · Type · Authors · URL · Published
Humanity's Last Exam · paper · Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li et al. · arxiv.org · 2025-01
Introduction to AI Safety, Ethics, and Society · book · Dan Hendrycks · aisafetybook.com · 2024-06
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning · paper · Nathaniel Li, Alexander Pan, Anjali Gopal et al. · wmdp.ai · 2024
Superintelligence Strategy · report · Dan Hendrycks, Eric Schmidt, Alexandr Wang · nationalsecurity.ai · 2024
Improving Alignment and Robustness with Circuit Breakers · paper · Andy Zou, Long Phan, Justin Wang et al. · arxiv.org · 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming · paper · Mantas Mazeika, Long Phan, Xuwang Yin et al. · harmbench.org · 2024
Representation Engineering: A Top-Down Approach to AI Transparency · paper · Andy Zou, Long Phan, Sarah Chen et al. · arxiv.org · 2023-10
An Overview of Catastrophic AI Risks · paper · Dan Hendrycks, Mantas Mazeika, Thomas Woodside · arxiv.org · 2023-06
Statement on AI Risk · policy brief · CAIS · aistatement.com · 2023-05
Universal and Transferable Adversarial Attacks on Aligned Language Models · paper · Andy Zou, Zifan Wang, Nicholas Carlini et al. · llm-attacks.org · 2023
Unsolved Problems in ML Safety · paper · Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt · arxiv.org · 2021-09
Measuring Massive Multitask Language Understanding (MMLU) · paper · Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt · arxiv.org · 2020-09

Divisions

6 entries
Program

3-month SF program. $25K stipend. PhD/JD researchers.

Program · Varun Krovi

501(c)(4) advocacy arm. DC-based. Co-sponsored SB 1047. Lobbying ~$490K/yr.

Lab

80 A100 GPUs. 150+ researchers. ~100 safety papers, 16,000+ citations. Free access. Schmidt Sciences partnership.

Program

Provides free compute access to academic AI safety researchers. One of the largest non-industry compute resources available for safety research.

Program

Programs to grow the AI safety research community, including the Statement on AI Risk signed by hundreds of researchers and the ML Safety course.

Team

Technical AI safety research on robustness, interpretability, and alignment. Led by Dan Hendrycks.

Related Wiki Pages

Top Related Pages

Approaches

Capability Unlearning / Removal · MAIM (Mutually Assured AI Malfunction) · AI Alignment · Corporate AI Safety Responses

Analysis

AI Compute Scaling Metrics · AI Safety Intervention Effectiveness Matrix · AI Uplift Assessment Model

Policy

Safe and Secure Innovation for Frontier Artificial Intelligence Models Act

Organizations

Anthropic · Center for Human-Compatible AI (CHAI) · Center for AI Safety Action Fund · Google DeepMind · US AI Safety Institute (now CAISI) · Redwood Research

Other

Geoffrey Hinton · Stuart Russell

Concepts

AGI Timeline

Key Debates

Is AI Existential Risk Real?

Risks

AI-Induced Irreversibility

Historical

The MIRI Era