Longterm Wiki

Center for AI Safety (CAIS)

Safety Organization
Founded: 2022 · HQ: San Francisco · safe.ai

Also known as: CAIS


The Center for AI Safety (CAIS) is a nonprofit organization that works to reduce societal-scale risks from AI. It combines research, field-building, and public communication to advance AI safety, and was co-founded in 2022 by Dan Hendrycks (Executive Director) and Oliver Zhang (Managing Director).

Revenue: $10M (as of 2024)
Total Funding Raised: $33M (as of 2025)
Annual Expenses: $7.2M (as of 2024)
Net Assets: $12M (as of 2024)

Key Metrics

Revenue (ARR)

Annual run rate grew from $6.7M in 2022 to $10M in 2024.

Funding Rounds

$21M total raised across six rounds:
FTX Future Fund Grant 2022 (2022): $6.5M
SFF General Support 2023-H2 (Jaan Tallinn) (2023): $909K
SFF General Support 2023-H1 (Jaan Tallinn) (2023): $22K
Open Philanthropy General Support 2023 (Apr 2023): $4M
Open Philanthropy General Support 2024 (2024): $8.5M
SFF General Support 2024 (Jaan Tallinn) (2024): $1.1M

Facts

15 entries
Financial
Grant Received: $1.1M
Total Funding Raised: $33M
Net Assets: $12M
Annual Expenses: $7.2M
Revenue: $10M
General
Website: https://www.safe.ai/
Organization
Headquarters: San Francisco
Founded Date: 2022
Other
Key Person: Dan Hendrycks
Compensation: Dan Hendrycks takes a $1 annual salary as Executive Director
Publication: The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning — benchmark for evaluating dual-use AI capabilities in biosecurity, cybersecurity, and chemical weapons
Infrastructure: Compute cluster with 256 NVIDIA A100 GPUs available for AI safety researchers
Program: ML Safety Scholars — educational program training hundreds of students in AI safety fundamentals. Includes online course, reading groups, and mentorship.
Board Member: Jaan Tallinn
Campaign: Statement on AI Risk (May 2023): one-sentence statement 'Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.' Signed by 350+ AI leaders including Geoffrey Hinton, Demis Hassabis, Sam Altman, and Dario Amodei.

Other Data

Entity Events
6 entries
Title | Date | Type | Description | Significance
Reported revenue of $10.2M (FY2024) | 2024 | milestone | Cumulative funding reaches ~$33M since founding ($6.7M in 2022, $16.1M in 2023, $10.2M in 2024). | moderate
Statement on AI Risk released | 2023-05 | milestone | One-sentence statement on AI extinction risk attracted signatures from over 350 AI researchers and industry figures, including Turing Award recipients (Hinton, Bengio, Russell) and CEOs of major AI labs (Altman, Amodei, Hassabis). | major
MACHIAVELLI benchmark released | 2023 | publication | Benchmark for evaluating goal-directed and deceptive behavior in AI systems. | moderate
Representation Engineering paper published | 2023 | publication | Methods for reading and steering model internal representations. | major
"Unsolved Problems in ML Safety" published | 2022 | publication | Taxonomy of open technical challenges in machine learning safety, intended partly as a research agenda for the field. | major
Founded by Dan Hendrycks and Oliver Zhang | 2022 | founding | Nonprofit research organization (EIN 88-1751310) focused on technical AI safety research, field-building, and public communication. | major
Publications
12 entries
Title | Type | Authors | URL | Published
Humanity's Last Exam | paper | Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li et al. | arxiv.org | 2025-01
Introduction to AI Safety, Ethics, and Society | book | Dan Hendrycks | aisafetybook.com | 2024-06
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | paper | Nathaniel Li, Alexander Pan, Anjali Gopal et al. | wmdp.ai | 2024
Superintelligence Strategy | report | Dan Hendrycks, Eric Schmidt, Alexandr Wang | nationalsecurity.ai | 2024
Improving Alignment and Robustness with Circuit Breakers | paper | Andy Zou, Long Phan, Justin Wang et al. | arxiv.org | 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming | paper | Mantas Mazeika, Long Phan, Xuwang Yin et al. | harmbench.org | 2024
Representation Engineering: A Top-Down Approach to AI Transparency | paper | Andy Zou, Long Phan, Sarah Chen et al. | arxiv.org | 2023-10
An Overview of Catastrophic AI Risks | paper | Dan Hendrycks, Mantas Mazeika, Thomas Woodside | arxiv.org | 2023-06
Statement on AI Risk | policy-brief | CAIS | aistatement.com | 2023-05
Universal and Transferable Adversarial Attacks on Aligned Language Models | paper | Andy Zou, Zifan Wang, Nicholas Carlini et al. | llm-attacks.org | 2023
Unsolved Problems in ML Safety | paper | Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt | arxiv.org | 2021-09
Measuring Massive Multitask Language Understanding (MMLU) | paper | Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt | arxiv.org | 2020-09

Divisions

6 entries
Program: 3-month SF program. $25K stipend. PhD/JD researchers.
Program (led by Varun Krovi): 501(c)(4) advocacy arm. DC-based. Co-sponsored SB 1047. Lobbying ~$490K/yr.
Lab: 80 A100 GPUs. 150+ researchers. ~100 safety papers, 16,000+ citations. Free access. Schmidt Sciences partnership.
Program: Provides free compute access to academic AI safety researchers. One of the largest non-industry compute resources available for safety research.
Program: Programs to grow the AI safety research community, including the Statement on AI Risk signed by hundreds of researchers and the ML Safety course.
Team: Technical AI safety research on robustness, interpretability, and alignment. Led by Dan Hendrycks.

Related Wiki Pages

Top Related Pages

Approaches

Capability Unlearning / Removal
MAIM (Mutually Assured AI Malfunction)
AI Alignment
Corporate AI Safety Responses

Analysis

AI Compute Scaling Metrics
AI Safety Intervention Effectiveness Matrix
AI Uplift Assessment Model

Policy

Safe and Secure Innovation for Frontier Artificial Intelligence Models Act

Organizations

Anthropic
Center for Human-Compatible AI (CHAI)
Center for AI Safety Action Fund
Google DeepMind
US AI Safety Institute (now CAISI)
Redwood Research

Other

Geoffrey Hinton
Stuart Russell

Concepts

AGI Timeline

Key Debates

Is AI Existential Risk Real?

Risks

AI-Induced Irreversibility

Historical

The MIRI Era