Longterm Wiki

Center for AI Safety (CAIS)

Safety Organization
Founded 2022 (4 years old) · HQ: San Francisco · safe.ai

Also known as: CAIS

Structured Facts
Database Records
Revenue: $10M (as of 2024)
Total Funding Raised: $33M (as of 2025)
Founded Date: 2022

Key People (7)
VK
Varun Krovi
Executive Director, CAIS Action Fund; Director of Government Relations & Public Policy
Leads CAIS Action Fund and government relations. 15+ years policy/advocacy experience, former Capitol Hill Chief of Staff. Confirmed via LegiStorm and LinkedIn as of 2026-04-22.
OZ
Oliver Zhang
Co-Founder and Managing Director
2022 – present
Co-founded CAIS in 2022 with Dan Hendrycks. Confirmed on Wikipedia and The Org as of 2026-04-22.
AZ
Andy Zou
Co-Founder
2022 – present
Co-founded CAIS; also a PhD student at CMU and founder of Gray Swan AI. Confirmed on personal site and Future of Life Institute as of 2026-04-22.
AZ
Research Scientist
Lead author of adversarial attacks and representation engineering papers
TW
Policy Director
SW
Scott Wiener
Key Legislator (SB 1047 sponsor)
Feb 2024 – present
California State Senator. Introduced SB 1047 in February 2024; it passed the legislature in August 2024 and was vetoed in September 2024. Also authored SB 53 (2025). Per Wikipedia.
JE
Chief Operating Officer
Start date unknown; confirmed on safe.ai/about as of 2026-03-16.

Funding History (6)
Open Philanthropy General Support 2024 (grant, 2024)
$8.5M · Led by Open Philanthropy
$8,500,000 general support grant from Open Philanthropy in 2024. Confirmed via OP 2024 progress report.
openphilanthropy.org

SFF General Support 2024 (Jaan Tallinn) (grant, 2024)
$1.1M · Led by Jaan Tallinn
$1,146,000 from Jaan Tallinn via Survival and Flourishing Fund, 2024.
survivalandflourishing.fund

Open Philanthropy General Support 2023 (grant, Apr 2023)
$4M · Led by Open Philanthropy
$4,000,000 general support grant from Open Philanthropy, April 2023.
openphilanthropy.org

SFF General Support 2023-H2 (Jaan Tallinn) (grant, 2023)
$909K · Led by Jaan Tallinn
$909,000 from Jaan Tallinn via Survival and Flourishing Fund, H2 2023.
survivalandflourishing.fund

SFF General Support 2023-H1 (Jaan Tallinn) (grant, 2023)
$22K · Led by Jaan Tallinn
$22,000 from Jaan Tallinn via Survival and Flourishing Fund, H1 2023.
survivalandflourishing.fund

FTX Future Fund Grant 2022 (grant, 2022)
$6.5M · Led by FTX Future Fund
$6.5M received in 2022. FTX bankruptcy estate issued subpoenas to CAIS in October 2023 seeking return of funds. Aligns with CAIS 2022 total revenue of $6.66M (IRS Form 990).
bloomberg.com

All Facts

Financial

Annual Expenses: $7.2M (2024) · 3 data points
As Of | Value
2024 | $7.2M
2023 | $8.1M
2022 | $817K

Grant Received: $1.1M (2025) · 4 data points
As Of | Value
2025 | $1.1M
2024 | $2.8M
2023 | $5.5M
2022 | $5.2M

Net Assets: $12M (2024) · 3 data points
As Of | Value
2024 | $12M
2023 | $8.5M
2022 | $5.8M

Revenue: $10M (2024) · 3 data points
As Of | Value
2024 | $10M
2023 | $16M
2022 | $6.7M

Total Funding Raised: $33M (2025)

Organization

Founded Date: 2022
Headquarters: San Francisco

General

Website: https://www.safe.ai/

Other

Board Member: Jaan Tallinn (2024)

Campaign: Statement on AI Risk (May 2023). One-sentence statement: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." Signed by 350+ AI leaders including Geoffrey Hinton, Demis Hassabis, Sam Altman, and Dario Amodei.

Compensation: Dan Hendrycks takes a $1 annual salary as Executive Director (2025)

Infrastructure: Compute cluster with 256 NVIDIA A100 GPUs available for AI safety researchers (2024)

Key Person: Dan Hendrycks (2025)

Program: ML Safety Scholars, an educational program training hundreds of students in AI safety fundamentals. Includes an online course, reading groups, and mentorship. (2024)

Publication: The WMDP Benchmark (Mar 2024) · 3 data points
As Of | Value
Mar 2024 | The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning: benchmark for evaluating dual-use AI capabilities in biosecurity, cybersecurity, and chemical weapons
Oct 2023 | Representation Engineering: A Top-Down Approach to AI Transparency: proposes methods to read and control LLM internal representations for safety
Jan 2021 | Measuring Massive Multitask Language Understanding (MMLU): widely used benchmark for evaluating LLM capabilities across 57 academic subjects

Divisions (6)

Field-Building (program-area, active; source: safe.ai)
Programs to grow the AI safety research community, including the Statement on AI Risk signed by hundreds of researchers and the ML Safety course.

Compute Cluster (program-area, active; source: safe.ai)
Provides free compute access to academic AI safety researchers. One of the largest non-industry compute resources available for safety research.

Research (team, active; source: safe.ai; lead: Dan Hendrycks)
Technical AI safety research on robustness, interpretability, and alignment. Led by Dan Hendrycks (Executive & Research Director). Confirmed via CAIS about page as of 2026-04-23.

AI and Society Fellowship (program-area, active; source: safe.ai; slug: cais-fellowship)
Three-month San Francisco program with a $25K stipend, for PhD/JD researchers.

CAIS Compute Cluster (lab, active; source: safe.ai; slug: cais-compute)
80 A100 GPUs, 150+ researchers, ~100 safety papers with 16,000+ citations. Free access. Schmidt Sciences partnership.

CAIS Action Fund (program-area, active; source: action.safe.ai; lead: Varun Krovi; full name: Center for AI Safety Action Fund; started 2023-07; website: action.safe.ai)
501(c)(4) advocacy arm, DC-based. Co-sponsored SB 1047. Lobbying ~$490K/yr.

Entity Events (6)
Reported revenue of $10.2M (FY2024) (2024, milestone, moderate; source: projects.propublica.org)
Cumulative funding reaches ~$33M since founding ($6.7M in 2022, $16.1M in 2023, $10.2M in 2024).

Statement on AI Risk released (2023-05, milestone, major)
One-sentence statement on AI extinction risk attracted signatures from over 350 AI researchers and industry figures, including Turing Award recipients (Hinton, Bengio, Russell) and CEOs of major AI labs (Altman, Amodei, Hassabis).

MACHIAVELLI benchmark released (2023, publication, moderate)
Benchmark for evaluating goal-directed and deceptive behavior in AI systems.

Representation Engineering paper published (2023, publication, major)
Methods for reading and steering model internal representations.

"Unsolved Problems in ML Safety" published (2022, publication, major)
Taxonomy of open technical challenges in machine learning safety, intended partly as a research agenda for the field.

Founded by Dan Hendrycks and Oliver Zhang (2022, founding, major)
Nonprofit research organization (EIN 88-1751310) focused on technical AI safety research, field-building, and public communication.

Publications (12)
Humanity's Last Exam (paper) · Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li et al. · arxiv.org · 2025-01

Introduction to AI Safety, Ethics, and Society (book) · Dan Hendrycks · aisafetybook.com · 2024-06
Published by Routledge. Open-access online version plus audiobook.

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning (paper) · Nathaniel Li, Alexander Pan, Anjali Gopal et al. · wmdp.ai · 2024
ICML 2024. Biosecurity/cybersecurity knowledge unlearning.

Superintelligence Strategy (report) · Dan Hendrycks, Eric Schmidt, Alexandr Wang · nationalsecurity.ai · 2024
Co-authored with the former Google CEO and the Scale AI CEO.

Improving Alignment and Robustness with Circuit Breakers (paper) · Andy Zou, Long Phan, Justin Wang et al. · arxiv.org · 2024
ICML 2024.

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming (paper) · Mantas Mazeika, Long Phan, Xuwang Yin et al. · harmbench.org · 2024
ICML 2024.

Representation Engineering: A Top-Down Approach to AI Transparency (paper) · Andy Zou, Long Phan, Sarah Chen et al. · arxiv.org · 2023-10

An Overview of Catastrophic AI Risks (paper) · Dan Hendrycks, Mantas Mazeika, Thomas Woodside · arxiv.org · 2023-06

Statement on AI Risk (policy-brief) · CAIS · aistatement.com · 2023-05
One-sentence statement signed by Hinton, Bengio, Altman, Amodei, and Hassabis.

Universal and Transferable Adversarial Attacks on Aligned Language Models (paper) · Andy Zou, Zifan Wang, Nicholas Carlini et al. · llm-attacks.org · 2023
Highly influential jailbreaking paper.

Unsolved Problems in ML Safety (paper) · Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt · arxiv.org · 2021-09
Defines four core challenges: robustness, monitoring, alignment, and systemic safety.

Measuring Massive Multitask Language Understanding (MMLU) (paper) · Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt · arxiv.org · 2020-09
Among the most widely used AI capability benchmarks. ICLR 2021.
Internal Metadata
ID: sid_y4bieqSeag
Stable ID: sid_y4bieqSeag
Wiki ID: E47
Type: organization
YAML Source: packages/factbase/data/fb-entities/cais.yaml
Facts: 26 structured (27 total)
Records: 38 in 5 collections