Red Teaming

Evaluation | active

Manual and automated adversarial testing of AI systems to discover failure modes, safety issues, and vulnerabilities.

Organizations: 7
Grants: 4
Total Funding: $198K
Risks Addressed: 1

Cluster: Evaluation

Parent Area: AI Evaluations

Organizations (3)

Organization	Role
Anthropic	active
Google DeepMind	active
OpenAI	active

Grants (4)

Name	Recipient	Amount	Funder	Date
UC Berkeley — AI Red-teaming Bootcamp	University of California, Berkeley	$100K	Coefficient Giving	2025-01
Meta level adversarial evaluation of debate (scalable oversight technique) on simple math problems (MATS 5.0 project)	Yoav Tzfati	$62K	Long-Term Future Fund (LTFF)	2024-01
SoGive does EA analysis. We red-team EA analysis and conduct EA research to support donors. We also support EA talent	SoGive	$18K	Centre for Effective Altruism	2022-07
SoGive does EA analysis. We red-team EA analysis and conduct EA research to support donors. We also support EA talent	SoGive	$18K	Centre for Effective Altruism	2022-07

Funding by Funder

Funder	Grants	Total Amount
Coefficient Giving	1	$100K
Long-Term Future Fund (LTFF)	1	$62K
Centre for Effective Altruism	1	$18K
Centre for Effective Altruism	1	$18K

Tags

red-teaming, adversarial, safety-testing