
Constitutional AI

Alignment Training (active)

A training methodology that uses an explicit set of written principles (a "constitution") and reinforcement learning from AI feedback (RLAIF) to train safer language models.
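The supervised phase of this methodology can be sketched as a critique-and-revision loop: the model critiques its own response against each principle, then rewrites it. The sketch below is illustrative only; `generate` is a stub standing in for a language model call, and the two principles are hypothetical examples, not the actual constitution.

```python
# Minimal sketch of the Constitutional AI supervised phase, assuming a
# generic text-generation function. Not a real API: `generate` is a stub.

CONSTITUTION = [
    "Choose the response that is least harmful.",
    "Choose the response that is most honest.",
]

def generate(prompt: str) -> str:
    # Stub: a real system would call a language model here.
    return f"[model output for: {prompt[:40]}]"

def critique_and_revise(prompt: str, response: str) -> str:
    """For each principle, ask the model to critique its response and
    then revise it; the final revision becomes supervised training data."""
    for principle in CONSTITUTION:
        critique = generate(
            f"Principle: {principle}\nResponse: {response}\n"
            "Identify how the response violates the principle."
        )
        response = generate(
            f"Response: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    return response

revised = critique_and_revise("User question", "Initial draft answer")
```

In the full method, pairs of responses are later ranked by the AI itself against the same principles, and those AI-generated preferences replace human labels when training the reward model (the RLAIF step).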

First Proposed: 2022 (Bai et al., Anthropic)
Cluster: Alignment Training
Parent Area: RLHF

Tags

function:specification · stage:training · scope:technique

Organizations (1)

Organization    Role
Anthropic       Pioneer

Key Papers & Resources (1)

Seminal: Bai et al. (2022), "Constitutional AI: Harmlessness from AI Feedback", Anthropic.