Constitutional AI
Alignment Training
active
Training methodology using explicit principles and AI-generated feedback (RLAIF) to train safer language models.
Organizations: 1
Key Papers: 1
Tags
function:specification, stage:training, scope:technique
Organizations (1)
| Organization | Role |
|---|---|
| Anthropic | pioneer |
Key Papers & Resources (1)
SEMINAL
Constitutional AI: Harmlessness from AI Feedback
Bai et al. (Anthropic), 2022