Longterm Wiki

Weak-to-Strong Generalization

Alignment Training (active)

Using weaker models to supervise stronger ones, as an experimental proxy for the scalable oversight problem of humans supervising more capable AI systems.
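The setup can be summarized by the metric Burns et al. report, performance gap recovered (PGR): train a weak supervisor on ground truth, train a strong student on the weak supervisor's labels, train a strong ceiling model on ground truth, then ask how much of the weak-to-ceiling gap the student closes. The sketch below only computes that ratio; the function name and the accuracy values are hypothetical and chosen for illustration, not results from the paper.

```python
def performance_gap_recovered(
    weak_acc: float,
    weak_to_strong_acc: float,
    strong_ceiling_acc: float,
) -> float:
    """Fraction of the weak-to-ceiling gap closed by the weakly supervised strong model.

    0.0 means the strong student is no better than its weak supervisor;
    1.0 means it matches a strong model trained on ground-truth labels.
    """
    gap = strong_ceiling_acc - weak_acc
    if gap <= 0:
        # The ratio is undefined when the ceiling does not beat the weak supervisor.
        raise ValueError("strong ceiling must exceed weak supervisor performance")
    return (weak_to_strong_acc - weak_acc) / gap


# Hypothetical accuracies (assumed values, not taken from any experiment).
pgr = performance_gap_recovered(
    weak_acc=0.60,
    weak_to_strong_acc=0.75,
    strong_ceiling_acc=0.80,
)
print(f"PGR = {pgr:.2f}")
```

With these made-up numbers the strong student recovers about three quarters of the gap between its weak supervisor and the ground-truth-trained ceiling.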

Key Papers: 1
First Proposed: 2023 (Burns et al., OpenAI)
Cluster: Alignment Training

Tags

function:specification, stage:training, scope:technique

Key Papers & Resources (1)

SEMINAL
Weak-to-Strong Generalization
Burns et al. (OpenAI), 2023