Eliciting Latent Knowledge
Scalable Oversight (active)
Getting AI systems to honestly report what they know, even when deception would be rewarded.
Organizations: 1
Key Papers: 1
First Proposed: 2021 (Christiano et al., ARC)
Cluster: Scalable Oversight
Parent Area: Scalable Oversight
Tags: function:specification, scope:technique
Organizations (1)
| Organization | Role |
|---|---|
| Alignment Research Center | pioneer |
Key Papers & Resources (1)
SEMINAL
Eliciting Latent Knowledge
Christiano et al. (ARC), 2021