Longterm Wiki

Eliciting Latent Knowledge

Scalable Oversight · active

Getting AI systems to honestly report what they know, even when deception would be rewarded.

Organizations: 1
Key Papers: 1
First Proposed: 2021 (Christiano et al., ARC)
Cluster: Scalable Oversight
Parent Area: Scalable Oversight

Tags

function:specification, scope:technique

Organizations (1)

Organization | Role
Alignment Research Center | pioneer

Key Papers & Resources (1)

SEMINAL
Eliciting Latent Knowledge
Christiano et al. (ARC), 2021