Scalable Oversight
Status: active
Research on supervising AI systems that approach or exceed human-level capabilities.
Organizations: 3
Key Papers: 2
Risks Addressed: 2
First Proposed: 2018 (Irving et al.)
Cluster: Scalable Oversight
Tags: function:specification, scope:field
Organizations (3)
| Organization | Role |
|---|---|
| Anthropic | pioneer |
| Alignment Research Center | pioneer |
| Google DeepMind | active |
Key Papers & Resources (2)
SEMINAL
AI Safety via Debate
Irving et al., 2018
SEMINAL
Scalable Agent Alignment via Reward Modeling
Leike et al., 2018
Sub-Areas (2)
| Name | Description | Status | Orgs | Papers |
|---|---|---|---|---|
| AI Safety via Debate | Using adversarial debate between AI systems to help humans evaluate complex claims. | active | 0 | 0 |
| Eliciting Latent Knowledge | Getting AI systems to honestly report what they know, even when deception would be rewarded. | active | 1 | 1 |