Longterm Wiki

AI Safety via Debate

Scalable Oversight · active

Using adversarial debate between AI systems to help humans evaluate complex claims.

First Proposed: 2018 (Irving et al.)
Cluster: Scalable Oversight
Parent Area: Scalable Oversight

Tags

function:specification
scope:technique