Provably Safe AI

Scalable Oversight (active)

Davidad's agenda for building AI systems whose safety guarantees are mathematically derived from world models.

First Proposed: 2023 (davidad)

Cluster: Scalable Oversight