Longterm Wiki

Circuit Breakers

AI Control · active
Inference-time interventions that halt model execution when unsafe behavior is detected.
Organizations: 2
Key Papers: 1
First Proposed: 2024 (Zou et al.)
Cluster: AI Control
Parent Area: AI Control

Tags

function:robustness, stage:inference, scope:technique
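
Example

The one-line description above (halt model execution when unsafe behavior is detected) can be illustrated with a minimal sketch. This is not the method of Zou et al. or any particular library's API: the names `generate_with_circuit_breaker`, `unsafe_score`, `toy_score`, and the `threshold` value are all hypothetical, and a real monitor would more plausibly score internal model activations than raw tokens.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class GenerationResult:
    tokens: List[str]  # tokens emitted before generation finished or was halted
    halted: bool       # True if the breaker tripped


def generate_with_circuit_breaker(
    token_stream: Iterable[str],
    unsafe_score: Callable[[List[str]], float],
    threshold: float = 0.5,
) -> GenerationResult:
    """Emit tokens one at a time; stop as soon as the monitor flags the output.

    `token_stream` stands in for a model's decoding loop and `unsafe_score`
    for a hypothetical safety monitor returning a score in [0, 1].
    """
    emitted: List[str] = []
    for token in token_stream:
        candidate = emitted + [token]
        # Check the candidate output *before* releasing the new token.
        if unsafe_score(candidate) >= threshold:
            return GenerationResult(tokens=emitted, halted=True)
        emitted = candidate
    return GenerationResult(tokens=emitted, halted=False)


def toy_score(tokens: List[str]) -> float:
    """Toy monitor (assumption): flags any output containing the marker 'UNSAFE'."""
    return 1.0 if "UNSAFE" in tokens else 0.0


if __name__ == "__main__":
    result = generate_with_circuit_breaker(["a", "b", "UNSAFE", "c"], toy_score)
    print(result)
```

In this sketch the check runs before each token is released, so the flagged token itself is never emitted; where the check sits and what it scores are design choices this page does not specify.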