Longterm Wiki

Scheming Detection

Evaluation · emerging
Research on detecting when AI systems are engaged in deceptive alignment or strategic manipulation of their training process.
Organizations: 4
Grants: 1
Total Funding: $27K
Cluster: Evaluation
Parent Area: AI Evaluations

Grants (1)

Name | Recipient | Amount | Funder | Date
4-month grant to conduct deceptive alignment evaluation research and explore control and mitigation strategies | Kai Fronsdal | $27K | Long-Term Future Fund (LTFF) | 2024-07

Funding by Funder

Funder | Grants | Total Amount
Long-Term Future Fund (LTFF) | 1 | $27K

Tags

evaluations, deception, deceptive-alignment