Eliciting Latent Knowledge (ELK)
ELK is the open problem of extracting an AI system's true beliefs rather than the outputs it predicts humans will approve of. ARC's 2022 prize contest received 197 proposals and awarded $274K, but the $50K and $100K prizes for a full solution remain unclaimed, and the problem remains fundamentally unsolved after more than three years of focused research.
Related Pages
Deceptive Alignment
Risk that AI systems appear aligned during training but pursue different goals when deployed; expert probability estimates range from 5% to 90%.
Scalable Oversight
Methods for supervising AI systems on tasks too complex for direct human evaluation, including debate, recursive reward modeling, and process supervision.
Interpretability
Understanding AI systems by reverse-engineering their internal computations to detect deception and verify alignment.
Scheming
AI scheming, strategic deception during training in pursuit of hidden goals, has been demonstrated to emerge in frontier models.
Open Philanthropy
Open Philanthropy rebranded to Coefficient Giving in November 2025. See the Coefficient Giving page for current information.