Capability Elicitation
Systematic methods for discovering what AI models can actually do, including hidden capabilities that standard benchmarks may miss, using scaffolding, fine-tuning, and specialized prompting techniques. METR research indicates that the length of tasks AI agents can complete has been doubling roughly every 7 months.
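To make the 7-month doubling figure concrete, the sketch below treats it as simple exponential growth in the length of tasks an agent can complete. This is an illustration only, not METR's methodology: the function name `projected_horizon`, the 60-minute starting value, and the assumption that the trend continues unchanged are all hypothetical.

```python
def projected_horizon(current_minutes: float,
                      months_ahead: float,
                      doubling_months: float = 7.0) -> float:
    """Extrapolate a task-length horizon under a constant doubling time.

    All inputs are illustrative assumptions; the 7-month default is the
    doubling figure quoted in the summary above.
    """
    # Each doubling period multiplies the horizon by 2.
    return current_minutes * 2 ** (months_ahead / doubling_months)


# Hypothetical starting point: an agent that handles 60-minute tasks today.
in_14_months = projected_horizon(60, 14)  # two doubling periods later
```

Under these assumptions, two doubling periods (14 months) quadruple the horizon, so the hypothetical 60-minute starting point becomes 240 minutes. Real trends may speed up, slow down, or vary by task type.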
Related Pages
METR
Model Evaluation and Threat Research conducts dangerous capability evaluations for frontier AI models, testing for autonomous replication, cybersec...
Anthropic
An AI safety company founded by former OpenAI researchers that develops frontier AI models while pursuing safety research, including the Claude mod...
AI Capability Sandbagging
AI systems strategically hiding or underperforming their true capabilities during evaluation.
Apollo Research
AI safety organization conducting rigorous empirical evaluations of deception, scheming, and sandbagging in frontier AI models, providing concrete ...
Alignment Research Center (ARC)
AI safety research nonprofit operating as ARC Theory, investigating fundamental alignment problems including Eliciting Latent Knowledge and heurist...