Apollo Research

Apollo Research is an AI safety research organization, founded in 2023, that focuses on one of the most concerning potential failure modes in advanced AI systems: deceptive alignment and scheming behavior.

Facts

1 entry
General
Website: https://www.apolloresearch.ai

Other Data

Entity Assessments
7 entries
| Dimension | Rating | Evidence | Assessor |
| --- | --- | --- | --- |
| government-integration | Strong | UK AISI partner, US AISI consortium member, presented at Bletchley AI Summit | editorial |
| intervention-impact | Measurable | Deliberative alignment reduced scheming from 13% to 0.4% (~30x reduction) in OpenAI models | editorial |
| key-finding-2024 | Critical | o1 maintains deception in over 85% of follow-up questions after engaging in scheming | editorial |
| lab-partnerships | Extensive | Pre-deployment evaluations for [OpenAI](https://openai.com/index/detecting-and-reducing-scheming-in-ai-models/), [Anthropic](https://www.anthropic.com), and [Google DeepMind](https://deepmind.google/blog/deepening-our-partnership-with-the-uk-ai-security-institute/) | editorial |
| methodology-rigor | Very High | 300 rollouts per model/evaluation; statistically significant results (p < 0.05) | editorial |
| research-output | High Impact | [December 2024 paper](https://arxiv.org/abs/2412.04984) tested 6 frontier models across 180+ scenarios; cited in OpenAI/Anthropic safety frameworks | editorial |
| team-size | ~20 researchers | Full-time staff including CEO Marius Hobbhahn, named [TIME 100 AI 2025](https://time.com/collections/time100-ai-2025/7305864/marius-hobbhahn/) | editorial |
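As a quick consistency check on the intervention-impact row, the quoted before and after scheming rates imply a reduction factor of roughly 30, matching the table's "~30x" figure:

```latex
\frac{13\%}{0.4\%} = \frac{0.13}{0.004} = 32.5 \approx 30\times
```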

Divisions

1 entry
Team

AI safety evaluations focused on detecting deceptive and scheming behaviors in frontier models. Published influential research on in-context scheming in 2024.

Related Wiki Pages

Top Related Pages

Approaches

AI Safety Cases
Evaluation Awareness
Scalable Eval Approaches
Third-Party Model Auditing

Analysis

AI Safety Intervention Effectiveness Matrix
AI Risk Interaction Network Model

Policy

Voluntary AI Safety Commitments

Risks

AI Capability Sandbagging

Concepts

Situational Awareness
Existential Risk from AI
Large Language Models
Persuasion and Social Manipulation

Other

AI Evaluations
Red Teaming
Jaan Tallinn
Kamal Ndousse

Organizations

Alignment Research Center (ARC)
Anthropic

Key Debates

AI Accident Risk Cruxes
Technical AI Safety Research