Longterm Wiki

Evan Hubinger

Co-authored Risks from Learned Optimization (2019) introducing mesa-optimization and deceptive alignment; led Sleeper Agents and Alignment Faking research at Anthropic; 3,400+ citations

Current Role: Head of Alignment Stress-Testing
Organization: Anthropic

Expert Positions (1 topic)

Topic | View | Estimate | Confidence | Date
Likelihood of Deceptive Alignment | Possible | 40% | medium | 2019

Sources: Risks from Learned Optimization

Education

Harvey Mudd College

Publications & Resources (2)


Facts (4)

People
Role / Title: Head of Alignment Stress-Testing
Employed By: Anthropic

Biographical
Notable For: Co-authored Risks from Learned Optimization (2019) introducing mesa-optimization and deceptive alignment; led Sleeper Agents and Alignment Faking research at Anthropic; 3,400+ citations
Education: Harvey Mudd College