Conjecture is an AI safety research organization founded in 2021 by Connor Leahy and a team of researchers concerned about existential risks from advanced AI. The organization pursues a distinctive technical approach centered on "Cognitive Emulation" (CoEm): building interpretable AI systems based on human cognition principles rather than aligning existing large language models.
Based in London with a team of 30-40 researchers, Conjecture raised over $10M in Series A funding in 2023. Their research agenda emphasizes mechanistic interpretability and understanding neural network internals, representing a fundamental alternative to the mainstream prosaic alignment approaches pursued by organizations like Anthropic and OpenAI.
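Conjecture has not published a reference CoEm implementation, so the following minimal Python sketch is purely illustrative of the stated design goal: composing a task from small, bounded, individually auditable steps with explicit intermediate state, rather than relying on a single opaque end-to-end model. All names here (CognitiveEmulationPipeline, StepTrace, the toy primality task) are hypothetical and not drawn from Conjecture's work.

```python
# Illustrative sketch only: not Conjecture's actual system. It shows the
# general shape of an interpretability-first design, where every reasoning
# step is a named, bounded function and all intermediate state is logged
# for audit, instead of a single opaque forward pass.
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class StepTrace:
    """Record of one bounded reasoning step, kept for later inspection."""
    name: str
    inputs: Any
    output: Any


@dataclass
class CognitiveEmulationPipeline:
    """Hypothetical pipeline: an ordered list of named, inspectable steps."""
    steps: list[tuple[str, Callable[[Any], Any]]]
    trace: list[StepTrace] = field(default_factory=list)

    def run(self, state: Any) -> Any:
        for name, step in self.steps:
            result = step(state)
            # Each intermediate result is recorded, so a human or an
            # automated checker can verify the step-by-step reasoning.
            self.trace.append(StepTrace(name, state, result))
            state = result
        return state


# Toy task: decide whether a number is prime via explicit, checkable sub-steps.
pipeline = CognitiveEmulationPipeline(steps=[
    ("parse", lambda text: int(text)),
    ("enumerate_divisors", lambda n: (n, [d for d in range(2, n) if n % d == 0])),
    ("decide", lambda pair: f"{pair[0]} is {'prime' if not pair[1] else 'not prime'}"),
])

print(pipeline.run("91"))          # -> "91 is not prime"
for record in pipeline.trace:      # every intermediate result is auditable
    print(record.name, "->", record.output)
```

The design choice the sketch gestures at is that safety arguments rest on being able to inspect each bounded step, in contrast to post-hoc analysis of a monolithic model.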
Conjecture emerged from the EleutherAI collective, an open-source AI research group best known for releasing open-source GPT-3-class models (GPT-J, GPT-NeoX). Key founding factors:
| Factor | Impact | Details |
|---|---|---|
| EleutherAI experience | High | Demonstrated capability replication feasibility |
| Safety concerns | High | Recognition of risks from capability proliferation |
| European gap | Medium | Limited AI safety ecosystem outside the Bay Area |
| Funding availability | Medium | Growing investor interest in AI safety |
Philosophical Evolution: The transition from EleutherAI’s “democratize AI” mission to Conjecture’s safety-focused approach represents a significant shift in thinking about AI development and publication strategies.
Founder profile (Connor Leahy):

| Aspect | Details |
|---|---|
| Research | Interpretability research collaboration |
| Evolution | From open-source advocacy to safety-focused research |
| Public role | Active AI policy engagement, podcast appearances |
| Views | Short AI timelines, high P(doom), interpretability seen as necessary |
Timeline Estimates: Leahy has consistently expressed short AI timeline views, suggesting AGI within years rather than decades.
Conjecture's interpretability-first approach also distinguishes it from other safety-focused organizations:

| Organization | Approach | Focus |
|---|---|---|
| Anthropic | Frontier models + interpretability | Post-hoc analysis of LLMs |
| ARC | Theoretical alignment | Evaluation and ELK research |
| Redwood Research | AI control agenda | Mechanistic interpretability and control research |
Conjecture has also provided consultation to the UK AISI; such engagement is considered critical if AGI timelines are short.
Conjecture’s leadership has articulated clear views on AI timelines and safety approaches, which fundamentally motivate their Cognitive Emulation research agenda and organizational strategy:
| Expert/Source | Estimate | Reasoning |
|---|---|---|
| Connor Leahy | AGI: 2-10 years | Leahy has consistently expressed short-timeline views across public statements and podcasts in 2023-2024, suggesting transformative AI systems could emerge within years rather than decades. These short timelines create urgency for developing interpretability-first approaches before AGI arrives. |
| Connor Leahy | P(doom): high without major changes | In 2023 statements, Leahy expressed significant concern about the default trajectory of AI development, arguing that the prosaic alignment approaches pursued by frontier labs are insufficient to ensure safety. This pessimism about conventional alignment motivates Conjecture's alternative CoEm approach. |
| Conjecture research agenda | Prosaic alignment: insufficient | The organization's core research direction reflects a fundamental assessment that post-hoc alignment of large language models through techniques like RLHF and Constitutional AI cannot provide adequate safety guarantees. This view, maintained since founding, drives their pursuit of interpretability-first system design. |
| Conjecture (organizational position) | Interpretability: necessary for safety | Conjecture's founding premise holds that mechanistic interpretability is not merely useful but necessary for AI safety verification. This fundamental assumption distinguishes them from organizations pursuing behavioral safety approaches and shapes their entire technical agenda. |
Conjecture's relationships with other AI safety organizations:

| Organization | Relationship | Details |
|---|---|---|
| Anthropic | Friendly competition | Interpretability research sharing |
| ARC | Complementary | Different technical approaches |
| MIRI | | |
| UK AI Safety Institute | Consultation | |