Researcher

Connor Leahy

Role: CEO & Co-founder
Known for: Founding Conjecture, AI safety advocacy, interpretability research
Related: Safety Agendas

Connor Leahy is the CEO and co-founder of Conjecture, an AI safety company focused on interpretability and “prosaic” approaches to AGI alignment. He represents a new generation of AI safety researchers who are building organizations specifically to tackle alignment.

Background:

  • Largely self-taught in AI and machine learning
  • Co-founder of EleutherAI (open-source AI research collective)
  • Founded Conjecture in 2022
  • Active public communicator on AI risk

Leahy’s journey from open-source AI contributor to safety company founder reflects growing concern about AI risks among those building the technology.

Leahy co-founded EleutherAI, which:

  • Created GPT-Neo and GPT-J (open-source language models)
  • Demonstrated capabilities research outside major labs
  • Showed small teams could train large models
  • Made AI research more accessible

The shift: Working on capabilities research convinced Leahy that AI risk was severe and urgent.

He founded Conjecture because he:

  • Believed prosaic AGI was coming soon
  • Thought existing safety work insufficient
  • Wanted to work on alignment with urgency
  • Needed independent organization focused solely on safety

Conjecture aims to:

  • Understand how AI systems work (interpretability)
  • Build safely aligned AI systems
  • Prevent catastrophic outcomes from AGI
  • Work at the frontier of capabilities to ensure safety relevance

Interpretability:

  • Understanding neural networks mechanistically
  • Automated interpretability methods
  • Scaling understanding to large models

Alignment:

  • Prosaic alignment techniques
  • Testing alignment on current systems
  • Building aligned systems from scratch

Capability evaluation:

  • Understanding what models can really do
  • Detecting dangerous capabilities early
  • Red-teaming and adversarial testing

Connor Leahy’s public statements and interviews reveal a notably urgent perspective on AI risk compared to many researchers in the field. He combines very short timelines with high existential risk estimates, arguing that the default trajectory leads to catastrophic outcomes without significant changes to current approaches. His position emphasizes the need for immediate technical work on alignment rather than relying on slower governance interventions.

  • AGI timeline: Could be 2-5 years (2023). Leahy believes AGI could arrive much sooner than mainstream estimates, pointing to rapid capability gains in language models and fewer remaining barriers than most researchers assume. His direct work on capabilities at EleutherAI gave him firsthand experience with how quickly scaling can produce surprising jumps in performance, making him skeptical of longer timeline projections.
  • P(doom): High without major changes (2023). Leahy expresses very high concern about default outcomes if alignment research doesn’t advance dramatically. He argues that current prosaic approaches to AI development naturally lead to misaligned systems, and that existing safety techniques are fundamentally insufficient for systems approaching AGI capabilities. His transition from capabilities work to founding a safety company reflects deep worry about the baseline trajectory.
  • Urgency: Extreme (2024). Leahy emphasizes the need for immediate action on alignment, arguing that the window for developing adequate safety measures is closing rapidly. He believes the field cannot afford to wait for theoretical breakthroughs or gradual governance changes, instead requiring urgent empirical work on interpretability and alignment with current systems to prepare for imminent advanced AI.

Core views:
  1. AGI is very near: Could be 2-10 years, possibly sooner
  2. Default outcome is bad: Without major changes, things go poorly
  3. Prosaic alignment is crucial: Need to align systems similar to current ones
  4. Interpretability is essential: Can’t align what we don’t understand
  5. Need to move fast: Limited time before dangerous capabilities emerge

Leahy is notably more pessimistic about timelines than most:

  • Believes AGI could be very close
  • Points to rapid capability gains
  • Sees fewer barriers than many assume
  • Emphasizes uncertainty but leans short

Different from slowdown advocates:

  • Doesn’t think we’ll successfully slow down
  • Believes we need solutions that work in fast-moving world
  • Focuses on technical alignment over governance alone

Different from “race-to-the-top” advocates:

  • Very concerned about safety
  • Skeptical of “building AGI to solve alignment”
  • Wants fundamental understanding first

Leahy is very active in public discourse:

  • Regular podcast appearances
  • Social media presence (Twitter/X)
  • Interviews and talks
  • Blog posts and essays

On urgency:

  • AGI could arrive much sooner than people think
  • We’re not prepared
  • Need to take this seriously now

On capabilities:

  • Current systems are more capable than commonly believed
  • Emergent capabilities make prediction hard
  • Safety must account for rapid jumps

On solutions:

  • Need mechanistic understanding
  • Can’t rely on empirical tinkering alone
  • Interpretability is make-or-break

Known for:

  • Direct, sometimes blunt language
  • Willingness to express unpopular views
  • Engaging in debates
  • Not mincing words about risks

Believes:

  • Can’t safely deploy what we don’t understand
  • Black-box approaches fundamentally insufficient
  • Need to open the black box before scaling further
  • Interpretability isn’t optional

Working on:

  • Systems similar to current architectures
  • Alignment techniques that work today
  • Scaling understanding to larger models
  • Not waiting for theoretical breakthroughs

Emphasizes:

  • Testing ideas on real systems
  • Learning from current models
  • Rapid iteration
  • Building working systems

Automated Interpretability:

  • Using AI to help understand AI
  • Scaling interpretability techniques
  • Finding circuits and features automatically

Capability Evaluation:

  • Understanding what models can do
  • Red-teaming frontier systems
  • Developing evaluation frameworks

Alignment Testing:

  • Empirical evaluation of alignment techniques
  • Stress-testing proposed solutions
  • Finding failure modes

Conjecture has:

  • Published research on interpretability
  • Released tools for safety research
  • Engaged in public discourse
  • Contributed to alignment community

Leahy’s advocacy has:

  • Brought attention to short timelines
  • Emphasized severity of risk
  • Recruited people to safety work
  • Influenced discourse on urgency

Conjecture demonstrates:

  • A safety-focused company can be built
  • Safety researchers don’t need to be at frontier labs
  • Independent safety research is viable
  • Multiple organizational models are possible

Active in:

  • Alignment research community
  • Public communication about AI risk
  • Mentoring and advising
  • Connecting researchers

Critics argue:

  • May be too pessimistic about timelines
  • Some statements are inflammatory
  • Conjecture’s approach might not scale
  • Public communication sometimes counterproductive

Supporters argue:

  • Better to be cautious about timelines
  • Direct communication is valuable
  • Conjecture is doing important work
  • Field needs diverse voices

Leahy’s position:

  • Would rather be wrong about urgency than be complacent
  • Believes directness is necessary
  • Open to criticism and debate
  • Focused on solving the problem

EleutherAI era:

  • Focused on democratizing AI
  • Excited about capabilities
  • Less concerned about risk

Transition:

  • Growing concern from working with models
  • Seeing rapid capability gains
  • Understanding alignment difficulty

Current:

  • Very concerned about risk
  • Focused entirely on safety
  • Urgent timeline beliefs
  • Public advocacy

At Conjecture:

  1. Interpretability research: Understanding how models work
  2. Capability evaluation: Knowing what’s possible
  3. Alignment testing: Validating proposed solutions
  4. Public communication: Raising awareness
  5. Team building: Growing safety research capacity

Leahy’s experience building language models convinced him:

  • Capabilities can surprise
  • Scaling works better than expected
  • Safety is harder than it looks
  • Need fundamental understanding

His observations about the AI safety field:

  • Not enough urgency
  • Too much theorizing, not enough empirical work
  • Need more attempts at solutions
  • Can’t wait for perfect understanding