Longterm Wiki

Author: Marius Hobbhahn

Credibility Rating: 3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Alignment Forum

Written by Marius Hobbhahn of Apollo Research in January 2025, this post synthesizes near-term AI safety priorities under short-timeline assumptions and is notable for prompting community discussion on the absence of detailed public safety roadmaps.

Metadata

Importance: 72/100
Tags: blog post, analysis

Summary

Marius Hobbhahn outlines a two-layer safety plan for scenarios where transformative AI arrives soon, arguing that current publicly available strategies are insufficiently detailed. Layer 1 focuses on near-term controls such as chain-of-thought (CoT) monitoring, AI control, and evals; Layer 2 addresses deeper alignment research, including interpretability and scalable oversight.
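To make the chain-of-thought monitoring idea concrete, below is a minimal sketch of one form such a monitor could take: score each reasoning trace and escalate suspicious ones for human review. This is purely illustrative and not from the post, which does not prescribe an implementation; the `suspicion_score` heuristic, the threshold, and all names here are hypothetical stand-ins for a trained monitor.

```python
from dataclasses import dataclass

@dataclass
class CoTSample:
    prompt: str
    chain_of_thought: str  # the model's human-legible reasoning trace
    final_answer: str

def suspicion_score(text: str) -> float:
    """Hypothetical stand-in for a learned monitor: rates how likely a
    reasoning trace is to contain deceptive or scheming content
    (0.0 = benign, 1.0 = highly suspicious)."""
    red_flags = ["hide this from", "the overseers", "pretend to comply"]
    hits = sum(phrase in text.lower() for phrase in red_flags)
    return min(1.0, hits / len(red_flags))

def should_escalate(sample: CoTSample, threshold: float = 0.3) -> bool:
    """Flag a sample for human review when its score crosses the threshold."""
    return suspicion_score(sample.chain_of_thought) >= threshold
```

In practice the scoring function would be a trained classifier or a second model acting as judge, and flagged traces could feed into the control and scheming-detection work the post's Layer 1 describes.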

Key Points

  • Short AI timelines are treated as plausible, requiring concrete safety plans that go beyond vague aspirations.
  • Layer 1 priorities: maintaining human-legible chain-of-thought, improved monitoring, AI control methods, scheming detection, robust evals, and security.
  • Layer 2 priorities: improved near-term alignment strategies, interpretability, scalable oversight, reasoning transparency, and safety-first organizational culture.
  • The post expresses concern that detailed, publicly available short-timeline safety plans are largely absent from the AI safety community.
  • The author acknowledges known limitations and open questions, framing the post as a prompt for community discussion rather than a finished blueprint.

Cited by 1 page

Short AI Timeline Policy Implications (Type: Analysis, Quality: 62.0)

Cached Content Preview

HTTP 200 · Fetched Apr 7, 2026 · 55 KB
Archived capture (Wayback Machine, Common Crawl collection): https://web.archive.org/web/20260116190821/https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan

 


Tags: AI Control, AI Evaluations, Deceptive Alignment, AI Timelines, AI
Frontpage

2025 Top Fifty: 5%


What’s the short timeline plan?

by Marius Hobbhahn

2nd Jan 2025

28 min read


This is a low-effort post (at least, it was intended as such ...). I mostly want to get other people’s takes and express concern about the lack of detailed and publicly available plans so far. This post reflects my personal opinion and not necessarily that of other members of Apollo Research. I’d like to thank Ryan Greenblatt, Bronson Schoen, Josh Clymer, Buck Shlegeris, Dan Braun, Mikita Balesni, Jérémy Scheurer, and Cody Rushing for comments and discussion.

I think short timelines, e.g., AIs that can replace a top researcher at an AGI lab without loss of capability by 2027, are plausible. Some people have posted ideas on what a reasonable plan to reduce AI risk for such timelines might look like (e.g., Sam Bowman’s checklist, or Holden Karnofsky’s list in his 2022 nearcast), but I find them insufficient for the magnitude of the stakes (to be clear, I don’t think these example lists were intended to be an extensive plan).

If we take AGI seriously, I feel like the AGI companies and the rest of the world should be significantly more prepared, and I think we’re now getting into the territory where models are capable enough that acting without a clear plan is irresponsible. 

In this post, I want to ask what such a short-timeline plan could look like. Intuitively, if an AGI lab came to me today and told me, “We really fully believe that we will build AGI by 2027, and we will enact your plan, but we aren’t willing to take more than a 3-month delay,” I want to be able to give the best possible answer. I list some suggestions, but I don’t think they are anywhere near sufficient. I’d love to see more people provide their answers. If a funder is interested in funding this, I’d also love to see some sort of “best short-timeline plan prize” where people can win money for the best plan as judged by an expert panel.

In particular, I think the AGI companies should publish their detailed plans (minus secret information) so that governments, academics, and civil society can criticize and improve them. I think RSPs (responsible scaling policies) were a great step in the right direction and did improve their reasoning transparency

... (truncated, 55 KB total)
Resource ID: 145e6d684253d6f0 | Stable ID: sid_wSirZtFZVz