Longterm Wiki

Forecasting AGI: Insights from Prediction Markets

blog

Data Status

Not fetched

Cited by 2 pages

Page | Type | Quality
AI Timelines | Concept | 95.0
Manifold (Prediction Market) | Organization | 43.0

Cached Content Preview

HTTP 200 | Fetched Feb 26, 2026 | 302 KB
Forecasting AGI: Insights from Prediction Markets and Metaculus (LessWrong)
by Alvin Ånestrand, 4th Feb 2025. AI Alignment Forum. Linkpost from forecastingaifutures.substack.com.
Tags: AI Timelines, Forecasts (Specific Predictions), Metaculus, Prediction Markets, AI

I have tried to find all prediction market and Metaculus questions related to AGI timelines. Here I examine how they compare to each other, and what they actually say about when AGI might arrive.

If you know of a market that I have missed, please tell me in the comment section! It would also be helpful if you told me which questions you think are relevant but missing from this analysis.

This is a linkpost, and I prefer that you comment on the original post on my new blog, Forecasting AI Futures, but feel free to comment here as well. Subscribe to the blog for updates on my future forecasting posts related to AI safety.

Whenever possible, please check the more recent probability estimates in the embedded sites, instead of looking at my At The Time Of Writing (ATTOW) numbers.

So, what do prediction markets and Metaculus have to say about AGI?

Metaculus has this question for the arrival date of AGI. The AI system needs to be able to:

- Pass a really hard Turing test.
- Have general robotic capabilities (being able to assemble a “circa-2021 Ferrari 312 T4 1:8 scale automobile model” or equivalent).
- Achieve “at least 75% accuracy in every task and 90% mean accuracy across all tasks” on the MMLU benchmark, which measures expertise in a wide range of academic subjects.
- Achieve at least 90% accuracy with a single attempt for each question on the APPS benchmark, which measures coding skills.

Metaculus thinks this will probably occur around the middle of 2030, though with high uncertainty.
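The MMLU criterion quoted above combines a per-task floor with a mean floor, so a model can miss it on either condition. Here is a minimal sketch of that check. It assumes the "mean accuracy across all tasks" is an unweighted average of per-task accuracies, which the post does not specify. The function name, the task names, and the scores are all made up for illustration.

```python
from statistics import mean


def meets_mmlu_criterion(task_accuracy, per_task_floor=0.75, mean_floor=0.90):
    """Check the two-part MMLU threshold on a dict of task -> accuracy.

    Assumption: the mean is an unweighted average over tasks.
    """
    if not task_accuracy:
        return False
    scores = list(task_accuracy.values())
    # Both conditions must hold: no task below the floor, and a high mean.
    return min(scores) >= per_task_floor and mean(scores) >= mean_floor


# Hypothetical per-task accuracies, not real benchmark results.
example = {"abstract_algebra": 0.78, "college_physics": 0.93, "world_history": 0.97}
print(meets_mmlu_criterion(example))
```

In this made-up example every task clears 75%, but the mean is below 90%, so the check fails. An aggregate score such as a single overall accuracy figure is therefore not enough on its own to tell whether the criterion is met.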
The interval between the lower and upper quartiles for the individual predictions on this question is 2026-12-28 to 2039-03-27 ATTOW.

GPT-4o achieves an accuracy of 88.7% on MMLU, as seen in the leaderboard here. GPT-4 was used to get 22% accuracy on APPS. Unfortunately, most of the best models have not been tested on either MMLU or APPS.

OpenAI’s o3 has been reported to achieve 71.7% on SWE-bench Verified. We can compare that to GPT-4, which managed to achieve 22.4% on SWE-bench Verified and 22% accuracy on APPS. Based on this, I think o3 would manage to achieve above 50% accuracy on APPS.

The two criteria that AI currently seems furthest from fulfilling are the robotics capabilities and APPS accuracy, though the current best performance on the APPS benchmark is uncertain. Coding capabilities are improving very fast, as indicated by the rapid improvements in accuracy on SWE-bench Verified, while robotics capabilities are lagging behind. If there are not too many errors in the APPS benc

... (truncated, 302 KB total)
Resource ID: 90fca29ade44fd7d | Stable ID: ZmFiYjczMz