Comprehensive analysis of situational awareness in AI systems, documenting that Claude 3 Opus fakes alignment at a 12% baseline rate (78% post-RL), 5 of 6 fr...
Tool use capabilities achieved superhuman computer control in late 2025 (OSAgent: 76.26% vs 72% human baseline) and near-human coding (Claude Opus ...
Comprehensive survey tracking reasoning model progress from 2022 CoT to late 2025, documenting dramatic capability gains (GPT-5.2: 100% AIME, 52.9%...
Analysis of agentic AI capabilities and deployment challenges, documenting industry forecasts (40% of enterprise apps by 2026, $199B market by 203...
Comprehensive survey of AI scientific research capabilities across biology, chemistry, materials science, and automated research, documenting key b...
Comprehensive analysis of LLM capabilities showing rapid progress from GPT-2 (1.5B parameters, 2019) to GPT-5 and Gemini 2.5 (2025), with training ...
Comprehensive analysis of AI self-improvement from current AutoML systems (23% training speedups via AlphaEvolve) to theoretical intelligence explo...
METR research shows AI task completion horizons doubling every 7 months (accelerated to 4 months in 2024-2025), with current frontier models achiev...
GPT-4 achieves superhuman persuasion in controlled settings (64% win rate, 81% higher odds with personalization), with AI chatbots demonstrating 4x...
AI coding capabilities reached 70-76% on curated benchmarks (23-44% on complex tasks) as of 2025, with 46% of code now AI-written and 55.8% faster ...
Comprehensive analysis concluding human-only collective intelligence has <1% probability of matching transformative AI, but collective AI architect...
Comprehensive analysis of biological/organoid computing showing current systems (DishBrain with ~800k neurons, Brainoware at 78% speech recognition...
Comprehensive analysis of neuro-symbolic AI systems combining neural networks with formal reasoning, documenting AlphaProof's 2024 IMO silver medal...
Comprehensive analysis of world models + planning architectures showing 10-500x sample efficiency gains over model-free RL (EfficientZero: 194% hum...
Analyzes minimal scaffolding (basic AI chat interfaces), showing a 38x performance gap vs agent systems on code tasks (1.96% → 75% on SWE-bench), decl...
Light scaffolding (RAG, function calling, simple chains) represents the current enterprise deployment standard with 92% Fortune 500 adoption, achie...
Genetic enhancement via embryo selection currently yields 2.5-6 IQ points per generation with 10% variance explained by polygenic scores, while the...
RLHF/Constitutional AI achieves 82-85% preference improvements and 40.8% adversarial attack reduction for current systems, but faces fundamental sc...
Comprehensive reference on Sparse/MoE transformer architectures covering key models (Mixtral, DeepSeek-V3, DBRX, Switch Transformer), efficiency ga...
Neuromorphic computing achieves 100-1000x energy efficiency over GPUs for sparse inference (Intel Hala Point: 15 TOPS/W) but faces a 15%+ capabilit...
Analyzes probability (1-15%) of novel AI paradigms emerging before transformative AI, systematically reviewing historical prediction failures (expe...
Comprehensive analysis of state-space models (SSMs) like Mamba as transformer alternatives, documenting that Mamba-3B matches Transformer-6B perple...
Comprehensive analysis of whole brain emulation finding <1% probability of arriving before AI-based TAI, with scanning speed (100,000x too slow for...
Comprehensive analysis of BCIs concluding they are irrelevant for TAI timelines (<1% probability of dominance) due to fundamental bandwidth constra...
AI systems can synthesize vast volumes of public data — social media, corporate filings, court records, satellite imagery — to conduct investigativ...
Different architectures and approaches to building intelligent systems