AI Capabilities
Overview
This section documents the evolving capabilities of AI systems and their implications for safety. Understanding what AI can do—and what it might soon be able to do—is essential for anticipating risks and designing appropriate safeguards.
Capability Domains
Core Capabilities
Language Models - Text generation and understanding
Reasoning - Multi-step logical inference
Coding - Software development assistance
Emerging Capabilities
Agentic AI - Autonomous goal-directed behavior
Long-Horizon Planning - Extended strategic reasoning
Tool Use - Interfacing with external systems
Safety-Critical Capabilities
Situational Awareness - Understanding of own context
Self-Improvement - Recursive capability enhancement
Persuasion - Influencing human decisions
Applied Capabilities
Scientific Research - Accelerating discovery
Persuasion - Influence and manipulation potential
Why Capabilities Matter for Safety
Capability levels determine:
Which risks become active - Many risks only emerge at certain capability thresholds
How much time remains - Faster capability growth compresses safety timelines
What interventions are viable - Some approaches only work before certain capabilities emerge
Capability Profiles Include
Current state - What models can do today
Trajectory - How capability is improving
Safety implications - What risks this enables
Measurement approaches - How to evaluate this capability