Knowledge Base

Overview

The LongtermWiki Knowledge Base provides structured documentation of the AI safety landscape, covering risks, interventions, organizations, and key debates. Content is organized to help researchers, funders, and policymakers understand the current state of AI safety and make informed decisions about resource allocation.

Content Categories

Risks

Documentation of potential failure modes and hazards from advanced AI systems, organized by type:

Accident Risks - Unintended failures like scheming, deceptive alignment, mesa-optimization
Misuse Risks - Deliberate harmful applications like bioweapons, cyberweapons, disinformation
Structural Risks - Systemic issues like racing dynamics, concentration of power, lock-in
Epistemic Risks - Threats to knowledge and truth like authentication collapse, trust erosion

Responses

Interventions and approaches to address AI risks:

Technical Alignment - Interpretability, RLHF, constitutional AI, AI control
Governance - Compute governance, international coordination, legislation
Institutional - AI safety institutes, standards bodies
Epistemic Tools - Prediction markets, content authentication, coordination technologies

Models

Analytical frameworks for understanding AI risk dynamics:

Framework Models - Carlsmith's six premises, instrumental convergence
Risk Models - Deceptive alignment decomposition, scheming likelihood
Dynamics Models - Racing dynamics impact, feedback loops
Societal Models - Trust erosion, lock-in mechanisms

Organizations

Profiles of key actors in AI development and safety:

AI Labs - OpenAI, Anthropic, DeepMind, xAI
Safety Research Orgs - MIRI, ARC, Redwood, Apollo Research
Government Bodies - US AISI, UK AISI

People

Profiles of influential researchers and leaders in AI safety.

Capabilities

Documentation of AI capability domains and their safety implications.

Debates

Structured analysis of key disagreements in the field.

Cruxes

Key uncertainties that drive disagreement and prioritization decisions.

How to Use This Knowledge Base

Exploring risks: Start with the scheming page for the most discussed risk, then browse related accident risks
Understanding responses: See interpretability for a well-documented technical approach
Analytical depth: The Carlsmith six-premise model provides a rigorous framework for AI risk estimation
Browse everything: Use the Browse page to search and filter all entries

Quality Indicators

Pages include quality and importance ratings:

Quality (0-100): How well-developed and accurate the content is
Importance (0-100): How significant the topic is for AI safety decisions

Pages whose quality rating is lower than their importance rating are treated as high priority and are actively being improved.
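The rule above (a page is high priority when its quality is below its importance) can be sketched in a few lines of Python. This is a minimal illustration, not the wiki's actual implementation: the `PageRating` class, the `high_priority` function, the example page titles, and their ratings are all hypothetical. Ordering flagged pages by the size of the gap is an added assumption, since the source does not say how they are ranked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageRating:
    """Hypothetical container for the two ratings described above."""
    title: str
    quality: int     # 0-100: how well-developed and accurate the content is
    importance: int  # 0-100: how significant the topic is for AI safety decisions

    def __post_init__(self):
        # Both ratings are documented as 0-100, so reject anything outside that range.
        for name in ("quality", "importance"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be in 0-100, got {value}")

    @property
    def is_high_priority(self) -> bool:
        # The rule from the text: quality < importance.
        return self.quality < self.importance


def high_priority(pages):
    """Return pages with quality < importance, largest gap first (ordering is an assumption)."""
    flagged = [p for p in pages if p.is_high_priority]
    return sorted(flagged, key=lambda p: p.importance - p.quality, reverse=True)


# Made-up ratings purely for illustration; these are not real wiki pages.
pages = [
    PageRating("Example page A", quality=40, importance=90),
    PageRating("Example page B", quality=80, importance=60),
    PageRating("Example page C", quality=55, importance=70),
]

for page in high_priority(pages):
    print(page.title, page.importance - page.quality)
```

With these made-up ratings, pages A and C are flagged and page B is not, because its quality already exceeds its importance.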

Related Wiki Pages

Top Related Pages

Cyberweapons Risk

AI Governance Coordination Technologies

Bioweapons Risk

AI Control

Scheming

Example page from each category, with its truncated summary and quality rating:

Risks: Scheming (Risk) - Scheming, strategic AI deception during training, has transitioned from theoretical concern to observed behavior across all major frontier models (o1: 37% alignment faking, Claude: 14% harmful compli... Quality: 74/100
Responses: Interpretability (Research Area) - Mechanistic interpretability has extracted 34M+ interpretable features from Claude 3 Sonnet with 90% automated labeling accuracy and demonstrated 75-85% success in causal validation, though less th... Quality: 66/100
Models: Carlsmith's Six-Premise Argument (Analysis) - Carlsmith's framework decomposes AI existential risk into six conditional premises (timelines, incentives, alignment difficulty, power-seeking, disempowerment scaling, catastrophe), yielding ~5% ri... Quality: 65/100
Organizations: OpenAI (Organization) - Comprehensive organizational profile of OpenAI documenting evolution from 2015 non-profit to Public Benefit Corporation, with detailed analysis of governance crisis, 2024-2025 ownership restructuri... Quality: 62/100
People: Paul Christiano (Person) - Comprehensive biography of Paul Christiano documenting his technical contributions (IDA, debate, scalable oversight), risk assessment (~10-20% P(doom), AGI 2030s-2040s), and evolution from higher o... Quality: 39/100
Capabilities: Agentic AI (Capability) - Analysis of agentic AI capabilities and deployment challenges, documenting industry forecasts (40% of enterprise apps by 2026, $199B market by 2034) alongside implementation difficulties (40%+ proj... Quality: 68/100
Debates: Is AI Existential Risk Real? (Crux) - Covers the foundational AI x-risk debate across four core cruxes: instrumental convergence, warning sign availability, corrigibility achievability, and timeline urgency. Incorporates quantitative e... Quality: 12/100
Cruxes: AI Accident Risk Cruxes (Crux) - Comprehensive survey of AI safety researcher disagreements on accident risks, quantifying probability ranges for mesa-optimization (15-55%), deceptive alignment (15-50%), and P(doom) (5-35% median ... Quality: 67/100
