Analytical Models
Overview
This section contains analytical models that provide structured ways to think about AI risks, their interactions, and potential interventions. These models help quantify uncertainties, map causal relationships, and identify leverage points.
Model Categories
Framework Models
Foundational frameworks for AI risk analysis:
- Carlsmith's Six Premises - Probability decomposition for AI x-risk
- Instrumental Convergence Framework - Why AI might seek power
- Defense in Depth Model - Layered safety approaches
- Capability Threshold Model - When risks become acute
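The Defense in Depth model's core arithmetic is worth making concrete: independent safety layers multiply their failure probabilities, but correlated failures (the model cites deceptive alignment as a source of correlation around ρ≈0.4-0.5) erode most of the benefit. A minimal sketch with illustrative layer numbers and a deliberately crude interpolation between independence and perfect correlation:

```python
# Sketch of defense-in-depth arithmetic. Layer failure probabilities and
# the interpolation rule are illustrative, not the model's exact method.

def independent_failure(layer_failure_probs):
    """P(all layers fail) assuming layers fail independently."""
    p = 1.0
    for f in layer_failure_probs:
        p *= f
    return p

def correlated_failure_bound(layer_failure_probs, rho):
    """Crude interpolation between independence (rho=0) and perfect
    correlation (rho=1), where only the strongest layer matters."""
    indep = independent_failure(layer_failure_probs)
    best_layer = min(layer_failure_probs)  # lowest individual failure rate
    return (1 - rho) * indep + rho * best_layer

layers = [0.4, 0.3, 0.2]  # three layers in the model's 20-60% failure range
print(round(independent_failure(layers), 4))          # 0.024 if independent
print(round(correlated_failure_bound(layers, 0.45), 4))  # 0.1032 with rho=0.45
```

Even modest correlation moves the combined failure rate from a few percent back toward the strongest single layer, which is why the model treats correlated failure modes as the central threat to layered safety.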
Risk Models
Models of specific risk mechanisms:
- Scheming Likelihood Model - When AI might deceive
- Deceptive Alignment Decomposition - Components of deception risk
- Mesa-Optimization Analysis - Inner optimizer emergence
- Power-Seeking Conditions - When power-seeking emerges
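Several of these models share one structure: overall risk is a product of conditional probabilities. A minimal sketch of the five-condition chain used by the deceptive alignment decomposition, with illustrative placeholder probabilities rather than the model's published estimates:

```python
# Sketch of a multiplicative risk decomposition. Condition names follow the
# deceptive-alignment model; the probabilities are illustrative placeholders.
from math import prod

conditions = {
    "mesa_optimizer_emerges": 0.4,  # P(a learned optimizer arises)
    "goal_is_misaligned":     0.5,  # conditional on the previous condition
    "situationally_aware":    0.5,
    "chooses_deception":      0.5,
    "deception_survives":     0.5,  # persists through training to deployment
}

overall = prod(conditions.values())
print(f"P(deceptive alignment) = {overall:.3f}")  # 0.025 with these inputs
```

The product structure gives every condition equal multiplicative leverage: halving any single factor halves the total, which is why these models emphasize finding the condition that is cheapest to intervene on.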
Dynamics Models
Models of how factors evolve and interact:
- Racing Dynamics Impact - Competition effects on safety
- Feedback Loops - Self-reinforcing dynamics
- Risk Interaction Matrix - How risks compound
- Lab Incentives Model - What drives lab behavior
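The feedback-loop model's headline rates (capabilities growing ~2.5x/year versus safety at ~1.2x/year) compound dramatically. A short sketch of that gap, assuming the rates stay constant over an illustrative 10-year horizon:

```python
# Sketch of the compounding capability/safety gap. Growth rates come from
# the feedback-loop model's summary; the 10-year horizon is illustrative.
cap_rate, safety_rate = 2.5, 1.2
capability = safety = 1.0
for year in range(10):
    capability *= cap_rate
    safety *= safety_rate

gap = capability / safety
print(f"capability/safety gap after 10 years: ~{gap:.0f}x")  # roughly 1500x
```

Constant exponential rates are a strong assumption, but the sketch shows why even a modest per-year differential dominates on decade timescales.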
Societal Models
Models of broader societal impacts:
- Trust Erosion Dynamics - How trust degrades
- Lock-in Mechanisms - What creates irreversibility
- Expertise Atrophy Progression - Skill loss trajectories
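The trust-erosion model's central asymmetry (trust erodes 3-10x faster than it rebuilds) can be sketched with a simple update rule. The shock frequency and rates here are illustrative, not the model's calibrated values:

```python
# Sketch of asymmetric trust dynamics: sharp multiplicative erosion on
# shocks, slow proportional recovery otherwise. All rates are illustrative.

def step(trust, shock, erode_rate=0.30, rebuild_rate=0.05):
    """One period: a shock erodes trust sharply; otherwise it rebuilds slowly."""
    if shock:
        return trust * (1 - erode_rate)
    return min(1.0, trust + rebuild_rate * (1 - trust))

trust = 0.8
for period in range(20):
    trust = step(trust, shock=(period % 5 == 0))  # one shock every 5 periods

print(f"trust after 20 periods: {trust:.2f}")  # ends well below the 0.8 start
```

Because each shock removes a fraction of the current level while recovery only closes a fraction of the remaining gap, periodic shocks ratchet trust downward even when most periods are calm.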
Intervention Models
Models for evaluating and prioritizing responses:
- Intervention Effectiveness Matrix - Comparing approaches
- Safety Research Value - Research prioritization
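The intervention matrix's misallocation claim is, at bottom, a per-dollar comparison. A sketch of that ranking logic with hypothetical funding and effectiveness figures, only loosely echoing the model's summary:

```python
# Sketch of ranking interventions by effectiveness per dollar. The names,
# funding levels, and effectiveness scores are hypothetical illustrations.

interventions = [
    # (name, annual funding in $M, estimated effectiveness 0-1)
    ("RLHF-style methods",   400, 0.15),
    ("interpretability",     100, 0.35),
    ("alignment theory",      50, 0.45),
    ("governance/standards",  60, 0.40),
]

ranked = sorted(interventions, key=lambda x: x[2] / x[1], reverse=True)
for name, funding, eff in ranked:
    print(f"{name:22s} {eff / funding * 1000:.1f} effectiveness per $B")
```

Under these made-up numbers the best-funded line item ranks last per dollar, which is the shape of the misallocation argument: marginal returns, not total effectiveness, drive the recommended reallocation.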
Using These Models
Models include:
- Quantitative estimates with uncertainty ranges
- Causal diagrams showing factor relationships
- Scenario analysis exploring different assumptions
- Key cruxes that most affect conclusions
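Uncertainty ranges like those above can be combined by Monte Carlo sampling rather than point estimates. A minimal sketch, with made-up factor ranges:

```python
# Sketch of propagating uncertainty ranges through a multiplicative model:
# sample each factor uniformly from its range, multiply, report the spread.
# The three factor ranges are illustrative placeholders.
import random

random.seed(0)
ranges = [(0.2, 0.8), (0.3, 0.7), (0.1, 0.9)]

samples = []
for _ in range(10_000):
    product = 1.0
    for low, high in ranges:
        product *= random.uniform(low, high)
    samples.append(product)

samples.sort()
lo, med, hi = samples[500], samples[5000], samples[9500]
print(f"90% interval: {lo:.3f} - {hi:.3f}, median {med:.3f}")
```

The resulting interval is typically much wider than multiplying the midpoints would suggest, which is why the models report ranges and flag the cruxes that dominate the spread.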
See individual model pages for detailed methodology and limitations.