AI Risks
Overview
This section documents the potential risks from advanced AI systems, organized into four major categories based on the source and nature of the risk.
Risk Categories
Unintended failures from AI systems pursuing misaligned goals:
- Scheming - AI strategically concealing misaligned goals
- Deceptive Alignment - Models appearing aligned during training
- Mesa-Optimization - Learned optimizers with misaligned objectives
- Goal Misgeneralization - Learned goals that generalize incorrectly outside the training distribution
- Power-Seeking - Instrumental convergence toward acquiring resources
Deliberate harmful applications of AI capabilities:
- Bioweapons - AI-assisted biological weapon development
- Cyberweapons - Automated cyber attacks and vulnerability exploitation
- Disinformation - Large-scale manipulation campaigns
- Autonomous Weapons - Lethal autonomous systems
Systemic issues from how AI development is organized:
- Racing Dynamics - Competitive pressure reducing safety investment
- Concentration of Power - Dangerous accumulation of AI capabilities
- Lock-in - Irreversible entrenchment of values or structures
- Economic Disruption - Labor market and economic instability
Threats to society’s ability to know and reason:
- Trust Decline - Erosion of institutional and interpersonal trust
- Authentication Collapse - Inability to verify authentic content
- Expertise Atrophy - Loss of human capability through AI dependence
How Risks Connect
Many risks interact and compound. For example:
- Racing dynamics → reduced safety testing → higher accident risk
- Disinformation → trust decline → reduced coordination capacity
- Power concentration → lock-in potential → governance failures
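The compounding chains above can be viewed as a small directed graph, where an edge from risk A to risk B means "A exacerbates B"; tracing reachability then surfaces indirect, compounded effects. A minimal sketch (all node names and edges are illustrative, taken only from the example chains above):

```python
# Illustrative only: edges encode "this risk exacerbates that one",
# using the example chains from the list above.
edges = {
    "racing dynamics": ["reduced safety testing"],
    "reduced safety testing": ["higher accident risk"],
    "disinformation": ["trust decline"],
    "trust decline": ["reduced coordination capacity"],
    "power concentration": ["lock-in potential"],
    "lock-in potential": ["governance failures"],
}

def downstream(risk: str) -> set[str]:
    """Return every risk reachable from `risk` via exacerbation edges."""
    seen: set[str] = set()
    stack = [risk]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

print(sorted(downstream("racing dynamics")))
# ['higher accident risk', 'reduced safety testing']
```

A fuller interaction matrix would add cross-category edges (for example, disinformation feeding racing dynamics), at which point reachability analysis like this becomes more informative than reading chains in isolation.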
See the Risk Interaction Matrix for detailed analysis.