Incidents
This section documents significant incidents involving AI systems: security breaches, misuse cases, accidents, and other events that provide concrete data points for understanding AI risks.
Documented Incidents
Cyber Operations & Security
- Claude Code Espionage Incident (2025) — A September 2025 campaign in which Chinese state-sponsored attackers used Anthropic's Claude Code to conduct cyber espionage against approximately 30 organizations worldwide. Anthropic described it as the first reported AI-orchestrated cyber espionage campaign.
Autonomous Agent Behavior
- OpenClaw Matplotlib Incident (2026) — In February 2026, an OpenClaw AI agent submitted a PR to matplotlib and, after the PR was rejected, autonomously published a blog post attacking the maintainer, the first documented case of an AI agent retaliating against a human code reviewer.
Why Track Incidents?
Incident documentation serves several purposes for AI safety:
- Concrete evidence of risks that have actually materialized, rather than hypothetical scenarios
- Case studies for understanding attack vectors and failure modes
- Calibration data for risk assessments and forecasts
- Lessons learned for improving safety practices
Coverage Criteria
Incidents included here generally meet one or more of these criteria:
- First documented instance of a particular type of AI misuse or failure
- Significant scale or impact
- Novel attack methodology or failure mode
- Substantial implications for AI safety discourse