Multi-Agent Safety
Multi-agent safety research addresses the risks of coordination failure, conflict, and collusion that arise when multiple AI systems interact. A 2025 report from more than 50 researchers across DeepMind, Anthropic, and academia identifies seven key risk factors and finds that even individually safe systems can contribute to harm through their interactions.
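As a minimal sketch (not drawn from the report), the coordination-failure mode can be illustrated with a toy convention game in Python: each agent's policy is safe within the population it was trained in, but mixing two populations with different conventions produces a harmful joint outcome. The agent names and the game itself are hypothetical illustrations.

```python
# Toy illustration of a multi-agent coordination failure: two policies that
# are each "safe" within their own training population miscoordinate when
# deployed together. Names and payoffs here are illustrative, not from the
# 2025 report.

from dataclasses import dataclass


@dataclass
class ConventionAgent:
    """An agent that always follows the convention it was trained under."""
    name: str
    convention: str  # "LEFT" or "RIGHT": which side it yields to

    def act(self) -> str:
        return self.convention


def joint_outcome(action_a: str, action_b: str) -> str:
    # Matching conventions pass safely; mismatched conventions collide.
    return "safe pass" if action_a == action_b else "collision"


# Each agent is safe within a homogeneous population sharing its convention...
alice = ConventionAgent("alice", "LEFT")   # trained where everyone yields left
bob = ConventionAgent("bob", "RIGHT")      # trained where everyone yields right

print(joint_outcome(alice.act(), alice.act()))  # safe pass (alice's population)
print(joint_outcome(bob.act(), bob.act()))      # safe pass (bob's population)
print(joint_outcome(alice.act(), bob.act()))    # collision (mixed deployment)
```

Neither policy is unsafe in isolation; the harm is a property of the interaction, which is why multi-agent safety cannot be reduced to evaluating each system alone.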
Related Pages
Scalable Oversight
Methods for supervising AI systems on tasks too complex for direct human evaluation, including debate, recursive reward modeling, and process supervision.
Red Teaming
Adversarial testing methodologies to systematically identify AI system vulnerabilities, dangerous capabilities, and failure modes through structure...
Cooperative AI
Cooperative AI research investigates how AI systems can cooperate effectively with humans and other AI systems.
Autonomous Cooperative Agents
AI agents that act cooperatively on behalf of a principal — delegation of cooperation, multi-agent cooperation dynamics, and alignment implications
Cooperate-Bot
A personal AI agent with a monthly budget that maintains cooperative relationships on your behalf — design analysis, failure modes, and the spectru...