Emerging capabilities and techniques in AI systems. Unlike architecture choices, these are not mutually exclusive: a single system can exhibit several of these innovations at once. The Safety column gives the assessed safety outlook for each innovation as a score out of 10.

| Innovation | Interpretability | Channel | Safety | Prevalence | Timeline | Tractability | Key Risks | Key Opportunities |
|---|---|---|---|---|---|---|---|---|
| Neuralese | Opaque | Internal | 3/10 (challenging) | 60-80% | Now (dominant) | LOW | | |
| Interpretable-by-Design Architectures | Inherent | Human-Facing | 8/10 (favorable) | 5-15% | 2027+ | MEDIUM | | |
| Shared Latent Spaces | Opaque | AI-AI Overt | 3/10 (challenging) | 30-50% | 2025-2030 | LOW | | |
| Chain-of-Thought Reasoning | Hybrid | Human-Facing | 6/10 (mixed) | 20-40% | 2025-2030 | HIGH | | |
| Steganographic Capacity | Opaque | AI-AI Covert | 2/10 (challenging) | 30-50% | 2025-2030 | MEDIUM | | |
| Emergent Communication Protocols | Opaque | AI-AI Covert | 2/10 (challenging) | 20-40% | 2026-2032 | LOW | | |
| Mechanistic Interpretability | Post-hoc | Internal | 5/10 (mixed) | 40-60% | Now - 2030 | HIGH | | |
| Situational Awareness | Opaque | Internal | 2/10 (challenging) | 50-70% | Now (emerging) | MEDIUM | | |
| Explicit World Models | Hybrid | Internal | 7/10 (favorable) | 15-30% | 2027-2035 | MEDIUM | | |