Paradigms for transformative intelligence. Structure: We separate deployment patterns (minimal → heavy scaffolding) from base architectures (transformers, SSMs, etc.). These are orthogonal - real systems combine both. E.g., "Heavy scaffolding + MoE transformer" is one concrete system.
Key insight: Scaffold code is actually more interpretable than model internals. We can read and verify orchestration logic; we can't read transformer weights.
Probability this becomes dominant at TAI | Trend | Overall safety assessment | Interpretability of internals | Training approach | Behavior predictability | Component separation | Formal verification possible | Key Papers | Labs | Safety Pros | Safety Cons | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Direct model API/chat with basic prompting. No persistent memory, minimal tools. Like ChatGPT web interface. | Deployment Patterns | 5-15% Unlikely to stay dominant - scaffolding adds clear value | (illustrative) | 5/10 MixedEasy to study but limited interpretability; low capability ceiling reduces risk | LOW Model internals opaque; just see inputs/outputs | HIGH Standard RLHF on base model | MEDIUM Single forward pass, somewhat predictable | LOW Monolithic model | LOW Model itself unverifiable | + Simple to analyze + No tool access = limited harm | − Model internals opaque − Limited capability ceiling | ||
Model + basic tool use + simple chains. RAG, function calling, single-agent loops. Like GPT with plugins. | Deployment Patterns | 15-25% Current sweet spot; but heavy scaffolding catching up | (illustrative) | 5/10 MixedTool use adds capability and risk; scaffold provides some inspection | MEDIUM Scaffold code readable; model still opaque | HIGH Model trained; scaffold is code | MEDIUM Tool calls add some unpredictability | MEDIUM Clear tool boundaries | PARTIAL Scaffold code can be verified | + Scaffold logic inspectable + Tool permissions controllable | − Tool use enables real-world harm − Model decisions still opaque | ||
Multi-agent systems, complex orchestration, persistent memory, autonomous operation. Like Claude Code, Devin. | Deployment Patterns | 25-40% Strong trend; scaffolding getting cheaper and more valuable | (illustrative) | 4/10 ChallengingHigh capability with emergent behavior; scaffold helps but autonomy is risky | MEDIUM-HIGH Scaffold code fully readable; model calls are black boxes | LOW Models trained separately; scaffold is engineered code | LOW Multi-step plans diverge unpredictably | HIGH Explicit component architecture | PARTIAL Scaffold verifiable; model calls not | + Scaffold code auditable + Can add safety checks in code + Modular | − Emergent multi-step behavior − Autonomous = less oversight − Tool use risk | ||
Standard transformer architecture. All parameters active. Current GPT/Claude/Llama architecture. | Base Architectures | (base arch) Orthogonal to deployment - combined with scaffolding choices | (illustrative) | 5/10 MixedMost studied but still opaque; interpretability improving but slowly | LOW Weights exist but mech interp still primitive | HIGH Well-understood pretraining + RLHF | LOW-MED Emergent capabilities, phase transitions | LOW Monolithic, end-to-end trained | LOW Billions of parameters, no formal guarantees | + Most studied architecture + Some interp tools exist | − Internals still opaque − Emergent deception possible − Scale makes analysis hard | ||
Mixture-of-Experts or other sparse architectures. Only subset of params active per token. | Base Architectures | (base arch) May become default for efficiency; orthogonal to scaffolding | (illustrative) | 4/10 MixedEfficiency gains good for safety research budget, but routing adds complexity | LOW Same opacity as dense + routing complexity | HIGH Standard + load balancing | LOW Routing adds another layer of unpredictability | MEDIUM Expert boundaries exist but interact | LOW Combinatorial explosion of expert paths | + Can study individual experts + More efficient = more testing budget | − Routing is another black box − Hard to cover all expert combinations | ||
State-space models or SSM-transformer hybrids with linear-time inference. | Base Architectures | 5-15% Promising efficiency but transformers still dominate benchmarks | (illustrative) | Unknown Too early to assess; different internals may help or hurt | MEDIUM Different internals, less studied | HIGH Still gradient-based | MEDIUM Recurrence adds complexity | LOW Similar to transformers | UNKNOWN Recurrence may help or hurt | CartesiaTogether AIPrinceton | + More efficient + Linear complexity | − Interp tools don't transfer − Less studied | |
Explicit learned world model with search/planning. More like AlphaGo than GPT. | Base Architectures | 5-15% LeCun advocates; not yet competitive for general tasks | (illustrative) | 6/10 MixedExplicit structure helps inspection but goal misgeneralization risks higher | PARTIAL World model inspectable but opaque | HIGH Model-based RL, self-play | MEDIUM Explicit planning but model errors compound | MEDIUM Separate world model, policy, value | PARTIAL Planning verifiable, world model less so | + Explicit goals + Can inspect beliefs | − Goal misgeneralization − Mesa-optimization | ||
Neural + symbolic reasoning, knowledge graphs, or program synthesis. | Base Architectures | 3-10% Long-promised, rarely delivered at scale | (illustrative) | 7/10 FavorableSymbolic components enable formal verification; hybrid boundaries a challenge | PARTIAL Symbolic parts clear, neural parts opaque | COMPLEX Neural trainable, symbolic often hand-crafted | MEDIUM Explicit reasoning more auditable | HIGH Clear neural/symbolic separation | PARTIAL Symbolic parts formally verifiable | Neural Theorem Provers AlphaProof (2024) | + Auditable reasoning + Formal verification possible | − Brittleness − Hard to scale − Boundary problems | |
Formally verified AI with mathematical safety guarantees. Davidad's agenda. | Base Architectures | 1-5% Ambitious; unclear if achievable for general capabilities | (illustrative) | 9/10 FavorableIf achievable, best safety properties by design; uncertainty about feasibility | HIGH Designed for formal analysis | DIFFERENT Verified synthesis, not just SGD | HIGH Behavior bounded by proofs | HIGH Compositional by design | HIGH This is the point | ARIA (Davidad)MIRI | + Mathematical guarantees + Auditable by construction | − May not scale − Capability tax − World model verification hard | |
Something we haven't thought of yet. Placeholder for model uncertainty. | Base Architectures | 5-15% Epistemic humility; history suggests surprises | (illustrative) | Unknown Cannot assess; all current safety research may or may not transfer | ??? Depends on what emerges | ??? Unknown | ??? No basis for prediction | ??? Unknown | ??? Unknown | None listed | Unknown | + Fresh start possible | − All current work may not transfer |
Actual biological neurons, brain organoids, or wetware computing. | Alternative Compute | <1% Fascinating but far from TAI-relevant scale | (illustrative) | 3/10 ChallengingDeeply opaque; no existing safety tools apply; ethical complexities | LOW Biological systems inherently opaque | UNKNOWN Biological learning rules | LOW Noisy and variable | LOW Highly interconnected | LOW Too complex | Cortical LabsVarious academic | + May have human-like values + Energy efficient | − Ethical concerns − No interp tools − Slow iteration | |
Spiking neural networks on specialized chips. Event-driven, analog. | Alternative Compute | 1-3% Efficiency gains real but not on path to TAI | (illustrative) | Unknown Different substrate with different properties; too early to assess | PARTIAL Architecture known, dynamics complex | DIFFERENT Spike-timing plasticity | MEDIUM More brain-like | MEDIUM Modular chip designs possible | LOW Analog dynamics hard to verify | Intel LabsIBM ResearchSynSense | + Energy efficient + Robust | − Current tools don't transfer − Less mature | |
Upload/simulate a complete biological brain at sufficient fidelity. Requires scanning + simulation tech. | Non-AI Paradigms | <1% Probably slower than AI; scanning tech far away | (illustrative) | 5/10 MixedHuman values by default, but speed-up and copy-ability create novel risks | LOW Brain structure visible but not interpretable | N/A Copied from biological learning | LOW Human-like = unpredictable | LOW Brains are highly interconnected | LOW Too complex, poorly understood | CarboncopiesAcademic neuroscience | + Human values by default + Understood entity type | − Ethics of copying minds − Could run faster than real-time − Identity issues | |
IQ enhancement via embryo selection, polygenic screening, or direct genetic engineering. | Non-AI Paradigms | <0.5% Too slow for TAI race; incremental gains only | (illustrative) | 7/10 FavorableSlow and controllable; enhanced humans still have human values | LOW Genetic effects poorly understood | N/A Biological development | MEDIUM Still human, but smarter | LOW Integrated biological system | LOW Biological complexity | Genomic PredictionAcademic genetics | + Human values + Slow/controllable + Socially legible | − Ethical concerns − Too slow to matter for TAI − Inequality risks | |
Neural interfaces that augment human cognition with AI/compute. Neuralink-style. | Non-AI Paradigms | <1% Bandwidth limits; AI likely faster standalone | (illustrative) | 5/10 MixedHuman oversight built-in, but security risks and bandwidth limits | PARTIAL Interface visible, brain opaque | HYBRID Human learning + AI training | LOW Human in the loop = unpredictable | MEDIUM Clear human/AI boundary | LOW Human component unverifiable | NeuralinkSynchronBrainGate | + Human oversight built-in + Gradual augmentation | − Bandwidth limits − Security risks − Human bottleneck | |
Human-AI teams, prediction markets, deliberative democracy augmented by AI. Intelligence from coordination. | Non-AI Paradigms | (overlay) Not exclusive; already happening | (illustrative) | 7/10 FavorableHuman oversight natural; slower pace; but coordination challenges | PARTIAL Process visible, emergent behavior less so | N/A Coordination protocols, not training | MEDIUM Depends on protocol design | HIGH Explicitly modular by design | PARTIAL Protocols can be analyzed | Collective Intelligence papers | + Human oversight + Diverse perspectives + Slower = more controllable | − Coordination failures − Vulnerable to manipulation − May not scale |