What Failure Looks Like
blog
Author
paulfchristiano
Credibility Rating
3/5
Good (3): Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: Alignment Forum
Data Status
Not fetched
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Paul Christiano | Person | 39.0 |
| AI Doomer Worldview | Concept | 38.0 |
Cached Content Preview
HTTP 200 | Fetched Feb 26, 2026 | 65 KB

[What failure looks like](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#)
10 min read
- [Part I: You get what you measure](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#Part_I__You_get_what_you_measure)
- [Part II: influence-seeking behavior is scary](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#Part_II__influence_seeking_behavior_is_scary)
[Best of LessWrong 2019](https://www.alignmentforum.org/bestoflesswrong?year=2019&category=all)
[AI Risk](https://www.alignmentforum.org/w/ai-risk) [Threat Models (AI)](https://www.alignmentforum.org/w/threat-models-ai) [AI Takeoff](https://www.alignmentforum.org/w/ai-takeoff) [More Dakka](https://www.alignmentforum.org/w/more-dakka) [AI](https://www.alignmentforum.org/w/ai) [World Modeling](https://www.alignmentforum.org/w/world-modeling) [World Optimization](https://www.alignmentforum.org/w/world-optimization) [Curated](https://www.alignmentforum.org/recommendations)
# 106
# [What failure looks like](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like)
by [paulfchristiano](https://www.alignmentforum.org/users/paulfchristiano?from=post_header)
17th Mar 2019
[55](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#comments)
[Review by orthonormal](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#dFcfbCL5xW6SfPRqo)
The stereotyped image of AI catastrophe is a powerful, malicious AI system that takes its creators by surprise and quickly achieves a decisive advantage over the rest of humanity.
I think this is probably not what failure will look like, and I want to try to paint a more realistic picture. I’ll tell the story in two parts:
- **Part I**: machine learning will increase our ability to “get what we can measure,” which could cause a slow-rolling catastrophe. ("Going out with a whimper.")
- **Part II**: ML training, like competitive economies or natural ecosystems, can give rise to “greedy” patterns that try to expand their own influence. Such patterns can ultimately dominate the behavior of a system and cause sudden breakdowns. ("Going out with a bang," an instance of [optimization daemons](https://www.alignmentforum.org/w/daemons).)
I think these are the most important problems if we fail to solve [intent alignment](https://ai-alignment.com/clarifying-ai-alignment-cec47cd69dd6).
In practice these problems will interact with each other, and with other disruptions/instability caused by rapid progress. These problems are worse in worlds where progress is relatively fast, and fast takeoff can be a key risk factor, but I’m scared even if we have several ye
... (truncated, 65 KB total)
Resource ID: 6807a8a8f2fd23f3 | Stable ID: OGFlNWM2Yz