MIRI/Open Philanthropy exchange on decision theory
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: Alignment Forum
This exchange is a rare public record of institutional disagreement between MIRI and Open Philanthropy on decision theory, making it valuable for understanding the landscape of foundational agent-design debates in AI alignment research.
Metadata
Summary
This post documents a substantive dialogue between MIRI and Open Philanthropy researchers comparing decision theories (CDT, EDT, TDT, UDT, FDT) and their relevance to AI alignment. The exchange focuses on whether updateless decision theories outperform updateful variants on key philosophical dilemmas such as counterfactual mugging and Troll Bridge. It serves as a useful reference for understanding where these organizations agree and disagree on foundational decision-theoretic questions.
Key Points
- Clarifies distinctions between CDT, EDT, TDT, UDT, and FDT, providing a structured comparison of major decision theory frameworks.
- Debates whether updateless approaches (UDT, updateless FDT) systematically outperform updateful versions on canonical dilemmas like counterfactual mugging.
- Explores the Troll Bridge problem as a stress test for decision theories, highlighting edge cases where standard frameworks struggle.
- Reflects genuine disagreement between MIRI and Open Philanthropy researchers, making institutional perspectives on foundational AI alignment questions explicit.
- Relevant to AI alignment because the choice of decision theory for AI agents may have significant implications for their behavior in strategic or adversarial settings.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Agent Foundations | Approach | 59.0 |
Cached Content Preview
Collection: Common Crawl
Archive URL: https://web.archive.org/web/20251210051102/https://www.alignmentforum.org/posts/FBbHEjkZzdupcjkna/miri-op-exchange-about-decision-theory-1
MIRI/OP exchange about decision theory — AI Alignment Forum
Tags: Decision theory · Functional Decision Theory · Causal Decision Theory · Embedded Agency · Evidential Decision Theory · AI · Rationality
MIRI/OP exchange about decision theory
by Rob Bensinger
25th Aug 2021
12 min read
5 comments, sorted by top scoring
[-] riceissa · 4y · 8 points
Rob, are you able to disclose why people at Open Phil are interested in learning more decision theory? It seems a little far away from the AI strategy reports they've been publishing in recent years, and it also seemed like they were happy to keep funding MIRI (via their Committee for Effective Altruism Support) despite disagreements about the value of HRAD research, so the sudden interest in decision theory is intriguing.
[-] Joe Carlsmith · 4y · 11 points
Mostly personal interest on my part (I was working on a blog post on the topic, now up), though I do think that the topic has broader relevance.
[-] Ben Pace · 4y · 4 points
I was in the chat and don't have anything especially to "disclose". Joe and Nick are both academic philosophers who've studied at Oxford and been at FHI, with a wide range of interests. And Abram and Scott are naturally great people to chat about decision theory with when they're available.
[-] Daniel Kokotajlo · 4y · 3 points
> My own answer would be the EDT answer: how much does your decision correlate with theirs? Modulated by ad-hoc updatelessness: how much does that correlation change if we forget "some" relevant information? (It usually increases a lot.)
I found this part particularly interesting and would love to see a fleshed-out example of this reasoning so I can understand it better.
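The correlation-based reasoning quoted above can be sketched numerically. The toy function and all numbers below are hypothetical illustrations (not from the post): your causal contribution is 1 vote, plus an expected shift among voters whose decisions correlate with yours, where "updatelessness" shows up as a larger effective correlation once distinguishing information is forgotten.

```python
# Toy EDT-style estimate of expected votes for your candidate, given
# that you vote. Hypothetical model: 1 (your own vote) plus the expected
# shift among n_similar voters whose choices correlate with yours.

def edt_vote_estimate(n_similar: int, correlation: float) -> float:
    """Expected votes for your candidate, conditional on you voting.

    n_similar:    voters reasoning similarly enough to correlate with you
    correlation:  P(they vote | you vote) - P(they vote | you abstain),
                  i.e. how much conditioning on your choice shifts theirs
    """
    return 1 + n_similar * correlation

# Updateful view: conditioning on everything you know that distinguishes
# you from other voters, the residual correlation is tiny.
updateful = edt_vote_estimate(n_similar=10_000, correlation=0.0001)

# Ad-hoc updateless view: forgetting some of that information, you reason
# as a generic member of a reference class, and the correlation rises.
updateless = edt_vote_estimate(n_similar=10_000, correlation=0.05)

print(updateful, updateless)
```

Under these made-up numbers the updateful estimate stays close to the standard answer of 1, while forgetting information inflates it by orders of magnitude, matching the comment's claim that the correlation "usually increases a lot."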
[-] Chris_Leong · 4y* · 1 point
> How would I in principle estimate how many more votes go to my favored presidential candidate in a presidential election (beyond the standard answer of "1")?
I'm happy to see Abram Demski mention this as I've long seen this as a crucial case for trying to understand subjunctive
... (truncated, 24 KB total)