
On Deference and Yudkowsky's AI Risk Estimates


Author

bmg

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: EA Forum

An EA Forum discussion post exploring the epistemics of deferring to Eliezer Yudkowsky's AI doom estimates, relevant to how the AI safety community forms and updates beliefs about existential risk probabilities.

Forum Post Details

Karma
291
Comments
194
Forum
EA Forum
Forum Tags
AI safety, Building effective altruism, Forecasting, Epistemic deference, AI alignment, AI forecasting, Eliezer Yudkowsky, Risk assessment, Criticism and Red Teaming Contest, Criticism of work in effective altruism

Metadata

Importance: 45/100 · blog post · commentary

Summary

This EA Forum post examines whether and how much to defer to Eliezer Yudkowsky's high probability estimates of AI-caused human extinction, exploring the epistemics of expert deference in AI safety contexts. It discusses the tension between independent reasoning and deferring to domain experts when assessing existential risk from advanced AI.

Key Points

  • Explores the epistemic question of how much weight to give Yudkowsky's high p(doom) estimates versus forming independent views
  • Discusses the philosophical challenges of deference: when to trust experts vs. when to reason independently about AI risk
  • Examines Yudkowsky's track record and credibility as a forecaster and AI safety researcher
  • Considers whether the EA community over- or under-defers to prominent figures on existential risk questions
  • Addresses the difficulty of calibrating beliefs about unprecedented, low-frequency catastrophic events

Cited by 1 page

Page | Type | Quality
Why Alignment Might Be Hard | Argument | 69.0

Cached Content Preview

HTTP 200 · Fetched Apr 10, 2026 · 34 KB
# On Deference and Yudkowsky's AI Risk Estimates
By bmg
Published: 2022-06-19
_Note: I mostly wrote this post after Eliezer Yudkowsky’s “[Death with Dignity](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy)” essay appeared on LessWrong. Since then, Jotto has written [a post](https://www.lesswrong.com/posts/ZEgQGAjQm5rTAnGuM/beware-boasting-about-non-existent-forecasting-track-records) that overlaps a bit with this one, which sparked an extended discussion in the comments. You may want to look at that discussion as well. See also, [here](https://www.lesswrong.com/posts/8NKu9WES7KeKRWEKK/why-all-the-fuss-about-recursive-self-improvement?commentId=KzAay2cc6iSQRJyJA), for another relevant discussion thread._

_EDIT: See [here](https://forum.effectivealtruism.org/posts/NBgpPaz5vYe3tH4ga/on-deference-and-yudkowsky-s-ai-risk-estimates?commentId=pHpPiYEJy6PKgXwpC) for some post-discussion reflections on what I think this post got right and wrong._

# Introduction

Most people, when forming their own views on [risks from misaligned AI](https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment), have some inclination to defer to others who they respect or think of as experts.

This is a reasonable thing to do, especially if you don’t yet know much about AI or haven’t yet spent much time scrutinizing the arguments. If someone you respect has spent years thinking about the subject, and believes the risk of catastrophe is very high, then you probably should take that information into account when forming your own views.

It’s understandable, then, if Eliezer Yudkowsky’s recent writing on AI risk helps to really freak some people out. Yudkowsky has probably spent more time thinking about AI risk than anyone else. Along with Nick Bostrom, he is the person most responsible for developing and popularizing these concerns. Yudkowsky has now begun to publicly express the view that misaligned AI has a virtually 100% chance of killing everyone on Earth - such that all we can hope to do is “[die with dignity](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy).”

_The purpose of this post is, simply, to argue that people should be wary of deferring too much to Eliezer Yudkowsky, specifically, when it comes to estimating AI risk._^[To be clear, Yudkowsky isn’t _asking_ other people to defer to him. He’s spent a huge amount of time outlining his views (allowing people to evaluate them on their merits) and has often [expressed concerns](https://www.lesswrong.com/posts/svoD5KLKHyAKEdwPo/against-modest-epistemology) about excessive epistemic deference.] In particular, I think, they shouldn’t defer to him more than they would defer to anyone else who is smart and has spent a large amount of time thinking about AI risk.^[A better, but still far-from-optimal approach to deference might be to give a lot of weight to the "averag

... (truncated, 34 KB total)
Resource ID: e1fe34e189cc4c55 | Stable ID: sid_hgNnhLInxR