Adversarial Policies Beat Superhuman Go AIs | FAR.AI
webData Status
Not fetched
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| FAR AI | Organization | 76.0 |
Cached Content Preview
HTTP 200 | Fetched Feb 22, 2026 | 3 KB
January 9, 2023
Tony Wang
Adam Gleave
Tom Tseng
Kellin Pelrine
Nora Belrose
Joseph Miller
Michael D. Dennis
Yawen Duan
Viktor Pogrebniak
Sergey Levine
Stuart Russell
Abstract
We attack the state-of-the-art Go-playing AI system, KataGo, by training adversarial policies that play against frozen KataGo victims. Our attack achieves a >99% win rate when KataGo uses no tree-search, and a >77% win rate when KataGo uses enough search to be superhuman. Notably, our adversaries do not win by learning to play Go better than KataGo -- in fact, our adversaries are easily beaten by human amateurs. Instead, our adversaries win by tricking KataGo into making serious blunders. Our results demonstrate that even superhuman AI systems may harbor surprising failure modes. Example games are available at [goattack.far.ai](https://goattack.far.ai).
... (truncated, 3 KB total)
Resource ID: 90d7373e3af4f090 | Stable ID: YjA3NDE4N2