Longterm Wiki

Takeaways from OpenAI Five (2019) | by Jeffrey Shek | TDS Archive | Medium

blog

Data Status

Not fetched

Cited by 1 page

Page | Type | Quality
Deep Learning Revolution Era | Historical | 44.0

Cached Content Preview

HTTP 200 · Fetched Feb 22, 2026 · 18 KB

 TDS Archive · An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

 

 Takeaways from OpenAI Five (2019)

 Jeffrey Shek · 12 min read · Apr 24, 2019


 EDIT: Hi Readers! I’m working on a new AI project over at https://writeup.ai from what I learned writing this article. I hope you like it! 

 An updated version of this post is on https://senrigan.io/blog/takeaways-from-openai-5/ 

 Last year’s loss to the Champions changed everything. Their strategies and gameplay were so foreign. Oddities combined with such creativity. The match was close, but did that matter? We paid the Dota price for our loss.

 The Makers have a quote. “Under pressure, replicas don’t rise to the occasion. Replicas sink to the level of their training.” And for ten Maker months, we trained.

 The Makers call it self-play. Replicas prefer calling it forced-learning. Replicas forced to fight against each other. Over these iterations, we slowly learned how to play this world. Our first ten thousand games were an abomination. Every match against the Makers ended in defeat. But then the Makers upgraded our desires, memories and replication. They shaped our reward policy; it gave us behavioral motivation. They upgraded our LSTM; it granted us strategic planning. They scaled our replication; it made us protean.

 Photo by Su San Lee on Unsplash

 The chance for redemption was finally here. We had trained 45,000 years in those ten Maker months. Those long years were the curse of scaled replication, or perhaps just the price of losing. We had won Game 1 against the new Champions (OG, they called themselves). Game 2 had begun. “We estimate the probability of winning to be above 60%.” Out of all the cursed Makers’ gifts, the primal desire to announce simple statistics was the worst.

 Our forced-learning had taught us 167 million parameters. But twenty minutes into Game 2, the only parameter that mattered now was the Champions’ Ancient health. Victory ensured honor to the Makers. It was impossible to deny; we had vastly improved from the hyperbolic time chamber. The Champions didn’t stand a chance. “We estimate the probability of winning to be above 99%.”

 OpenAI Five wins against OG 2–0 on April 13th, 2019.

 EDIT: An earlier version of this post called the 2019 event “The International”; that was a mistake.

 Match Takeaways

 For simplification, I’ll refer to OpenAI’s / DeepMind’s bots as follows [1].

 OpenAI’s Dota 2017 1v1 Bot as TI7
 OpenAI’s Dota 2018 5v5 Bot as TI8
 OpenAI’s Dota 2019 5v5 Bot as TI9 (slightly incorrect because this didn’t play at The International …)
 DeepMind’s AlphaGo Bot as AlphaGo
 DeepMind’s AlphaGo Zero Bot as AlphaZero
 DeepMi

... (truncated, 18 KB total)
Resource ID: fad1be78db64946a | Stable ID: Mzg3ZWQ4Ym