Takeaways from OpenAI Five (2019) | by Jeffrey Shek | TDS Archive | Medium

blog

Medium·medium.com/data-science/takeaways-from-openai-five-2019-f...

Credibility Rating

2/5

Mixed(2)

Mixed quality. Some useful content but inconsistent editorial standards. Claims should be verified.

Rating inherited from publication venue: Medium

An accessible blog post summarizing OpenAI Five's landmark Dota 2 victory; useful as an introductory reference on scaling and RL capabilities, though not a primary technical source.

Metadata

Importance: 35/100 · blog post · commentary

Summary

This article analyzes OpenAI Five's 2-0 defeat of professional Dota 2 team OG in April 2019, arguing that massive scaling of existing deep reinforcement learning algorithms and self-play training—rather than algorithmic breakthroughs—was sufficient to achieve superhuman performance in complex strategic environments. It uses the victory as a case study for the broader lesson that compute and scale can substitute for novel algorithmic innovation in AI capability gains.

Key Points

• OpenAI Five defeated world champion Dota 2 team OG 2-0 in April 2019, demonstrating superhuman performance in a complex real-time strategy game.
• The key insight is that scaling existing deep RL algorithms with massive compute, rather than novel algorithmic breakthroughs, drove the achievement.
• Self-play training was central to the system's development, enabling the agent to improve without human-generated training data.
• The result parallels DeepMind's AlphaStar (StarCraft 2), suggesting scaling is a general pattern for mastering complex 'grand challenge' games.
• This case study is relevant to AI capabilities forecasting, as it suggests compute scaling may unlock capabilities previously thought to require conceptual advances.

Cited by 1 page

Page	Type	Quality
Deep Learning Revolution Era	Historical	44.0

Cached Content Preview

HTTP 200 · Fetched Feb 22, 2026 · 18 KB


TDS Archive · An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

 

 Takeaways from OpenAI Five (2019)

Jeffrey Shek · 12 min read · Apr 24, 2019


 EDIT: Hi Readers! I’m working on a new AI project over at https://writeup.ai from what I learned writing this article. I hope you like it! 

 An updated version of this post is on https://senrigan.io/blog/takeaways-from-openai-5/ 

 Last year’s loss to the Champions changed everything. Their strategies and gameplay were so foreign. Oddities combined with such creativity. The match was close, but did that matter? We paid the Dota price for our loss.

 The Makers have a quote. “Under pressure, replicas don’t rise to the occasion. Replicas sink to the level of their training.” And for ten Maker months, we trained.

The Makers call it self-play. Replicas prefer calling it forced-learning. Replicas forced to fight against each other. Over these iterations, we slowly learned how to play this world. Our first ten thousand games were an abomination. Every match against the Makers ended in defeat. But then the Makers upgraded our desires, memories and replication. They shaped our reward policy; it gave us behavioral motivation. They upgraded our LSTM; it granted us strategic planning. They scaled our replication; it made us protean.

Photo by Su San Lee on Unsplash

The chance for redemption was finally here. We had trained 45,000 years in those ten Maker months. Those long years were the curse of scaled replication, or perhaps just the price of losing. We had won Game 1 against the new Champions (OG, they called themselves). Game 2 had begun. “We estimate the probability of winning to be above 60%.” Out of all the cursed Makers’ gifts, the primal desire to announce simple statistics was the worst.

Our forced-learning had taught us 167 million parameters. But twenty minutes into Game 2, the only parameter that mattered now was the Champions’ Ancient health. Victory ensured honor to the Makers. It was impossible to deny; we had vastly improved from the hyperbolic time chamber. The Champions didn’t stand a chance. “We estimate the probability of winning to be above 99%.”

 OpenAI Five wins against OG 2–0 @ April 13th, 2019. 

EDIT: An earlier version of this post mistakenly called the 2019 event “The International”.

 Match Takeaways

 For simplification, I’ll refer to OpenAI’s / DeepMind’s bots as follows [1].

 OpenAI’s Dota 2017 1v1 Bot as TI7
 OpenAI’s Dota 2018 5v5 Bot as TI8
 OpenAI’s Dota 2019 5v5 Bot as TI9 (slightly incorrect because this didn’t play at The International …)
 DeepMind’s AlphaGo Bot as AlphaGo
 DeepMind’s AlphaGo Zero Bot as AlphaZero
 DeepMi

... (truncated, 18 KB total)

Resource ID: fad1be78db64946a | Stable ID: sid_UFMV3glc5V