Visualizing the deep learning revolution | by Richard Ngo | Medium
Visualizing the deep learning revolution
Visualizing the deep learning revolution
Richard Ngo · 14 min read · Jan 5, 2023
The field of AI has undergone a revolution over the last decade, driven by the success of deep learning techniques. This post aims to convey three ideas using a series of illustrative examples:
There have been huge jumps in the capabilities of AIs over the last decade, to the point where it’s becoming hard to specify tasks that AIs can’t do.
This progress has been primarily driven by scaling up a handful of relatively simple algorithms (rather than by developing a more principled or scientific understanding of deep learning).
Very few people predicted that progress would be anywhere near this fast, but many of those who did also predicted that we might face existential risk from AGI in the coming decades.
I’ll focus on four domains: vision, games, language-based tasks, and science. The first two have more limited real-world applications, but provide particularly graphic and intuitive examples of the pace of progress.
Vision
Image recognition
Image recognition has been a focus of AI for many decades. Early research concentrated on simple domains like handwriting; performance has since improved dramatically, surpassing humans on many datasets. However, benchmark scores are hard to interpret intuitively, so we'll focus on domains where progress can be visualized more easily.
Image generation
In 2014, AI image generation advanced significantly with the introduction of Generative Adversarial Networks (GANs). However, the first GANs could only generate very simple or blurry images, like the ones below.
[Figure: images with yellow borders are real; all others are GAN-generated.]

Over the next eight years, image generation progressed at a very rapid rate; the figure below shows images generated by state-of-the-art systems in each year. Over the last two years in particular, these systems made a lot of progress in generating complex creative scenes in response to language prompts.
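The GAN setup mentioned above pits two networks against each other: a generator that produces images and a discriminator that tries to tell real images from generated ones. As a minimal sketch of the two objectives (the function names and the use of plain numpy are my own; the post itself doesn't include code), each side minimizes a binary cross-entropy loss:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(x) -> 1 on real images
    and D(G(z)) -> 0 on generated ones."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def generator_loss(d_fake):
    """Non-saturating generator loss: the generator
    wants the discriminator to output D(G(z)) -> 1."""
    return -np.log(d_fake).mean()

# A confident, accurate discriminator incurs low loss:
print(discriminator_loss(np.array([0.95]), np.array([0.05])))
# A discriminator fooled into D(G(z)) = 0.9 means the generator is doing well:
print(generator_loss(np.array([0.9])))
```

Training alternates gradient steps on these two losses, so improvements in either network pressure the other to improve, which is what drives generated images toward realism.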
[Figure: state-of-the-art generated images from each year.]

This is an astounding rate of progress. What drove it? In part, it was the development of new algorithms, most notably GANs, transformers and diffusion models. However, the key underlying factor was scaling up the amount of compute and data used during training. One demonstration of this comes from the Parti series of image models, which includes four networks of different sizes (with parameter counts ranging from 350 million to 20 billion). Although they were all trained in the same way, for the three prompts below you can clearly see how much better the bigger models are than the smaller ones (e.g. in the gradually emerging ability to render text).
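To get a feel for what a 350-million-to-20-billion parameter range means, here is a rough back-of-the-envelope sketch. The 12·d² per-layer heuristic is a common rule of thumb for decoder-style transformers, and the specific width/depth/vocabulary values below are illustrative assumptions of mine, not Parti's actual architecture:

```python
def approx_transformer_params(d_model, n_layers, vocab_size=32_000):
    """Rough parameter count for a decoder-style transformer:
    each layer contributes ~12 * d_model**2 weights (attention + MLP),
    plus a vocab_size x d_model embedding table."""
    return 12 * d_model**2 * n_layers + vocab_size * d_model

# Two hypothetical configurations near the ends of the range:
small = approx_transformer_params(d_model=1024, n_layers=24)   # ~0.33 billion
large = approx_transformer_params(d_model=4096, n_layers=96)   # ~19.5 billion
print(f"small: {small/1e9:.2f}B, large: {large/1e9:.2f}B")
```

The point of the arithmetic: a ~4x wider, ~4x deeper network has roughly 50-60x more parameters, so "scaling up" quickly spans orders of magnitude without any change to the training recipe.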