
[2007.05558] The Computational Limits of Deep Learning

paper

Authors

Neil Thompson · Kristjan Greenewald · Keeheon Lee · Gabriel F. Manso

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

Foundational analysis of the computational scaling requirements behind deep learning progress, arguing that current trends are economically and environmentally unsustainable. This directly informs discussions of AI capability development and the resource constraints relevant to AI safety planning.

Paper Details

Citations
122
12 influential
Year
2020
Methodology
peer-reviewed
Categories
Ninth Computing within Limits 2023

Metadata

arXiv preprint · analysis

Summary

This paper by Thompson et al. documents deep learning's heavy dependence on computational power for recent progress in applications such as Go, image classification, and translation. The authors show that progress across diverse domains is strongly correlated with increases in computing resources and argue that extrapolating current trends reveals this reliance is becoming economically, technically, and environmentally unsustainable. They conclude that continued progress requires either dramatically more computationally efficient deep learning methods or a shift toward alternative machine learning approaches.

Cited by 1 page

Page | Type | Quality
Deep Learning Revolution Era | Historical | 44.0

Cached Content Preview

HTTP 200 · Fetched Apr 9, 2026 · 98 KB
 The Computational Limits of Deep Learning

 
 
Neil C. Thompson¹*, Kristjan Greenewald², Keeheon Lee³, Gabriel F. Manso⁴
 
¹ MIT Computer Science and A.I. Lab, MIT Initiative on the Digital Economy, Cambridge, MA, USA
² MIT-IBM Watson AI Lab, Cambridge, MA, USA
³ Underwood International College, Yonsei University, Seoul, Korea
⁴ FGA, University of Brasilia, Brasilia, Brazil

* To whom correspondence should be addressed; e-mail: neil_t@mit.edu
 
 

 
 Abstract

 Deep learning’s recent history has been one of achievement: from triumphing over humans in the game of Go to world-leading performance in image classification, voice recognition, translation, and other tasks. But this progress has come with a voracious appetite for computing power. This article catalogs the extent of this dependency, showing that progress across a wide variety of applications is strongly reliant on increases in computing power. Extrapolating forward this reliance reveals that progress along current lines is rapidly becoming economically, technically, and environmentally unsustainable. Thus, continued progress in these applications will require dramatically more computationally-efficient methods, which will either have to come from changes to deep learning or from moving to other machine learning methods.

 
 
Keywords: Deep Learning · Computing Power · Computational Burden · Scaling · Machine Learning

 
 
 
 1 Introduction

 
 In this article, we present a comprehensive meta-analysis of how deep learning progress depends on growing computational power and use this to understand not just how particular models scale, but how the field as a whole does. Our analysis differs from previous ones in that we are (i) more precise in the models we compare than are many high-level historical analyses, which allows us to better understand how performance changes as computing scales up, and (ii) better able to account for innovation in the field than estimates where researchers have tested scaling by varying the compute used in training their own models.

 
 
To understand scaling in deep learning, we analyze 1,527 research papers found in the arXiv pre-print repository, as well as other sources, in the domains of image classification, object detection, question answering, named entity recognition, machine translation, speech recognition, face detection, image generation, and pose estimation. We find that computational requirements have escalated dramatically and that increases in computing power have been central to performance improvements.
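The scaling relationship at the heart of this meta-analysis can be made concrete with a short sketch. The snippet below is a minimal, hypothetical illustration (not the authors' code, and all data points are invented placeholders rather than the paper's data): it fits a power law between error rate and training compute by linear regression in log-log space, then extrapolates the compute needed to halve the best observed error, showing why modest accuracy gains can demand enormous compute increases.

```python
# A minimal, hypothetical sketch (not the authors' code) of the kind of
# log-log regression used to relate model performance to training compute.
# All numbers below are illustrative placeholders, not the paper's data.
import numpy as np

# Hypothetical (training compute in FLOPs, error rate) pairs, standing in
# for results collected from published papers in one benchmark domain.
compute = np.array([1e16, 1e17, 1e18, 1e19, 1e20])
error = np.array([0.30, 0.24, 0.19, 0.155, 0.125])

# A power law, error = a * compute^b, is linear in log-log space:
#   log10(error) = log10(a) + b * log10(compute)
b, log_a = np.polyfit(np.log10(compute), np.log10(error), 1)
print(f"estimated scaling exponent b = {b:.3f}")

# Extrapolation illustrates the paper's core concern: halving the error of
# the best observed model requires far more than double the compute.
target_error = error[-1] / 2
needed = 10 ** ((np.log10(target_error) - log_a) / b)
print(f"compute to halve error ~ {needed:.2e} FLOPs "
      f"({needed / compute[-1]:,.0f}x the largest observed)")
```

With these placeholder values the fitted exponent is small in magnitude, so halving the error requires roughly three orders of magnitude more compute than the largest observed model, which is the flavor of extrapolation behind the paper's sustainability argument.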

 
 
 This finding has important public policy implications: if current trends continue, the growing “computational burden” of deep learning will rapidly become technically and economically prohibitive. Such a rapid escalation in computing needed also implies alarming growth

... (truncated, 98 KB total)
Resource ID: 676a61fd3e474b32 | Stable ID: sid_giii8XNe6q