Longterm Wiki

ImageNet Classification with Deep CNNs


AlexNet is widely considered the paper that launched the modern deep learning era; relevant to AI safety discussions about rapid capability jumps, scaling laws, and the difficulty of anticipating transformative AI progress.

Metadata

Importance: 72/100 · conference paper · primary source

Summary

This landmark 2012 paper by Krizhevsky, Sutskever, and Hinton introduced AlexNet, a deep convolutional neural network that dramatically outperformed prior methods on the ImageNet Large Scale Visual Recognition Challenge. It demonstrated that deep CNNs trained on GPUs could achieve state-of-the-art image classification, catalyzing the modern deep learning revolution. The techniques introduced—ReLU activations, dropout regularization, and GPU training—became foundational to subsequent AI progress.

Key Points

  • AlexNet achieved top-5 error of 15.3% on ImageNet 2012, far surpassing the runner-up at 26.2%, demonstrating a qualitative leap in vision capabilities.
  • Introduced or popularized key architectural innovations: ReLU activations, dropout regularization, data augmentation, and multi-GPU training.
  • Marked the beginning of the modern deep learning era, directly inspiring rapid capability scaling across vision, NLP, and other domains.
  • Demonstrated that increased compute (GPU training) combined with larger datasets could unlock qualitatively superior AI performance.
  • Highly relevant to AI safety as a case study in rapid, unexpected capability jumps that outpaced theoretical understanding.
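Two of the innovations listed above, ReLU activations and dropout, are simple enough to state directly. A minimal NumPy sketch for illustration (these helper functions are ours, not the paper's GPU implementation):

```python
import numpy as np

def relu(x):
    # Non-saturating activation used by AlexNet: f(x) = max(0, x).
    return np.maximum(0, x)

def dropout(x, p=0.5, rng=None, train=True):
    # Dropout as described in the paper: each unit is zeroed with
    # probability p during training; at test time, outputs are scaled
    # by (1 - p) to match the expected training-time activation.
    # (Modern "inverted" dropout scales during training instead.)
    rng = np.random.default_rng() if rng is None else rng
    if train:
        mask = rng.random(x.shape) >= p
        return x * mask
    return x * (1 - p)
```

Both operations are elementwise and cheap; the paper's contribution was showing that they make very large networks trainable (ReLU) and less prone to overfitting (dropout) at ImageNet scale.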

Cited by 1 page

Page            | Type   | Quality
Geoffrey Hinton | Person | 42.0

Cached Content Preview

HTTP 200 · Fetched Apr 10, 2026 · 1 KB
ImageNet Classification with Deep Convolutional Neural Networks 
 Abstract

We trained a large, deep convolutional neural network to classify the 1.3 million high-resolution images in the LSVRC-2010 ImageNet training set into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 39.7% and 18.9% which is considerably better than the previous state-of-the-art results. The neural network, which has 60 million parameters and 500,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and two globally connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of convolutional nets. To reduce overfitting in the globally connected layers we employed a new regularization method that proved to be very effective.
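The top-1 and top-5 error rates quoted in the abstract count a test image as correct if the true label is, respectively, the single highest-scoring class or among the five highest-scoring classes. A short NumPy sketch of this metric (our own helper for illustration, not code from the paper):

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    # scores: (n_samples, n_classes) array of class scores;
    # labels: (n_samples,) array of true class indices.
    # A sample counts as correct if its true label is among the
    # k highest-scoring classes for that sample.
    topk = np.argsort(scores, axis=1)[:, -k:]
    hit = (topk == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()
```

With k=1 this gives the top-1 error; with k=5 the top-5 error used as the headline ImageNet metric.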
