Radford et al., "Improving Language Understanding by Generative Pre-Training" (OpenAI, 2018).
Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: OpenAI
GPT-1 is the foundational paper that launched the GPT series; understanding it is essential context for evaluating the capabilities trajectory of large language models and their AI safety implications.
Metadata
Summary
This paper introduces GPT-1, demonstrating that generative pre-training of a language model on large unlabeled text corpora followed by discriminative fine-tuning on specific tasks yields strong performance across diverse NLP benchmarks. It established the foundational paradigm of unsupervised pre-training plus supervised fine-tuning that underpins modern large language models. The work showed that transformer-based models can learn general-purpose language representations transferable to downstream tasks with minimal task-specific architecture changes.
Key Points
- Introduced GPT-1: a transformer decoder pre-trained with a language modeling objective on BooksCorpus (~7,000 books).
- Demonstrated that generative pre-training followed by task-specific fine-tuning outperforms task-specific architectures on 9 of 12 NLP benchmarks.
- Established the foundational pre-train/fine-tune paradigm that directly led to GPT-2, GPT-3, and subsequent large language models.
- Showed that unsupervised pre-training enables models to learn useful linguistic and world-knowledge representations without labeled data.
- Highlighted the importance of model scale and training-data diversity for transferable language representations.
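The pre-train/fine-tune recipe above can be sketched numerically. During fine-tuning, the paper minimizes the supervised task loss plus an auxiliary language-modeling loss weighted by a coefficient λ (the paper uses λ = 0.5). The function and variable names below are illustrative, not from the paper's code:

```python
import numpy as np

def lm_loss(logits, targets):
    # Cross-entropy loss: -mean log softmax probability of the target ids.
    # logits: (T, V) array of scores; targets: (T,) array of target token ids.
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def fine_tune_loss(task_logits, label, token_logits, token_targets, lam=0.5):
    # Combined objective: task classification loss + lam * auxiliary LM loss.
    task = lm_loss(task_logits[None, :], np.array([label]))
    return task + lam * lm_loss(token_logits, token_targets)

# Toy example: a 3-class task head and a 5-token sequence over a 10-word vocab.
rng = np.random.default_rng(0)
loss = fine_tune_loss(
    rng.normal(size=3), 1,
    rng.normal(size=(5, 10)), rng.integers(0, 10, size=5),
)
```

The auxiliary LM term keeps the fine-tuned model close to its pre-trained behavior, which the paper reports improves generalization on larger datasets.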
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Large Language Models | Capability | 60.0 |