Radford et al., "Learning to Generate Reviews and Discovering Sentiment" (OpenAI, 2017).
web
Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: OpenAI
An early influential result showing that large neural networks trained on prediction tasks can spontaneously develop interpretable internal features; relevant to mechanistic interpretability research and debates about emergent representations in AI systems.
Metadata
Summary
Radford et al. trained a multiplicative LSTM on 82 million Amazon reviews to predict the next character and found that, without any sentiment labels, the model learned a single "sentiment neuron" that is highly predictive of sentiment. The resulting representation reached 91.8% accuracy on the Stanford Sentiment Treebank, state of the art at the time, and matched fully supervised systems with 30-100x fewer labeled examples, suggesting that large neural networks can spontaneously develop interpretable internal representations.
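To make the "single unit predicts sentiment from few labels" idea concrete, here is a minimal sketch on synthetic data. It is not the paper's model or its classifier: the feature vectors are randomly generated stand-ins for hidden states, the width `NUM_UNITS` and index `SENTIMENT_UNIT` are hypothetical, and the probe simply picks the unit with the largest class-mean gap and thresholds it.

```python
import random

NUM_UNITS = 8        # hypothetical hidden-state width (a real model is far larger)
SENTIMENT_UNIT = 3   # hypothetical index of the unit that tracks sentiment


def fake_hidden_state(label, rng):
    """Return a synthetic feature vector for a review labeled 0 or 1."""
    state = [rng.uniform(-1.0, 1.0) for _ in range(NUM_UNITS)]
    # Only one unit carries the label: it is shifted by +2 or -2.
    state[SENTIMENT_UNIT] += 2.0 if label == 1 else -2.0
    return state


def class_mean(states, labels, unit, target):
    """Mean activation of one unit over the examples with the given label."""
    values = [s[unit] for s, y in zip(states, labels) if y == target]
    return sum(values) / len(values)


def fit_single_unit_probe(states, labels):
    """Pick the unit with the largest class-mean gap; threshold at the midpoint."""
    best = (0, 0.0, 1.0)  # (unit, threshold, sign)
    best_gap = -1.0
    for unit in range(NUM_UNITS):
        pos = class_mean(states, labels, unit, 1)
        neg = class_mean(states, labels, unit, 0)
        gap = abs(pos - neg)
        if gap > best_gap:
            best_gap = gap
            best = (unit, (pos + neg) / 2.0, 1.0 if pos > neg else -1.0)
    return best


def predict(state, unit, threshold, sign):
    """Classify by which side of the threshold the chosen unit falls on."""
    return 1 if sign * (state[unit] - threshold) > 0 else 0


rng = random.Random(0)

# A deliberately tiny labeled set: 20 examples, balanced between classes.
train_labels = [i % 2 for i in range(20)]
train_states = [fake_hidden_state(y, rng) for y in train_labels]
unit, threshold, sign = fit_single_unit_probe(train_states, train_labels)

# Evaluate on a larger held-out synthetic set.
test_labels = [i % 2 for i in range(500)]
test_states = [fake_hidden_state(y, rng) for y in test_labels]
correct = sum(
    predict(s, unit, threshold, sign) == y
    for s, y in zip(test_states, test_labels)
)
accuracy = correct / len(test_labels)
print(f"selected unit: {unit}, held-out accuracy: {accuracy:.3f}")
```

Because the synthetic data is built so that one unit separates the classes cleanly, the probe recovers that unit from a handful of labels. On real hidden states the separation would be noisier, and a regularized linear classifier over all units would be the more natural choice.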
Key Points
- A multiplicative LSTM trained purely on next-character prediction developed, without explicit supervision, a single neuron that encodes almost all of the sentiment signal.
- With a simple linear model on top, the learned representation achieved 91.8% accuracy on the Stanford Sentiment Treebank, surpassing the previous best of 90.2%.
- The model can match supervised baselines with only 11 to 232 labeled examples, demonstrating extreme label efficiency via unsupervised pretraining.
- The sentiment neuron can be manually overwritten to steer generated text toward positive or negative sentiment.
- The authors hypothesize that this emergent interpretability is a general property of large neural networks trained to predict sequential inputs, not something specific to their architecture.
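The steering result, overwriting one unit during generation, can be illustrated with a toy recurrent generator. This is a sketch under heavy assumptions, not the paper's model: the vocabulary, the update rule in `step`, the readout in `next_word`, and the index `SENTIMENT_UNIT` are all hypothetical. The only part meant to mirror the idea is the clamp, which overwrites one hidden unit after every update.

```python
import math

HIDDEN_SIZE = 4
SENTIMENT_UNIT = 2  # hypothetical index of the unit being overwritten

# Hypothetical toy vocabulary: each word has a fixed polarity.
VOCAB = {"great": 1.0, "fine": 0.3, "poor": -0.4, "awful": -1.0}


def step(hidden, word):
    """Toy recurrent update: decay the state and mix in the word's polarity."""
    polarity = VOCAB[word]
    return [math.tanh(0.5 * h + 0.5 * polarity) for h in hidden]


def next_word(hidden):
    """Greedy readout: pick the word whose polarity best matches the unit."""
    unit = hidden[SENTIMENT_UNIT]
    return max(VOCAB, key=lambda w: VOCAB[w] * unit)


def generate(start_word, length, clamp_value=None):
    """Generate `length` words; optionally overwrite the unit at each step."""
    hidden = [0.0] * HIDDEN_SIZE
    word = start_word
    output = [start_word]
    for _ in range(length):
        hidden = step(hidden, word)
        if clamp_value is not None:
            # The intervention: replace the unit's value before the readout.
            hidden[SENTIMENT_UNIT] = clamp_value
        word = next_word(hidden)
        output.append(word)
    return output


print(generate("poor", 3))                    # unclamped: follows the prompt
print(generate("poor", 3, clamp_value=1.0))   # clamped toward positive
print(generate("great", 3, clamp_value=-1.0)) # clamped toward negative
```

Left alone, the toy generator continues in the polarity of its starting word. With the unit clamped, the continuation follows the clamp value regardless of the prompt, which is the qualitative behavior the key point describes.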
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Large Language Models | Capability | 60.0 |
370658949c0fdca7 | Stable ID: NTk2MGRjZT