Zoom In: An Introduction to Circuits
web · distill.pub · distill.pub/2020/circuits/zoom-in/
A seminal Distill.pub paper by Olah et al. (OpenAI, 2020) that launched the 'circuits' research thread, widely considered foundational reading for mechanistic interpretability research in AI safety.
Metadata
Importance: 90/100 · blog post · primary source
Summary
This foundational Distill article introduces the 'circuits' framework for neural network interpretability, arguing that by studying connections between neurons we can reverse-engineer meaningful algorithms in neural network weights. It proposes three speculative claims: that features are the fundamental units of neural networks, that features are connected by circuits, and that similar features and circuits recur across different models and tasks.
Key Points
- Introduces the 'circuits' approach to mechanistic interpretability, treating neural networks as reverse-engineerable computational systems with meaningful internal structure.
- Proposes that neural networks contain interpretable 'features' (e.g., curve detectors, high-low frequency detectors) as fundamental units of computation.
- Argues that features are connected by 'circuits': subgraphs of the network that implement identifiable algorithms.
- Claims universality: similar features and circuits appear across different architectures and tasks, suggesting convergent computational solutions.
- Uses the analogy of scientific 'zooming in' (microscopes→cells, crystallography→DNA) to frame mechanistic interpretability as a paradigm shift in understanding AI.
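As a rough illustration of the "features connected by circuits" idea in the key points, the toy sketch below treats a circuit as the sub-graph of strong weights feeding one downstream feature. The feature names, weight values, threshold, and the helpers `extract_circuit` and `activation` are all hypothetical and chosen for illustration; none of them come from the article or from InceptionV1.

```python
# Toy illustration only: hypothetical feature names and made-up weights,
# not values taken from the article or from any real network.

# Weights between named features in two adjacent layers: weights[src][dst].
weights = {
    "edge_horizontal": {"curve": 0.8, "fur": 0.1},
    "edge_diagonal": {"curve": 0.9, "fur": -0.2},
    "color_contrast": {"curve": 0.0, "fur": 0.7},
}


def extract_circuit(weights, target, threshold=0.5):
    """Return the edges into `target` whose absolute weight is >= threshold."""
    return {
        src: outgoing[target]
        for src, outgoing in weights.items()
        if abs(outgoing.get(target, 0.0)) >= threshold
    }


def activation(circuit, inputs):
    """ReLU of the weighted sum of upstream activations within the circuit."""
    total = sum(w * inputs.get(src, 0.0) for src, w in circuit.items())
    return max(0.0, total)


# The "curve" circuit keeps only the two edge features with strong weights.
curve_circuit = extract_circuit(weights, "curve")
print(curve_circuit)
print(activation(curve_circuit, {"edge_horizontal": 1.0, "edge_diagonal": 1.0}))
```

Reading the surviving edges and their signs is a much simplified stand-in for the kind of weight inspection the circuits approach describes; real analyses work with convolutional weights and visualized features rather than a hand-written dictionary.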
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Chris Olah | Person | 27.0 |
| Mechanistic Interpretability | Research Area | 59.0 |
Cached Content Preview
HTTP 200 · Fetched Apr 9, 2026 · 61 KB
Zoom In: An Introduction to Circuits
Distill
{ "title": "Zoom In: An Introduction to Circuits", "description": "By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.",
"authors": [
{ "author": "Chris Olah", "authorURL": "https://colah.github.io", "affiliation": "OpenAI", "affiliationURL": "https://openai.com" },
{ "author": "Nick Cammarata", "authorURL": "http://nickcammarata.com", "affiliation": "OpenAI", "affiliationURL": "https://openai.com" },
{ "author": "Ludwig Schubert", "authorURL": "https://schubert.io/", "affiliation": "OpenAI", "affiliationURL": "https://openai.com" },
{ "author": "Gabriel Goh", "authorURL": "http://gabgoh.github.io", "affiliation": "OpenAI", "affiliationURL": "https://openai.com" },
{ "author": "Michael Petrov", "authorURL": "https://twitter.com/mpetrov", "affiliation": "OpenAI", "affiliationURL": "https://openai.com" },
{ "author": "Shan Carter", "authorURL": "http://shancarter.com", "affiliation": "OpenAI", "affiliationURL": "https://openai.com" }
] }
| Authors | Affiliations |
|---|---|
| Chris Olah | OpenAI |
| Nick Cammarata | OpenAI |
| Ludwig Schubert | OpenAI |
| Gabriel Goh | OpenAI |
| Michael Petrov | OpenAI |
| Shan Carter | OpenAI |
Published: March 10, 2020
DOI: 10.23915/distill.00024.001
This article is part of the Circuits thread, an experimental format collecting invited short articles and critical commentary delving into the inner workings of neural networks.
Circuits Thread
An Overview of Early Vision in InceptionV1
Introduction
Many important transition points in the history of science have been moments when science “zoomed in.”
At these points, we develop a visualization or tool that allows us to see the world in a new level of detail, and a new field of science develops to study the world through this lens.
For example, microscopes let us see cells, leading to cellular biology. Science zoomed in. Several techniques including x-ray crystallography let us see DNA, leading to the molecular revolution. Science zoomed in. Atomic theory. Subatomic particles. Neuroscience. Science zoomed in.
These transitions weren’t just a change in precision: they were qualitative changes in what the object
... (truncated, 61 KB total)
Resource ID: 346b1574c0c3ce67 | Stable ID: sid_DlMLRNm3TO