A Mathematical Framework
Credibility Rating
4/5
High (4): High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Transformer Circuits
Data Status
Not fetched
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Is Interpretability Sufficient for Safety? | Crux | 49.0 |
| AI Alignment | Approach | 91.0 |
Cached Content Preview
HTTP 200 | Fetched Feb 26, 2026 | 98 KB
[Transformer Circuits Thread](https://transformer-circuits.pub/)
# A Mathematical Framework for Transformer Circuits
### Authors
[Nelson Elhage∗†](https://nelhage.com/),[Neel Nanda∗](https://www.neelnanda.io/),Catherine Olsson∗,[Tom Henighan†](https://tomhenighan.com/),Nicholas Joseph†,[Ben Mann†](https://benjmann.net/),Amanda Askell,Yuntao Bai,Anna Chen,Tom Conerly,Nova DasSarma,Dawn Drain,Deep Ganguli,[Zac Hatfield-Dodds](https://zhd.dev/),Danny Hernandez,Andy Jones,Jackson Kernion,Liane Lovitt,Kamal Ndousse,Dario Amodei,Tom Brown,Jack Clark,Jared Kaplan,Sam McCandlish,[Chris Olah‡](https://colah.github.io/)
### Affiliation
[Anthropic](https://www.anthropic.com/)
### Published
Dec 22, 2021
\* Core Research Contributor; † Core Infrastructure Contributor; ‡ Correspondence to [colah@anthropic.com](mailto:colah@anthropic.com); [author contributions statement below](https://transformer-circuits.pub/2021/framework/index.html#author-contributions).
Transformer [1] language models are an emerging technology that is gaining increasingly broad real-world use, for example in systems like GPT-3 [2], LaMDA [3], Codex [4], Meena [5], and Gopher [6].

### References

1. **Attention is all you need** [\[PDF\]](https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf). A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, I. Polosukhin. Advances in Neural Information Processing Systems, pp. 5998–6008, 2017.
2. **Language models are few-shot learners** [\[PDF\]](https://arxiv.org/pdf/2005.14165.pdf). T.B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. arXiv preprint arXiv:2005.14165, 2020.
3. **LaMDA: our breakthrough conversation technology** [\[link\]](https://blog.google/technology/ai/lamda/). E. Collins, Z. Ghahramani. 2021.
4. **Evaluating large language models trained on code.** M. Chen, J. Tworek, H. Jun, Q. Yuan, H.P.d.O. Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, et al. arXiv preprint arXiv:2107.03374, 2021.
5. **Towards a human-like open-domain chatbot.** D. Adiwardana, M. Luong, D.R. So, J. Hall, N. Fiedel, R. Thoppilan, Z. Yang, A. Kulshreshtha, G. Nemade, Y. Lu, et al. arXiv preprint arXiv:2001.09977, 2020.
6. **Scaling Language Models: Methods, Analysis & Insights from Training Gopher** [\[PDF\]](https://storage.googleapis.com/deepmind-media/research/language-research/Training%20Gopher.pdf). J.W. Rae, S. Borgeaud, T. Cai, K. Millican, J. Hoffmann, F. Song, J. Aslanides, S. Henderson, R. Ring, S. Young, E. Rutherford, T. Hennigan, J. Menick, A. Cassirer, R. Powell, G.v.d. Driessche, L.A. Hendricks, M. Rauh, P. Huang, A. Glaese, J. Welbl, S. Dathathri, S. Huang, J. Uesato, J. Mellor, I. Higgins, A. Creswell, N. McAleese, A. Wu, E. Elsen,
... (truncated, 98 KB total)

Resource ID: b948d6282416b586 | Stable ID: ZDRhYWNiMT