Longterm Wiki
publication

Measuring Massive Multitask Language Understanding (MMLU)

Metadata

Source Table: publications
Source ID: QNbZISNZtJ
Description: Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt, 2020-09
Source URL: arxiv.org/abs/2009.03300
Parent: Center for AI Safety (CAIS)
Children:
Created: Mar 23, 2026, 2:46 PM
Updated: Mar 23, 2026, 2:46 PM
Synced: Mar 23, 2026, 2:46 PM

Record Data

id: QNbZISNZtJ
entityId: Center for AI Safety (CAIS) (organization)
entityDisplayName:
resourceId:
title: Measuring Massive Multitask Language Understanding (MMLU)
authors: Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
url: arxiv.org/abs/2009.03300
venue:
publishedDate: 2020-09
publicationType: paper
citationCount:
isFlagship: Yes
abstract:
source: arxiv.org/abs/2009.03300
notes: Most widely used AI capability benchmark. ICLR 2021.

Source Check Verdicts

confirmed: 95% confidence

Last checked: 4/3/2026

All key fields in the record are confirmed by the source text. The title, all seven authors in the correct order, and the publication date (2020-09, matching the arXiv submission in September 2020) are explicitly stated. The arXiv URL follows the standard pattern for arXiv papers. The publication type 'paper' is appropriate for an arXiv preprint. No contradictions or discrepancies were found.
