Longterm Wiki

Publication: Measuring Massive Multitask Language Understanding (MMLU) by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt (2020-09)

Verdict: confirmed (95%)
1 check · 4/3/2026

All key fields in the record are confirmed by the source text. The title, all seven authors in the correct order, and the publication date (2020-09, matching the arXiv submission in September 2020) are explicitly stated. The arXiv URL follows the standard pattern for arXiv papers. The publication type 'paper' is appropriate for an arXiv preprint. No contradictions or discrepancies were found.

Our claim

entire record
Title
Measuring Massive Multitask Language Understanding (MMLU)
Authors
Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
Published Date
September 2020
Publication Type
paper
Is Flagship
Yes
Notes
One of the most widely used AI capability benchmarks. Published at ICLR 2021.

Source evidence

1 src · 1 check
confirmed (95%) · Haiku 4.5 · 4/3/2026


Case № QNbZISNZtJ · Filed 4/3/2026 · Confidence 95%