Publication: Measuring Massive Multitask Language Understanding (MMLU) by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt (2020-09)
All key fields in the record are confirmed by the source text. The title, all seven authors in the correct order, and the publication date (2020-09, corresponding to the arXiv submission in September 2020) are explicitly stated. The arXiv URL format matches the standard pattern for arXiv papers. The publication type 'paper' is appropriate for an arXiv preprint. No contradictions or discrepancies were found.
Our claim
entire record
- Title: Measuring Massive Multitask Language Understanding (MMLU)
- Authors: Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
- Published Date: September 2020
- Publication Type: paper
- Is Flagship: Yes
- Notes: Most widely used AI capability benchmark. ICLR 2021.
Source evidence
1 src · 1 check