Longterm Wiki
benchmark

MMLU

Metadata

Source Table: benchmarks
Source ID: izV3Xk98se
Description: Massive Multitask Language Understanding, a multiple-choice benchmark covering 57 academic subjects from STEM to humanities.
Source URL: github.com/hendrycks/test
Wiki ID: mmlu
Children
Created: Mar 14, 2026, 12:43 AM
Updated: Mar 24, 2026, 11:24 PM
Synced: Mar 24, 2026, 11:24 PM

Record Data

id: izV3Xk98se
slug: mmlu
name: MMLU
category: knowledge
description: Massive Multitask Language Understanding, a multiple-choice benchmark covering 57 academic subjects from STEM to humanities.
website: github.com/hendrycks/test
scoringMethod: accuracy
higherIsBetter: Yes
introducedDate: 2021-01
maintainer: Dan Hendrycks et al.
source: arxiv.org/abs/2009.03300
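
The record lists accuracy as the scoring method, with higher values being better. As a minimal sketch of what that means for a multiple-choice benchmark, the snippet below computes the fraction of questions where the predicted choice matches the answer key. The function name `accuracy` and the toy letter choices are hypothetical illustrations, not part of the MMLU repository or its evaluation scripts.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the fraction of questions whose predicted choice matches the key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot score an empty question set")
    # Count exact matches between each predicted choice and the correct choice.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Hypothetical toy data: letter choices for four questions (not real MMLU items).
predicted = ["A", "C", "B", "D"]
answer_key = ["A", "C", "D", "D"]
score = accuracy(predicted, answer_key)
print(f"accuracy = {score:.2f}")
```

Because each question has exactly one correct choice, a higher score always means more questions answered correctly, which is consistent with the higherIsBetter field.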