MMLU

benchmark

Metadata

Source Table	`benchmarks`
Source ID	`izV3Xk98se`
Description	Massive Multitask Language Understanding — a multiple-choice benchmark covering 57 academic subjects from STEM to humanities.
Source URL	github.com/hendrycks/test
Wiki ID	mmlu
Children	—
Created	Mar 14, 2026, 12:43 AM
Updated	Mar 24, 2026, 11:24 PM
Synced	Mar 24, 2026, 11:24 PM

Record Data

`id`	izV3Xk98se
`slug`	mmlu
`name`	MMLU
`category`	knowledge
`subCategory`	—
`description`	Massive Multitask Language Understanding — a multiple-choice benchmark covering 57 academic subjects from STEM to humanities.
`website`	github.com/hendrycks/test
`scoringMethod`	accuracy
`higherIsBetter`	Yes
`introducedDate`	2021-01
`maintainer`	Dan Hendrycks et al.
`source`	arxiv.org/abs/2009.03300
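
Scoring sketch

The `scoringMethod` field above is `accuracy`, which for a multiple-choice benchmark is the fraction of questions answered correctly. The following is a minimal sketch of that calculation, not the benchmark's official evaluation code. The function name `accuracy` and the toy answer lists are hypothetical and chosen only for illustration; the letters A to D stand for the four answer choices of each question.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the fraction of questions whose predicted choice matches the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot score an empty set of questions")
    # Count exact matches between predicted and reference choice letters.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Hypothetical toy data, not real benchmark questions or model outputs.
predicted = ["A", "C", "B", "D"]
answer_key = ["A", "C", "D", "D"]
score = accuracy(predicted, answer_key)
print(f"accuracy = {score:.2f}")
```

Because `higherIsBetter` is `Yes`, a larger value from a function like this indicates a better result.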
