Longterm Wiki
benchmark

MATH

Metadata

Source Table: benchmarks
Source ID: q6rR1sbyZG
Description: A dataset of 12,500 competition mathematics problems testing mathematical reasoning across difficulty levels 1-5.
Wiki ID: math-benchmark
Children
Created: Mar 14, 2026, 12:43 AM
Updated: Mar 24, 2026, 11:24 PM
Synced: Mar 24, 2026, 11:24 PM

Record Data

id: q6rR1sbyZG
slug: math-benchmark
name: MATH
category: math
description: A dataset of 12,500 competition mathematics problems testing mathematical reasoning across difficulty levels 1-5.
website
scoringMethod: accuracy
higherIsBetter: Yes
introducedDate: 2021-03
maintainer: Dan Hendrycks et al.
source: arxiv.org/abs/2103.03874