benchmark
MATH
Metadata
| Source Table | benchmarks |
| Source ID | q6rR1sbyZG |
| Description | A dataset of 12,500 competition mathematics problems testing mathematical reasoning across difficulty levels 1-5. |
| Wiki ID | math-benchmark |
| Children | — |
| Created | Mar 14, 2026, 12:43 AM |
| Updated | Mar 24, 2026, 11:24 PM |
| Synced | Mar 24, 2026, 11:24 PM |
Record Data
| id | q6rR1sbyZG |
| slug | math-benchmark |
| name | MATH |
| category | math |
| description | A dataset of 12,500 competition mathematics problems testing mathematical reasoning across difficulty levels 1-5. |
| website | — |
| scoringMethod | accuracy |
| higherIsBetter | Yes |
| introducedDate | 2021-03 |
| maintainer | Dan Hendrycks et al. |
| source | arxiv.org/abs/2103.03874 |
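Scoring sketch

The record lists `accuracy` as the scoring method, with higher values being better. A minimal sketch of that computation follows. It assumes each problem has a single final-answer string and that answers are compared by exact match after stripping surrounding whitespace. The function name `accuracy` and that comparison rule are illustrative assumptions, not the benchmark's official grader, which may normalize answers differently.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of predictions that match their reference answer.

    Assumption: exact string match after stripping surrounding whitespace.
    A real grader may apply further normalization to mathematical answers.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty set of problems")

    # Count problems where the stripped prediction equals the stripped reference.
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Hypothetical answers, for illustration only.
    preds = ["1/2", " 3 ", "x"]
    refs = ["1/2", "3", "y"]
    print(f"accuracy = {accuracy(preds, refs):.3f}")
```

Because the score is a fraction of problems answered correctly, it falls between 0 and 1, consistent with `higherIsBetter | Yes`.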