MATH
Math
A dataset of 12,500 competition mathematics problems that tests mathematical reasoning across difficulty levels 1 to 5.
Models Tested: 0
Scoring: accuracy
Introduced: 2021-03
Maintainer: Dan Hendrycks et al.
No model scores recorded for this benchmark yet.
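The card lists accuracy as the scoring metric but does not say how answers are compared. The sketch below is one possible reading, assuming exact string match on the final answer after trimming whitespace; the function name `accuracy` and the sample answers are hypothetical and do not come from the benchmark's own tooling.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions equal to their reference answer.

    Assumption: answers are compared as strings after stripping surrounding
    whitespace. A real harness may normalize answers more aggressively.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical final answers, not real model output.
preds = ["42", " 3/4", "x+1", "7"]
refs = ["42", "3/4", "x + 1", "7"]
score = accuracy(preds, refs)
```

Under this strict comparison, `"x+1"` and `"x + 1"` count as different answers, which is why evaluation harnesses for math problems usually normalize expressions before comparing them.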