WinoGrande
Reasoning
A large-scale commonsense reasoning benchmark of about 44,000 Winograd-schema-style problems, built with adversarial filtering to reduce annotation artifacts.
Models Tested: 0
Scoring: accuracy
Introduced: 2019-07
Maintainer: AI2
No model scores recorded for this benchmark yet.
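
Because the benchmark is scored by accuracy, a score is simply the fraction of problems for which a model picks the correct option. The sketch below illustrates that calculation under some assumptions: each problem is a dict with `sentence` (containing a `_` blank), `option1`, `option2`, and `answer` (`"1"` or `"2"`); the two sample items are made up for illustration and are not taken from the dataset; and `always_first` is a hypothetical stand-in for a real model.

```python
from typing import Callable, Dict, List

# Assumed item layout: a sentence with a "_" blank, two candidate fillers,
# and the label of the correct one ("1" or "2").
Item = Dict[str, str]


def fill_blank(sentence: str, option: str) -> str:
    """Substitute one candidate into the first blank of the sentence."""
    return sentence.replace("_", option, 1)


def accuracy(items: List[Item], predict: Callable[[Item], str]) -> float:
    """Fraction of items where the predicted label equals the gold label."""
    if not items:
        raise ValueError("no items to score")
    correct = sum(1 for item in items if predict(item) == item["answer"])
    return correct / len(items)


# Illustrative items only (not drawn from the benchmark).
EXAMPLES: List[Item] = [
    {
        "sentence": "The trophy doesn't fit into the suitcase because the _ is too large.",
        "option1": "trophy",
        "option2": "suitcase",
        "answer": "1",
    },
    {
        "sentence": "Sarah lent Emma her umbrella because _ had forgotten to bring one.",
        "option1": "Sarah",
        "option2": "Emma",
        "answer": "2",
    },
]


def always_first(item: Item) -> str:
    """Hypothetical baseline that always chooses option 1."""
    return "1"


if __name__ == "__main__":
    print(fill_blank(EXAMPLES[0]["sentence"], EXAMPLES[0]["option1"]))
    print(f"accuracy = {accuracy(EXAMPLES, always_first):.2f}")
```

With two options per problem, a constant or random guesser is expected to land near 50% accuracy, which is the baseline any recorded score should be read against.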