WinoGrande

Reasoning

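A large-scale commonsense reasoning benchmark of roughly 44,000 Winograd-schema-style problems. Each problem is a sentence with a blank and two candidate fillers, and the dataset was adversarially filtered to reduce annotation artifacts that models could exploit.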

Models Tested

Scoring: accuracy

Introduced: 2019-07

Maintainer: AI2

No model scores recorded for this benchmark yet.
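
Since scoring is plain accuracy over two-way choices, a minimal sketch of the metric might look like the following. The `Item` class, the `accuracy` helper, and the field names (`sentence`, `option1`, `option2`, `answer`) are assumptions made for illustration, and the example item is a classic Winograd-style sentence rather than a record taken from the dataset.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Item:
    """One fill-in-the-blank problem (hypothetical layout, assumed for this sketch)."""
    sentence: str  # contains a single "_" marking the blank
    option1: str
    option2: str
    answer: str    # "1" or "2", the index of the correct option


def accuracy(items: Sequence[Item], predictions: Sequence[str]) -> float:
    """Fraction of items whose predicted option index matches the gold answer."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(1 for item, pred in zip(items, predictions) if pred == item.answer)
    return correct / len(items)


# Illustrative item, not drawn from the benchmark itself.
example = Item(
    sentence="The trophy doesn't fit into the suitcase because the _ is too large.",
    option1="trophy",
    option2="suitcase",
    answer="1",
)
```

Because every item has exactly two options, a model that guesses uniformly at random is expected to score about 0.5, which is a useful baseline to keep in mind when reading reported accuracies.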