Longterm Wiki
benchmark-result

Grok on GSM8K: 89.3%

Child of GSM8K

Metadata

Source Table: benchmark_results
Source ID: fSReF5SE3c
Parent: GSM8K
Children:
Created: Apr 24, 2026, 7:13 PM
Updated: Apr 24, 2026, 7:13 PM
Synced: Apr 24, 2026, 7:13 PM

Record Data

id: fSReF5SE3c
benchmarkId: fjjBrOI3p2
modelId: Grok (ai-model)
score: 89.3
unit: percent
date: 2025-02-19
sourceUrl:
notes: Grok 3 - Grade School Math 8K word problems with multi-step reasoning

Source Check Verdicts

confirmed, 95% confidence

Last checked: 4/24/2026

Inline sourcing: confirmed

Debug info

Thing ID: fSReF5SE3c

Source Table: benchmark_results

Source ID: fSReF5SE3c

Parent Thing ID: fjjBrOI3p2