Metadata
| Source Table | benchmark_results |
| Source ID | i5f76T7Z3K |
| Parent | GSM8K |
| Children | — |
| Created | Apr 24, 2026, 7:31 PM |
| Updated | Apr 24, 2026, 7:31 PM |
| Synced | Apr 24, 2026, 7:31 PM |
Record Data
| id | i5f76T7Z3K |
| benchmarkId | fjjBrOI3p2 |
| modelId | Claude 3.5 Sonnet (ai-model) |
| score | 96.4 |
| unit | percent |
| date | 2024-06-21 |
| sourceUrl | — |
| notes | Grade school math word problems, multi-step reasoning |
Source Check Verdicts
confirmed (95% confidence)
Last checked: 4/24/2026
Inline sourcing: confirmed