Metadata
| Source Table | benchmark_results |
| Source ID | BDgqDpG3vh |
| Parent | HumanEval |
| Children | — |
| Created | Apr 24, 2026, 7:13 PM |
| Updated | Apr 24, 2026, 7:13 PM |
| Synced | Apr 24, 2026, 7:13 PM |
Record Data
| id | BDgqDpG3vh |
| benchmarkId | vxX2rorgxU |
| modelId | Grok(ai-model) |
| score | 86.5 |
| unit | percent |
| date | 2025-02-19 |
| sourceUrl | — |
| notes | Grok 3 - Code generation from Python function docstrings with unit tests |
Source Check Verdicts
confirmed (95% confidence)
Last checked: 4/24/2026
Inline sourcing: confirmed