Longterm Wiki
benchmark-result

GPT-4o on HumanEval: 90.2%

Child of HumanEval

Metadata

Source Table: benchmark_results
Source ID: G2od24L4z9
Parent: HumanEval
Children:
Created: Apr 24, 2026, 7:04 PM
Updated: Apr 24, 2026, 7:04 PM
Synced: Apr 24, 2026, 7:04 PM

Record Data

id: G2od24L4z9
benchmarkId: vxX2rorgxU
modelId: GPT (ai-model)
score: 90.2
unit: percent
date: 2024-05-19
sourceUrl:
notes: GPT-4o's score on the HumanEval code generation benchmark
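The fields above can be sketched as a simple typed record. This is an illustrative representation only, assuming the field names in the Record Data listing; the `BenchmarkResult` class name and dataclass layout are assumptions, not the wiki's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class BenchmarkResult:
    # Field names mirror the Record Data listing above;
    # the class itself is a hypothetical sketch.
    id: str
    benchmarkId: str
    modelId: str
    score: float
    unit: str
    date: str
    sourceUrl: Optional[str] = None
    notes: str = ""

record = BenchmarkResult(
    id="G2od24L4z9",
    benchmarkId="vxX2rorgxU",
    modelId="GPT",
    score=90.2,
    unit="percent",
    date="2024-05-19",
    notes="GPT-4o's score on the HumanEval code generation benchmark",
)

# Serialize back to a plain dict, as a sync pipeline might store it.
print(asdict(record)["score"])  # 90.2
```

Representing the record this way makes the unit explicit (`percent`), so a consumer can distinguish a 90.2% pass rate from a raw score of 90.2.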

Source Check Verdicts

confirmed (99% confidence)

Last checked: 4/24/2026

Inline sourcing: confirmed
