sid_tppPAkJqjQ / HumanEval: 92
Verdict: confirmed (99%)
1 check · 4/24/2026 · Inline sourcing: confirmed
Our claim
entire record
- Benchmark: vxX2rorgxU
- Model: Claude Opus 4.5
- Score: 92
- Unit: percent
- Date: November 24, 2025
- Notes: HumanEval - Python function implementation benchmark
Source evidence
1 src · 1 check · confirmed (99%) · inline-submission · 4/24/2026
Case № I4H3zQ7VJK · Filed 4/24/2026 · Confidence 99%