sid_dHgSM46fMw / GPQA Diamond: 80.9
Verdict: confirmed (95%) · 4/24/2026
Inline sourcing: confirmed
Our claim
- Benchmark: bdDmOTMoX8
- Model: Claude Opus 4.1
- Score: 80.9
- Unit: percent
- Date: August 5, 2025
- Notes: Graduate-level reasoning benchmark. Reported with extended thinking mode (up to 64K tokens).
Source evidence
0 src · 0 checks
No evidence on file.
Case № Rsc7oluuId · Filed 4/24/2026 · Confidence 95%