Longterm Wiki
Citation·page:anthropic:fn17

Anthropic - Footnote 17

Verdict: partial · 85%
1 check · 4/3/2026

The source does not mention the specific percentage achieved on SWE-bench Verified (80.9%) or OSWorld (61.4%). The source does not state that Claude Opus 4.5 is the first AI model to exceed 80% on SWE-bench Verified or 60% on Terminal-Bench 2.0. The source does not provide the next-best model's score on OSWorld (7.8%). The source only mentions Terminal Bench, not Terminal-Bench 2.0.

Our claim

entire record

No record data available.

Source evidence

1 src · 1 check
partial · 85% · Haiku 4.5 · 4/3/2026

Note: The source does not mention the specific percentage achieved on SWE-bench Verified (80.9%) or OSWorld (61.4%). The source does not state that Claude Opus 4.5 is the first AI model to exceed 80% on SWE-bench Verified or 60% on Terminal-Bench 2.0. The source does not provide the next-best model's score on OSWorld (7.8%). The source only mentions Terminal Bench, not Terminal-Bench 2.0.

Case № page:anthropic:fn17 · Filed 4/3/2026 · Confidence 85%