Claude Haiku 4.5 on OSWorld: 50.7

benchmark-result

Metadata

`id`	DrtKMMtM7g
`benchmarkId`	Hpb8OjdhT9
`modelId`	Claude Haiku 4.5 (ai-model)
`score`	50.7
`unit`	percent
`date`	2025-10-15
`sourceUrl`	—
`notes`	Score on the OSWorld-Verified computer-use benchmark, as reported by Anthropic at release. Outperforms Sonnet 4 (42.2%) and far exceeds Sonnet 3.5 (14%).
`testedBy`	unknown
`testedByOrgId`	—
`evaluationDate`	—
`methodologyNotes`	—

unverifiable, 95% confidence

Last checked: 4/24/2026

Inline sourcing: unverifiable
