Longterm Wiki

BrowseComp

Agentic

A benchmark that evaluates how well AI systems can find hard-to-locate information on the web. It tests browsing, search, and information synthesis on difficult queries.

Models Tested: 0
Scoring: accuracy
Introduced: 2025-04
Maintainer: OpenAI
No model scores recorded for this benchmark yet.