Longterm Wiki

RE-Bench: Evaluating frontier AI R&D capabilities

web

Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: METR

Data Status

Not fetched

Cited by 3 pages

Resource ID: 056e0ff33675b825 | Stable ID: OGExZTMwN2