benchmark
Humanity's Last Exam
Metadata
| Source Table | benchmarks |
| Source ID | Xt4Dv7KAey |
| Description | A benchmark of 2,500+ expert-level questions across dozens of academic disciplines, designed to be the hardest public AI evaluation. Questions contributed by domain experts worldwide. |
| Source URL | lastexam.ai/ |
| Wiki ID | humanitys-last-exam |
| Children | — |
| Created | Mar 14, 2026, 12:43 AM |
| Updated | Mar 24, 2026, 11:24 PM |
| Synced | Mar 24, 2026, 11:24 PM |
Record Data
| id | Xt4Dv7KAey |
| slug | humanitys-last-exam |
| name | Humanity's Last Exam |
| category | reasoning |
| description | A benchmark of 2,500+ expert-level questions across dozens of academic disciplines, designed to be the hardest public AI evaluation. Questions contributed by domain experts worldwide. |
| website | lastexam.ai/ |
| scoringMethod | accuracy |
| higherIsBetter | Yes |
| introducedDate | 2025-01 |
| maintainer | Scale AI / Center for AI Safety |
| source | arxiv.org/abs/2501.14249 |
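The `scoringMethod` and `higherIsBetter` fields above can be read as follows: a score is the fraction of questions answered correctly, and a larger score ranks higher. The sketch below is only an illustration of that reading. The `BenchmarkRecord` class, the `accuracy` and `is_better` helpers, and the case-insensitive exact-match comparison are all assumptions made for this example; they are not taken from the benchmark's official grading procedure, which may judge answers differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRecord:
    """Hypothetical container mirroring a few fields of the record above."""
    id: str
    slug: str
    name: str
    scoring_method: str
    higher_is_better: bool


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match their reference answer.

    Matching here is case-insensitive exact match after trimming
    whitespace, which is an assumption for illustration only.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


def is_better(record: BenchmarkRecord, score_a: float, score_b: float) -> bool:
    """True if score_a ranks above score_b under the record's direction flag."""
    return score_a > score_b if record.higher_is_better else score_a < score_b


# Field values copied from the record; the class itself is hypothetical.
HLE = BenchmarkRecord(
    id="Xt4Dv7KAey",
    slug="humanitys-last-exam",
    name="Humanity's Last Exam",
    scoring_method="accuracy",
    higher_is_better=True,
)
```

With `higherIsBetter` set to Yes, `is_better(HLE, 0.30, 0.20)` evaluates to true; a record with the flag set to No would reverse the comparison.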