Fact · f_tHAA1W30dw

xAI — Benchmark Score: 93.3

Verdict: confirmed · 95%

1 check · 7/20/2026

1 → confirmed

Our claim

entire record

Subject: xAI
Property: Benchmark Score
Value: 93.3
As Of: 2025
Source: https://x.ai/news/grok-3
Notes: Grok 3 AIME 2025 benchmark: 93.3% success rate

Source evidence

1 src · 1 check

x.ai/news/grok-3 resource

confirmed · 95% · primary · Haiku 4.5 · 7/20/2026

Note: The source directly confirms the claim. The xAI announcement states: 'We tested these models on the 2025 American Invitational Mathematics Examination (AIME), which was released just 7 days ago on Feb 12th. With our highest level of test-time compute (cons@64), Grok 3 (Think) achieved 93.3% on this competition.' The benchmark score of 93.3% for Grok 3 on AIME 2025 is explicitly stated in the source text, matching the claim exactly. The date context (February 19, 2025) aligns with the 'as of 2025' qualifier in the claim.

Case № f_tHAA1W30dw · Filed 7/20/2026 · Confidence 95%