Longterm Wiki
benchmark

IFEval

Metadata

Source Table: benchmarks
Source ID: z0wtR8wrVv
Description: Instruction-Following Evaluation benchmark testing whether LLMs can follow explicit formatting constraints (e.g., 'write exactly 3 paragraphs', 'include these keywords').
Wiki ID: ifeval
Children
Created: Mar 14, 2026, 12:43 AM
Updated: Mar 24, 2026, 11:24 PM
Synced: Mar 24, 2026, 11:24 PM

Record Data

id: z0wtR8wrVv
slug: ifeval
name: IFEval
category: general
description: Instruction-Following Evaluation benchmark testing whether LLMs can follow explicit formatting constraints (e.g., 'write exactly 3 paragraphs', 'include these keywords').
website
scoringMethod: accuracy
higherIsBetter: Yes
introducedDate: 2023-11
maintainer: Google Research
source: arxiv.org/abs/2311.07911
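
The description and scoringMethod fields above suggest how such a benchmark can be scored: each constraint is checked programmatically, and the score is the fraction of checks that pass. The following is a minimal sketch of that idea, not the benchmark's official implementation. The function names (`has_exact_paragraphs`, `includes_keywords`, `accuracy`) are hypothetical, and treating blank lines as paragraph separators is an assumption made for illustration.

```python
import re


def has_exact_paragraphs(response: str, n: int) -> bool:
    """Return True if the response has exactly n paragraphs.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    parts = re.split(r"\n\s*\n", response.strip())
    paragraphs = [p for p in parts if p.strip()]
    return len(paragraphs) == n


def includes_keywords(response: str, keywords: list[str]) -> bool:
    """Return True if every keyword appears in the response (case-insensitive)."""
    lowered = response.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def accuracy(results: list[bool]) -> float:
    """Fraction of checks that passed; higher is better."""
    return sum(results) / len(results) if results else 0.0


# Hypothetical model response, checked against two constraints.
response = (
    "First paragraph about cats.\n\n"
    "Second paragraph about dogs.\n\n"
    "Third paragraph about birds."
)
checks = [
    has_exact_paragraphs(response, 3),              # 'write exactly 3 paragraphs'
    includes_keywords(response, ["cats", "fish"]),  # 'include these keywords'
]
score = accuracy(checks)
```

In this example the paragraph check passes and the keyword check fails because "fish" is missing, so `score` is 0.5. Because every check is deterministic, no human or model-based judge is needed to grade a response.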