Llama 4 Scout was released on April 5, 2025, as a mixture-of-experts model with 17B active parameters (109B total, 16 experts). It features a 10M-token context window, the longest of any production model at launch, and is natively multimodal (text and image input). It scored 89.3% on MMLU and beat Gemini 2.0 Flash and GPT-4o on multiple benchmarks while being deployable on a single H100 GPU.
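The single-GPU claim can be sanity-checked with rough arithmetic. The sketch below estimates weights-only memory from the parameter counts given above. The 80 GB H100 capacity and the bytes-per-parameter table are assumptions not stated on this card, and the estimate ignores the KV cache, activations, and runtime overhead. Although only 17B parameters are active per token, all 109B must be resident in memory, so the total count drives the footprint.

```python
# Weights-only memory estimate for a mixture-of-experts model.
# Ignores KV cache, activations, and framework overhead.

TOTAL_PARAMS = 109e9    # total parameters (from the card)
ACTIVE_PARAMS = 17e9    # parameters active per token (from the card)
H100_MEMORY_GB = 80.0   # assumption: 80 GB H100 variant

# Assumption: storage cost per parameter for common weight formats.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_on_one_gpu(precision: str, gpu_gb: float = H100_MEMORY_GB) -> bool:
    """Check whether all weights fit on one GPU at the given precision."""
    needed = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM[precision])
    return needed <= gpu_gb


# Fraction of the model that does work on any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

if __name__ == "__main__":
    print(f"active fraction per token: {active_fraction:.1%}")
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, nbytes)
        print(f"{precision}: {gb:.1f} GB, fits on one H100: {fits_on_one_gpu(precision)}")
```

Under these assumptions, the weights fit on one 80 GB card only at 4-bit precision, which suggests the single-H100 figure depends on quantization.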
Modality
text, image
Capabilities
tool-use, vision, long-context
Details
Model Family: Llama
Generation: 4
Release Date: 2025-04-05
Parameters: 109B
Context Window: 10M tokens
Open Weight: Yes
Tags
llama, meta, open-weight, mixture-of-experts, long-context