Llama 4
Leading Intelligence. Unrivaled speed and efficiency. The most accessible and scalable generation of Llama is here.
Download models
LATEST
Llama 4 models
Our models are optimized for easy deployment, cost efficiency, and performance that scales to billions of users.

Llama 4 Scout
Class-leading natively multimodal model that offers superior text and visual intelligence, single-H100 GPU efficiency, and a 10M-token context window for seamless long-document analysis.
Download
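For a rough sense of what a 10M-token context window holds, here is a minimal sketch of a document fit check using the common ~4 characters-per-token rule of thumb. The heuristic is an assumption for illustration, not the model's actual tokenizer; real token counts require tokenizing with the model's own vocabulary.

```python
def fits_in_context(text: str, context_tokens: int = 10_000_000,
                    chars_per_token: float = 4.0) -> bool:
    """Rough fit check: estimate token count from character count.

    The 4 chars/token figure is a common English-text rule of thumb;
    actual counts depend on the model's tokenizer.
    """
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= context_tokens

# A ~2 MB plain-text document (~500k estimated tokens) fits comfortably.
doc = "x" * 2_000_000
print(fits_in_context(doc))  # True
```

By this estimate, a 10M-token window corresponds to roughly 40 MB of English text, which is why whole-book and large-codebase analysis becomes feasible without chunking.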
Llama 4 Maverick
Industry-leading natively multimodal model for image and text understanding with groundbreaking intelligence and fast responses at a low cost.
Download
Llama 4 capabilities
Start building with Llama 4
SAFETY
Protections in the era of generative AI.
Llama 4 benchmark
| Task | Metric | Llama 4 Maverick | Llama 4 Scout |
| --- | --- | --- | --- |
| Image Reasoning | MMMU | 73.4 | 69.4 |
| Image Reasoning | MathVista | 73.7 | 70.7 |
| Image Understanding | ChartQA | 90.0 | 88.8 |
| Image Understanding | DocVQA (test) | 94.4 | 94.4 |
| Coding | LiveCodeBench (10.01.2024 - 02.01.2025) | 43.4 | 32.8 |
| Reasoning & Knowledge | MMLU Pro | 80.5 | 74.3 |
| Reasoning & Knowledge | GPQA Diamond | 69.8 | 57.2 |
| Multilingual | Multilingual MMLU | 84.6 | - |
| Long Context | MTOB (half book), eng->kgv / kgv->eng | 54.0 / 46.4 | 42.2 / 36.6 |
| Long Context | MTOB (full book), eng->kgv / kgv->eng | 50.8 / 46.7 | 39.7 / 36.3 |
Methodology & Notes

1. For Llama model results, we report 0-shot evaluation with temperature = 0 and no majority voting or parallel test-time compute. For high-variance benchmarks (GPQA Diamond, LiveCodeBench), we average over multiple generations to reduce uncertainty.
2. Specialized long-context evals are not traditionally reported for generalist models, so we share internal runs to showcase Llama's frontier performance.
3. $0.19/Mtok (3:1 blended) is our cost estimate for Llama 4 Maverick assuming distributed inference. On a single host, we project the model can be served at $0.30 - $0.49/Mtok (3:1 blended).
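As a reading aid for the pricing note above: a "3:1 blended" rate weights the input and output per-token prices by an assumed 3:1 input-to-output token ratio. A minimal sketch of that arithmetic follows; the per-direction prices in the example are hypothetical placeholders chosen to reproduce the $0.19 blended figure, not published Meta pricing.

```python
def blended_rate(input_price_per_mtok: float, output_price_per_mtok: float,
                 input_ratio: int = 3, output_ratio: int = 1) -> float:
    """Blend per-direction $/Mtok prices by a token-volume ratio (e.g. 3:1)."""
    total_parts = input_ratio + output_ratio
    return (input_ratio * input_price_per_mtok
            + output_ratio * output_price_per_mtok) / total_parts

# Hypothetical example: $0.12/Mtok input and $0.40/Mtok output at a 3:1 mix.
print(round(blended_rate(0.12, 0.40), 3))  # 0.19
```

The blended number exists so that a single price can summarize workloads where prompts are much longer than completions, as is typical for chat and document-analysis traffic.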