Llama 4

Leading Intelligence. Unrivaled speed and efficiency.
The most accessible and scalable generation of Llama is here.

Latest models

Our models are optimized for easy deployment, cost efficiency, and performance that scales to billions of users. What will you build?
Llama 4 Scout
Class-leading natively multimodal model that offers superior text and visual intelligence, single H100 GPU efficiency, and a 10M context window for seamless long document analysis.
Llama 4 Maverick
Industry-leading natively multimodal model for image and text understanding with groundbreaking intelligence and fast responses at a low cost.
Llama 4 Behemoth Preview
An early preview (it’s still training!) of the Llama 4 teacher model used to distill Llama 4 Scout and Llama 4 Maverick.

Capabilities

Llama 4 Behemoth, Llama 4 Scout, and Llama 4 Maverick offer class-leading capabilities.
Natively Multimodal
All Llama 4 models are designed with native multimodality, leveraging early fusion that allows us to pre-train the model with large amounts of unlabeled text and vision tokens - a step change in intelligence from separate, frozen multimodal weights.
Unparalleled Long Context
Llama 4 Scout supports up to 10M tokens of context - the longest context length available in the industry - unlocking new use cases around memory, personalization, and multimodal applications.
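For a rough sense of scale: using the common rules of thumb of ~0.75 English words per token and ~300 words per printed page (both approximations, not official figures), a 10M-token window covers on the order of tens of thousands of pages:

```python
# Back-of-envelope scale of a 10M-token context window.
# Assumptions (rules of thumb, not official figures):
#   ~0.75 English words per token, ~300 words per printed page.
CONTEXT_TOKENS = 10_000_000
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 300

words = CONTEXT_TOKENS * WORDS_PER_TOKEN  # 7,500,000 words
pages = words / WORDS_PER_PAGE            # 25,000 pages

print(f"~{words:,.0f} words, ~{pages:,.0f} pages")
```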
Expert Image Grounding
Llama 4 is also best-in-class on image grounding, able to align user prompts with relevant visual concepts and anchor model responses to regions in the image.

Start building with Llama 4

The tools and resources you need to build with Llama.
GitHub

Benchmarks

We evaluated model performance on a suite of common benchmarks across a wide range of languages, testing for coding, reasoning, knowledge, vision understanding, multilinguality, and long context.
| Category | Benchmark | Llama 4 Maverick | Gemini 2.0 Flash | DeepSeek v3.1 | GPT-4o |
|---|---|---|---|---|---|
| Inference Cost | Cost per 1M input & output tokens (3:1 blended) | $0.19-$0.49⁵ | $0.17 | $0.48 | $4.38 |
| Image Reasoning | MMMU | 73.4 | 71.7 | No multimodal support | 69.1 |
| | MathVista | 73.7 | 73.1 | No multimodal support | 63.8 |
| Image Understanding | ChartQA | 90.0 | 88.3 | No multimodal support | 85.7 |
| | DocVQA (test) | 94.4 | - | No multimodal support | 92.8 |
| Coding | LiveCodeBench (10/01/2024-02/01/2025) | 43.4 | 34.5 | 45.8/49.2³ | 32.3³ |
| Reasoning & Knowledge | MMLU Pro | 80.5 | 77.6 | 81.2 | - |
| | GPQA Diamond | 69.8 | 60.1 | 68.4 | 53.6 |
| Multilingual | Multilingual MMLU | 84.6 | - | - | 81.5 |
| Long context | MTOB (half book) eng->kgv/kgv->eng | 54.0/46.4 | 48.4/39.8⁴ | Context window is 128K | Context window is 128K |
| | MTOB (full book) eng->kgv/kgv->eng | 50.8/46.7 | 45.5/39.6⁴ | Context window is 128K | Context window is 128K |

  1. For Llama model results, we report 0-shot evaluation with temperature = 0 and no majority voting or parallel test-time compute. For high-variance benchmarks (GPQA Diamond, LiveCodeBench), we average over multiple generations to reduce uncertainty.

  2. For non-Llama models, we source the highest available self-reported eval results, unless otherwise specified. We only include evals from models with reproducible evals (via API or open weights), and we only include non-thinking models. Cost estimates for non-Llama models are sourced from Artificial Analysis.

  3. The date range for DeepSeek v3.1's self-reported result (49.2) is unknown, so we also provide our internal result (45.8) on the defined date range. Results for GPT-4o are sourced from the LCB leaderboard.

  4. Specialized long-context evals are not traditionally reported for generalist models, so we share internal runs to showcase Llama's frontier performance.

  5. $0.19/Mtok (3:1 blended) is our cost estimate for Llama 4 Maverick assuming distributed inference. On a single host, we project the model can be served at $0.30-$0.49/Mtok (3:1 blended).
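The "3:1 blended" figure weights input and output token prices by a 3:1 usage mix (three input tokens processed for every output token generated). A minimal sketch of that arithmetic, using hypothetical per-Mtok prices for illustration only:

```python
def blended_cost(input_per_mtok: float, output_per_mtok: float,
                 input_ratio: int = 3, output_ratio: int = 1) -> float:
    """Weighted-average price per 1M tokens for a given input:output mix."""
    total = input_ratio + output_ratio
    return (input_ratio * input_per_mtok + output_ratio * output_per_mtok) / total

# Hypothetical prices (illustration only, not quoted rates):
# $0.10 per 1M input tokens, $0.40 per 1M output tokens
print(blended_cost(0.10, 0.40))  # (3*0.10 + 1*0.40) / 4 ≈ $0.175/Mtok
```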



Llama Protections

Making safety tools accessible to everyone.
Enabling developers, advancing safety, and building an open ecosystem.

Stay up-to-date

Our latest updates delivered to your inbox

Subscribe to our newsletter to keep up with the latest Llama updates, releases and more.