Meta


Llama

The open-source AI models you can fine-tune, distill and deploy anywhere. Choose from our collection of models: Llama 3.1, Llama 3.2, Llama 3.3.

See how Llama is the leading open source model family


LlamaCon

Save the date for an exclusive event exploring the possibilities and potential of Llama.
Date: April 29, 2025

Latest models

Llama includes multilingual text-only models (1B, 3B), including quantized versions; text-image models (11B, 90B); and the Llama 3.3 70B model, which offers performance similar to the Llama 3.1 405B model. This lets developers achieve greater quality and performance on text-based applications at a fraction of the cost.

Llama 3.1 (multilingual)

• 8B: Lightweight, ultra-fast model you can run anywhere.
• 405B: Flagship foundation model driving the widest variety of use cases.

Llama 3.2 (lightweight and multimodal)

• 1B and 3B: Lightweight, efficient models you can run everywhere, on mobile and on edge devices.
• 11B and 90B: Multimodal models that are flexible and can reason on high-resolution images.

Llama 3.3 (multilingual)

• 70B: Experience leading performance and quality at a fraction of the cost with our latest release.

Download our flagship foundation 405B model

Download models

Do more with Llama 3.2

Develop high-performance, efficient applications from our latest release.

On-device
Use our 1B or 3B models for on-device applications such as summarizing a discussion from your phone or calling on-device tools like calendar.

Multimodal
Use our 11B or 90B models for image use cases such as transforming an existing image into something new or getting more information from an image of your surroundings.

Llama Stack
Seamlessly build agentic applications from a comprehensive toolchain.

Llama Stack: a streamlined developer experience

Build faster, deploy anywhere and get the most out of the latest Llama models on day 1.

For developers

Best practices included

Optimized support for agentic tool calling, safety guardrails, inference and much more, significantly lowering development costs.

Develop in your preferred language

Choose from Python, Node, Kotlin, and Swift to quickly build your applications.

Develop & deploy anywhere

With a common API, choose any distribution and deploy on-prem, locally hosted, or even on-device at the edge.

For partners & distributors

A standard API

A standard API requires fewer model-level changes across versions, accelerating time to market for new models and lowering engineering investment.

Interoperability with the ecosystem

Leverage the fast-moving Llama ecosystem by building on a common API, and incorporate new components faster.

Support for agentic components

Llama Stack releases natively support tool calling, safety guardrails, retrieval augmented generation, an inference loop and other agentic functionality.

Model evaluations

We evaluated performance on over 150 benchmark datasets that span a wide range of languages. For the vision LLMs, we evaluated performance on benchmarks for image understanding and visual reasoning. In addition, we performed extensive human evaluations that compare Llama with competing models in real-world scenarios.


Instruction-tuned benchmarks

Llama 3.3 instruction-tuned benchmarks

| Category | Benchmark | Llama 3.1 70B | Llama 3.3 70B | Amazon Nova Pro | Llama 3.1 405B | Gemini Pro 1.5 | GPT-4o | Claude 3.5 Sonnet |
|---|---|---|---|---|---|---|---|---|
| General | MMLU Chat (0-shot, CoT) | 86.0 | 86.0 | 85.9 | 88.6 | 87.1 | 87.5 | 88.9 |
| General | MMLU PRO (5-shot, CoT) | 66.4 | 68.9 | - | 73.4 | 76.1 | 73.8 | 77.8 |
| Instruction Following | IFEval | 87.5 | 92.1 | 92.1 | 88.6 | 81.9 | 84.6 | 89.3 |
| Code | HumanEval (0-shot) | 80.5 | 88.4 | 89.0 | 89.0 | 89.0 | 86.0 | 93.7 |
| Code | MBPP EvalPlus (base) (0-shot) | 86.0 | 87.6 | - | 88.6 | 87.8 | 83.9 | 86.8 |
| Math | MATH (0-shot, CoT) | 67.8 | 77.0 | 76.6 | 73.9 | 82.9 | 76.9 | 78.3 |
| Reasoning | GPQA Diamond (0-shot, CoT) | 48.0 | 50.5 | - | 49.0 | 53.5 | 47.5 | 65.0 |
| Tool use | BFCL v2 (0-shot) | 77.5 | 77.3 | - | 81.1 | 80.3 | 74.0 | 79.3 |
| Long context | NIH/Multi-needle | 97.5 | 97.5 | - | 98.1 | 94.7 | - | 99.4 |
| Multilingual | Multilingual MGSM (0-shot) | 86.9 | 91.1 | - | 91.6 | 89.6 | 90.6 | 92.8 |
| Pricing* | 1M input tokens (cheapest among providers)* | $0.1 | $0.1 | $0.80 | $1.0 | $1.30 | $2.5 | $3.0 |
| Pricing* | 1M output tokens (cheapest among providers)* | $0.4 | $0.4 | $3.20 | $1.8 | $5.0 | $10.0 | $15.0 |

* API Pricing based on publicly available data on Artificial Analysis as of 12/3/24.
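
To make the "fraction of the cost" comparison concrete, the following is a minimal sketch that estimates the cost of a workload from the per-1M-token prices listed in the pricing rows above. The function name `workload_cost` and the example workload (10M input tokens, 2M output tokens) are assumptions chosen for illustration; they do not come from the benchmark data, and actual provider pricing may differ from the figures quoted here.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing rows above.
PRICES_PER_MILLION = {
    "Llama 3.3 70B": (0.1, 0.4),
    "Llama 3.1 405B": (1.0, 1.8),
    "GPT-4o": (2.5, 10.0),
    "Claude 3.5 Sonnet": (3.0, 15.0),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given number of tokens."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    for name in PRICES_PER_MILLION:
        cost = workload_cost(name, 10_000_000, 2_000_000)
        print(f"{name}: ${cost:.2f}")
```

Under these assumptions, the input and output prices are weighted by the token counts, so workloads that generate a lot of output are more sensitive to the output price column.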

Leading with open source

Llama models have been downloaded over 600 million times on Hugging Face alone, making Llama the leading open source model family. Our partner ecosystem is helping build on this momentum by offering services through our Llama Stack, so anyone can build fast with Llama. And with this release of Llama 3.2, even more use cases can be supported.

600M downloads on Hugging Face to date

10x growth since 2023


Partners enabling Llama

ARM, MediaTek and Qualcomm now allow you to run our lightweight models on your mobile or on-edge devices for the most capable “local agentic systems”. Dell is also offering its distribution with Llama Stack to help developers integrate their tool capabilities more seamlessly.

“Llama Stack represents a significant leap in simplifying and standardizing the application of AI within enterprises across various use cases. With Llama Stack integrated into Dell AI Factory, we're setting the stage for widespread adoption of open models on-premises.”

Ihab Tarazi, CTO, Dell Technologies

Community stories

Learn how partners across the community are putting Llama to use in real life.


Data privacy

AI Companion, a generative AI assistant leveraging Zoom's LLM built on Llama 2 and third-party models, enhances productivity and collaboration through chat, email and meeting summaries, with data privacy and AI control.
Zoom

Productivity

DoorDash uses Llama to streamline and accelerate daily tasks, such as leveraging its internal knowledge base to answer complex questions for the team and delivering actionable pull request reviews to improve its codebase.
DoorDash

Contextual Understanding

The creator of Pokémon GO launched their AR-first game Peridot, which uses Llama 2 to generate environment-specific reactions and animations based on what the pet characters are interacting with and seeing in the real world.
Niantic

Solving business needs

KPMG leveraged Llama for multiple use cases across industries. They helped a US bank's wholesale credit team explore secure open-source LLM options to enable faster, more efficient review of complex loan applications and position the team to take automation to the next level.
KPMG

Latest Llama updates

• Introducing Multimodal Llama 3.2 with Amit Sangani
• Llama developer video: Knowledge distillation with 405B
• Meta Connect 2024 AI session videos
• Llama 3.2 1B and 3B models now quantized
