Deploying Llama 3.2 1B/3B: Partner Guides

The Llama lightweight (1B/3B) models enable developers to bring Llama’s capabilities to mobile and embedded devices.

Meta is collaborating with the following partners to provide guidance and foundational software for running the Llama lightweight models on their device hardware. Browse their offerings below and follow the links provided to learn more.

Arm

Arm CPUs are the foundation for AI everywhere, delivering generative AI and traditional ML by harnessing the power of the Llama 3.2 1B and 3B models across cloud, mobile, and edge devices. By using Arm Kleidi technologies to implement Llama on Arm Cortex and Arm Neoverse CPUs, we are enabling developers to create novel use cases that deliver efficient, performant AI across the breadth of devices built on Arm.

Arm Kleidi technologies unlock unprecedented out-of-the-box performance for running LLMs everywhere from cloud to edge, enabling acceleration for Llama 3.2 through library integration into AI frameworks.

For mobile and edge ecosystem developers, Llama 3.2 runs efficiently on Arm Cortex CPU-based devices. See our documentation for developer resources.

Developers can run Llama 3.2 in the cloud on Arm Neoverse CPUs, which are available from all the major cloud service providers. See our documentation for getting started, and visit Arm's Hugging Face page.

MediaTek

MediaTek has collaborated with Meta to support on-device inference through ExecuTorch APIs, bringing the convenience of open-source, fast on-device prototyping to our developer community. Browse our ExecuTorch GitHub page for more information.

Developers can port Llama models to GenAI-enabled MediaTek products using the MediaTek NeuroPilot LLM toolkit. The toolkit supports up to 4-bit quantization, LoRA fine-tuning, advanced graph and cache optimizations, and accelerated decoding techniques that promise best-in-class inference efficiency without noticeable loss of accuracy.
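As a rough illustration of what 4-bit quantization involves (a hedged sketch of symmetric group-wise INT4 quantization in general, not the NeuroPilot toolkit's actual implementation): each group of weights shares one floating-point scale, and every weight is rounded to an integer in [-8, 7].

```python
import numpy as np

def quantize_int4(weights, group_size=32):
    """Symmetric group-wise 4-bit quantization: one float scale per
    group of `group_size` weights; values map to integers in [-8, 7]."""
    flat = weights.reshape(-1, group_size)
    scales = np.abs(flat).max(axis=1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(flat / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_int4(q, scales, shape):
    """Reconstruct approximate float weights from codes and scales."""
    return (q.astype(np.float32) * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s, w.shape)
```

With a group size of 32, storage is 4 bits per weight plus one shared scale per group, roughly a 4x reduction versus 16-bit weights, which is what makes the 1B/3B models practical on memory-constrained mobile hardware.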

Qualcomm

Qualcomm Technologies, Inc., and Meta share a long-term partnership to support Llama models, including the latest Llama 3.2, running directly on-device. This capability allows developers to save on cloud costs and offer users private, reliable, and personalized experiences on smartphones, PCs, XR headsets, IoT, and automotive. This innovative partnership continuously unlocks new possibilities for on-device AI applications, enabling faster processing, reduced latency, and enhanced efficiency. Visit the Qualcomm AI Hub or download Ollama to learn more and start deploying Llama models on the edge today.