How-to guides

A collection of hands-on guides for developing with Llama models.

Develop with Llama

This collection of guides provides a comprehensive overview of best practices, techniques, and tools for developing with Llama. Learn how to optimize, evaluate, and responsibly deploy Llama-based large language models, covering topics from prompt engineering and fine-tuning to quantization, distillation, validation, and ethical considerations.


Prompt engineering

Prompt engineering is one of the easiest ways for developers to improve a model's output quality and relevance without retraining or fine-tuning the model.

Master techniques like few-shot, chain-of-thought, and role-based prompting to get more accurate, relevant, and creative outputs from your models.

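For example, a few-shot, chain-of-thought prompt can be expressed directly as chat messages. The sketch below uses Hugging Face Transformers; the checkpoint name and the arithmetic task are illustrative assumptions, and any instruction-tuned Llama model works the same way.

```python
# A minimal sketch of few-shot + chain-of-thought + role-based prompting with
# Hugging Face Transformers. The model ID and the example task are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

messages = [
    # Role-based prompting: the system turn fixes persona and output format.
    {"role": "system",
     "content": "You are a careful math tutor. Think step by step, then give "
                "the final answer on its own line prefixed with 'Answer:'."},
    # Few-shot example: one worked question/answer pair showing the reasoning style.
    {"role": "user", "content": "A shirt costs $20 and is discounted 25%. What is the price?"},
    {"role": "assistant",
     "content": "The discount is 25% of $20, which is $5. $20 - $5 = $15.\nAnswer: $15"},
    # The actual query; the model imitates the chain-of-thought pattern above.
    {"role": "user", "content": "A book costs $40 and is discounted 15%. What is the price?"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```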

Fine-tuning

Fine-tuning allows you to customize Llama for your specific domain or task, dramatically improving accuracy, consistency, and efficiency.

Learn when and how to use full-parameter fine-tuning, LoRA, QLoRA, and RLHF, and discover the best libraries and workflows for custom model training.

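As a rough illustration of the parameter-efficient route, the sketch below attaches LoRA adapters to a frozen Llama checkpoint with the PEFT library; the model ID, rank, and target modules are assumptions to tune for your task and hardware.

```python
# A minimal sketch of parameter-efficient fine-tuning with LoRA via PEFT.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint

base_model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

lora_config = LoraConfig(
    r=16,                     # adapter rank: lower = fewer trainable parameters
    lora_alpha=32,            # scaling factor applied to the adapter update
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

# Wrap the frozen base model with trainable low-rank adapters.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

# From here, train `model` with your usual loop or trainer (e.g. the Transformers
# Trainer or TRL's SFTTrainer) on instruction-formatted data, then persist only
# the small adapter weights:
# model.save_pretrained("llama-lora-adapter")
```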

Quantization

Quantization is a powerful optimization technique that enables Llama models to run faster and use less memory, making them suitable for edge devices and large-scale deployments.

Learn practical methods and tools for quantizing Llama models, and how to balance memory, speed, and accuracy for efficient deployment.

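One common starting point is weight-only 4-bit loading with bitsandbytes through Transformers, sketched below; the checkpoint name is an assumption, and the NF4 settings trade a small accuracy hit for roughly a 4x reduction in weight memory.

```python
# A minimal sketch of loading Llama in 4-bit with bitsandbytes
# (weight-only post-training quantization). The model ID is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4: good accuracy/size trade-off
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in 16-bit
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants too
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Rough check of the memory saved versus full-precision weights.
print(f"Weight footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
```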

Distillation

Model distillation is a key technique for transferring the capabilities of large Llama models into smaller, faster ones, making high-quality AI accessible in resource-constrained environments.

Discover how to generate synthetic data, apply advanced distillation signals, and evaluate student models for efficient, high-performance AI.

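The core training signal is often a soft-target loss that pushes the student toward the teacher's token distribution. The sketch below shows one generic formulation (temperature-scaled KL plus hard-label cross-entropy); it is an illustrative recipe, not a prescribed one, and assumes logits and labels are already aligned.

```python
# A minimal sketch of a logit-distillation loss: the student is trained to match
# a temperature-softened teacher distribution in addition to the usual
# next-token cross-entropy. Names and weighting are illustrative.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Combine soft-target KL against the teacher with hard-label cross-entropy.

    student_logits, teacher_logits: (batch, seq_len, vocab), already aligned.
    labels: (batch, seq_len) token ids, with -100 marking ignored positions.
    """
    # Soft targets: KL(teacher || student) on temperature-scaled distributions.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2

    # Hard targets: standard cross-entropy on the synthetic or reference labels.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    return alpha * kd + (1 - alpha) * ce
```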

Evaluations

Comprehensive evaluation strategies are essential for building robust Llama-powered applications that deliver consistent quality and safety.

Learn how to design datasets, choose metrics, and combine automated, model-based, and human evaluation to measure and improve model quality, safety, and user experience.

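Automated checks are the cheapest layer to start with. The sketch below scores exact-match accuracy over a JSONL eval set; the file format and the generate_answer callable are assumptions, and in practice you would layer task-specific metrics, model-based judging, and human review on top.

```python
# A minimal sketch of an automated evaluation loop: run a fixed eval set through
# your model and score exact-match accuracy. The dataset fields and the
# `generate_answer` callable are assumptions.
import json

def normalize(text: str) -> str:
    return " ".join(text.lower().strip().split())

def evaluate(eval_path: str, generate_answer) -> float:
    """Score exact-match accuracy over a JSONL file of {"prompt", "reference"} rows."""
    correct, total = 0, 0
    with open(eval_path) as f:
        for line in f:
            example = json.loads(line)
            prediction = generate_answer(example["prompt"])
            correct += normalize(prediction) == normalize(example["reference"])
            total += 1
    return correct / max(total, 1)

# Example usage with any callable that maps a prompt string to a completion:
# accuracy = evaluate("eval_set.jsonl", generate_answer=my_llama_client)
# print(f"exact match: {accuracy:.1%}")
```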

Validation

Rigorous validation is the foundation of reliable AI systems, ensuring your Llama models meet performance expectations before deployment.

Discover quantitative and qualitative techniques—including loss/perplexity tracking, cross-validation, and manual review—to ensure your fine-tuned Llama models perform reliably in real-world scenarios.

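A basic quantitative check is perplexity on held-out text. The sketch below computes it for a fine-tuned checkpoint with Transformers; the checkpoint path and the validation documents are placeholders.

```python
# A minimal sketch of held-out perplexity tracking for a fine-tuned checkpoint.
# The checkpoint path and validation texts are assumptions.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/to/your-finetuned-llama"  # assumed local checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto").eval()

validation_texts = [
    "Example held-out document one ...",
    "Example held-out document two ...",
]

total_loss, total_tokens = 0.0, 0
with torch.no_grad():
    for text in validation_texts:
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        # With labels == input_ids, the model returns mean next-token cross-entropy.
        out = model(**enc, labels=enc["input_ids"])
        n_tokens = enc["input_ids"].numel()
        total_loss += out.loss.item() * n_tokens
        total_tokens += n_tokens

perplexity = math.exp(total_loss / total_tokens)
print(f"validation perplexity: {perplexity:.2f}")
```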

Vision Capabilities

Llama’s multimodal models empower developers to build applications that understand both images and text, opening up new possibilities for natural, flexible AI interactions.

Learn how to leverage document OCR, visual reasoning, and creative image-based tasks using Llama 3.2's advanced vision features.

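A typical document-question flow sends an image and a text instruction through the same chat template. The sketch below uses a Llama 3.2 Vision instruct checkpoint via Transformers; the model ID and the local image file are assumptions.

```python
# A minimal sketch of image + text prompting with a Llama 3.2 Vision instruct
# checkpoint. The model ID and "invoice.png" are assumptions; the same pattern
# covers OCR-style document questions and general visual reasoning.
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

MODEL_ID = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed checkpoint

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = MllamaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("invoice.png")  # hypothetical local document image

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Extract the invoice number and total amount from this document."},
    ]}
]

prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```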

Responsible Use Guide

Responsible development with Llama requires a thoughtful approach to safety, ethics, and compliance, especially as AI systems become more widely adopted.

Explore best practices for data annotation, model alignment, safety evaluation, and transparency to ensure your applications meet ethical and regulatory standards.

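One concrete safety-evaluation step is screening user input with a Llama Guard checkpoint before it reaches your application model, sketched below; the guard model ID and the safe/unsafe parsing are assumptions based on the published checkpoints, and input screening should be combined with output screening and human review.

```python
# A minimal sketch of input screening with a Llama Guard checkpoint.
# The model ID and example usage are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

GUARD_ID = "meta-llama/Llama-Guard-3-8B"  # assumed safeguard checkpoint

tokenizer = AutoTokenizer.from_pretrained(GUARD_ID)
guard = AutoModelForCausalLM.from_pretrained(
    GUARD_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def is_safe(user_prompt: str) -> bool:
    """Return True if Llama Guard classifies the user turn as safe."""
    chat = [{"role": "user", "content": user_prompt}]
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(guard.device)
    output = guard.generate(input_ids, max_new_tokens=32, do_sample=False)
    verdict = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
    # Llama Guard answers "safe" or "unsafe" (plus violated category codes).
    return verdict.strip().lower().startswith("safe")

# Example gating logic:
# if not is_safe(user_prompt):
#     return a refusal instead of calling the application model
```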