Resources

Meta and Community Resources

A collection of Llama resources, from videos to cookbooks.

If you have any feature requests, suggestions, or bugs to report, we encourage you to file an issue in the respective GitHub repository.

Note: Some of these resources refer to earlier versions of Llama. However, the concepts and ideas described are still relevant to the most recent version.

Meta Resources

Recipes
For our full list, check out the Cookbook page.
Llama 4 Cookbook
A guide for building with Llama 4.
Llama Stack Cookbook
A guide for building on Llama Stack.
Llama on Hugging Face
The Llama Hugging Face repo.
Llama 3 Cookbook
A guide for building with Llama 3.
How-to guides
Fine-tuning
Full-parameter fine-tuning updates all the parameters of every layer of the pre-trained model.
Quantization
Learn how quantization makes models more efficient for deployment on servers and edge devices.
Prompting
Improve the performance of the language model by providing it with more context and information about the task at hand.
Validation
Learn different ways to measure and ultimately validate Llama.
Vision Capabilities
Interact with models in new ways.
Developer Use Guide: AI Protections
Build Llama applications responsibly.
Integration Guides
LangChain
Learn to build with this popular open source framework.
LlamaIndex
Learn to build with this popular open source framework.
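The quantization guide above covers making models efficient for deployment. As a rough, purely illustrative sketch of the core idea (not Llama-specific, and much simpler than real toolchains such as bitsandbytes or GPTQ), here is symmetric per-tensor int8 weight quantization in NumPy:

```python
# Minimal sketch of symmetric per-tensor int8 quantization, the basic
# idea behind shrinking model weights for servers and edge devices.
# Illustrative only -- real quantization pipelines are far more sophisticated.
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 using a single scale factor."""
    scale = np.abs(w).max() / 127.0  # the largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 values."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32; the per-weight reconstruction
# error is bounded by half a quantization step (scale / 2).
max_err = np.abs(w - w_hat).max()
```

Real deployments typically quantize per channel or per group rather than per tensor, which tightens the error bound considerably.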

Videos

Check out our video page to watch tutorials on Llama.

Community Resources

Get Started with Llama
Use these cookbooks to get your journey with Llama started.
Building LLM applications for production
Start building apps with this guide.
Fine-tuning
Discover cookbook examples, datasets, and more to help you jump-start model fine-tuning.
Hugging Face PEFT
A repo for parameter-efficient fine-tuning (PEFT) on Llama.
Efficient fine-tuning with LoRA
Databricks blog on efficient fine-tuning with LoRA.
Weights & Biases training and fine-tuning large language models
A course on fine-tuning LLMs.
End-to-end fine-tuning with torchtune
PyTorch's native post-training library.
Fine-tuning comparison (Llama vs. GPT)
The PyTorch GitHub library.
How to fine-tune Llama with LoRA for Question Answering
NVIDIA deep learning blog on fine-tuning Llama.
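Several of the resources above (Hugging Face PEFT, the Databricks and NVIDIA posts) center on LoRA. As a hedged numerical sketch of the idea — not any library's API — the pre-trained weight stays frozen while a low-rank update is trained:

```python
# Minimal sketch of the LoRA idea behind the fine-tuning resources above:
# keep the pre-trained weight W frozen and learn a low-rank update
# delta_W = (alpha / r) * B @ A with far fewer trainable parameters.
# Illustrative only; libraries like Hugging Face PEFT handle this for you.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init

def lora_forward(x):
    """y = W x + (alpha / r) * B A x -- only A and B are trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted layer starts out identical to
# the frozen base layer, so training begins from the pre-trained model.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters: r * (d_in + d_out) instead of d_in * d_out.
trainable = A.size + B.size   # 8*64 + 64*8 = 1024
full = W.size                 # 64*64 = 4096
```

At these (hypothetical) dimensions, LoRA trains a quarter of the parameters of full fine-tuning; at real model scale the ratio is usually well under one percent.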
Performance & Latency
Papers and blogs to help optimize performance and latency.
Optimizing and testing latency for LLMs
An exploration of ways to optimize for latency.
Improving LLM inference
How continuous batching enables 23x throughput in LLM inference while reducing p50 latency.
Improving performance of compressed LLMs with prompt engineering
A paper on improving the accuracy-efficiency trade-off of LLM inference.
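The continuous-batching post above explains why replacing finished requests immediately beats waiting for a whole batch to drain. A toy simulation (illustrative scheduling only, with made-up request lengths, not a real serving engine) makes the effect concrete:

```python
# Toy simulation of the continuous-batching idea from the post above:
# with static batching the whole batch waits for its longest sequence,
# while continuous batching refills a freed slot on the very next step.
from collections import deque

def static_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps

def continuous_steps(lengths, batch_size):
    """Decode steps when finished requests are replaced every step."""
    queue = deque(lengths)
    active = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
    steps = 0
    while active:
        steps += 1
        active = [t - 1 for t in active]        # one decode step for each slot
        active = [t for t in active if t > 0]   # drop finished requests
        while queue and len(active) < batch_size:
            active.append(queue.popleft())      # refill freed slots immediately
    return steps

# A mix of short and long requests is where continuous batching wins:
# short requests no longer wait behind the longest request in their batch.
lengths = [2, 50, 3, 50, 4, 50, 5, 50]
s = static_steps(lengths, batch_size=4)       # every batch costs 50 steps
c = continuous_steps(lengths, batch_size=4)   # far fewer total steps
```

The exact speedup depends on the length distribution; the 23x figure in the post comes from real serving workloads, not this toy model.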