Table Of Contents
Deployment (New)
Private cloud deployment
Production deployment pipelines
Infrastructure migration
Versioning
Accelerator management
Autoscaling
Regulated industry self-hosting
Security in production
Cost projection and optimization
Comparing costs
A/B testing
Getting the models
The pages in this section describe how to obtain the Llama models.
Note: Kaggle and our ecosystem partners are preparing support for Llama 4 Scout and Llama 4 Maverick. Please check these pages for the most recent information.
With Llama 3.1, we introduce the 405B model. This model requires significant storage and computational resources: it occupies approximately 750GB of disk storage space and necessitates two nodes on MP16 (model parallel 16) for inference.
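As a rough sanity check on figures like this, weight storage scales linearly with parameter count and bytes per parameter. A minimal sketch (the helper name and the 2-bytes-per-parameter bf16 assumption are illustrative, not from the docs; the exact on-disk figure also depends on checkpoint format and sharding overhead):

```python
def weight_storage_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate disk footprint of raw model weights in gigabytes (10^9 bytes).

    Assumes each parameter is stored in a fixed-width format,
    e.g. bf16 = 2 bytes per parameter (an assumption, not a spec).
    """
    return num_params * bytes_per_param / 1e9

# 405B parameters at 2 bytes each -> roughly 810 GB of raw weights,
# the same order of magnitude as the ~750GB figure quoted above.
print(weight_storage_gb(405e9))
```

This back-of-envelope estimate is useful when budgeting disk and GPU memory before downloading a checkpoint.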