If you are a researcher, academic institution, government agency, government partner, or other entity with a Llama use case that is currently prohibited by the Llama Community License or Acceptable Use Policy, or requires additional clarification, please contact llamamodels@meta.com with a detailed request. We will consider requests on a case-by-case basis.
For Llama 2 and Llama 3, it's correct that the license restricts using any part of the Llama models, including the response outputs, to train another AI model (LLM or otherwise). For Llama 3.1, however, this is allowed provided you, as the developer, provide the correct attribution. See the license for more information.
For Llama 2 and Llama 3, the models were primarily trained on English, with some additional data from other languages. We do not expect the same level of performance in these languages as in English. Llama 3.1, however, supports additional languages and is considered multilingual. See the Llama 3.1 model card for more information.
Details on how to access the models are available on our website. Please note that the models are made available subject to the applicable Community License Agreement and Acceptable Use Policy. You should also take advantage of the best practices and considerations set forth in the applicable Responsible Use Guide.
Models are available through multiple sources but the place to start is at https://llama.meta.com/.
Llama models are broadly available to developers and licensees through a variety of hosting providers and on the Meta website and licensed under the applicable Llama Community License Agreement, which provides a permissive license to the models along with certain restrictions to help ensure that the models are being used responsibly. Hosting providers may have additional terms applicable to their services.
Hardware requirements vary based on latency, throughput, and cost constraints. For good latency, we split models across multiple GPUs with tensor parallelism in a machine with NVIDIA A100s or H100s. But TPUs, other types of GPUs, or even commodity hardware can also be used to deploy these models (e.g., llama.cpp, MLC LLM).
Llama models are auto-regressive language models, built on the transformer architecture. The core language models function by taking a sequence of words as input and predicting the next word, recursively generating text.
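The auto-regressive loop described above can be sketched with a toy stand-in for the model. This is not Llama itself; the hard-coded bigram table, the `toy_next_token` helper, and the `generate` function are illustrative assumptions. Only the loop structure — predict the next token from the sequence so far, append it, repeat — reflects how these models generate text.

```python
# Minimal sketch of auto-regressive generation. A hard-coded bigram
# table stands in for a real language model; the loop is the point.

def toy_next_token(sequence):
    """Stand-in for a language model: maps the last token to a next token."""
    bigrams = {"the": "cat", "cat": "sat", "sat": "down"}
    return bigrams.get(sequence[-1])

def generate(prompt_tokens, max_new_tokens=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = toy_next_token(tokens)
        if nxt is None:  # no prediction left; a real model emits an end-of-sequence token
            break
        tokens.append(nxt)  # the new token becomes part of the next input
    return tokens

print(generate(["the"]))  # ['the', 'cat', 'sat', 'down']
```

A real deployment replaces `toy_next_token` with a forward pass over the transformer, which produces a probability distribution over the vocabulary rather than a single fixed continuation.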
The model itself supports these parameters, but whether they are exposed depends on the implementation.
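As one example of such a parameter, temperature rescales the model's output logits before sampling: low values concentrate probability on the most likely token, high values flatten the distribution. The sketch below, with an assumed `sample_with_temperature` helper and made-up logits, shows the mechanism only; real implementations apply it to the model's full vocabulary distribution.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, seed=None):
    """Sample a token index after temperature-scaling the logits."""
    rng = random.Random(seed)
    # Divide logits by the temperature, then apply a numerically
    # stable softmax to turn them into probabilities.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw from the resulting categorical distribution.
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r <= cumulative:
            return i
    return len(probs) - 1

# With a very low temperature, sampling is effectively greedy:
print(sample_with_temperature([1.0, 5.0, 2.0], temperature=0.01, seed=0))  # 1
```

Whether an application lets users set this knob (or related ones like top-p) is up to the serving layer, which is why exposure varies between implementations.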
This depends on your application. The Llama pre-trained models were trained for general large language model applications, whereas the Llama instruct or chat models were fine-tuned for dialogue-specific uses, such as chatbots.
Developers may fine-tune Llama models for languages beyond English or other officially supported languages (e.g. Llama 3.1 supports 8 languages) provided they comply with the applicable Llama Community License Agreement and the Acceptable Use Policy.
“MAU” means “monthly active users” that access or use your (and your affiliates’) products and services. Examples include users accessing an internet-based service and monthly users/customers of licensee’s hardware devices.
No, such companies are not prohibited when their usage of Llama is not related to the operation of critical infrastructure. Llama, however, may not be used in the operation of critical infrastructure by any company, regardless of government certifications.