Code Llama, a state-of-the-art large language model for coding
Code Llama is built on top of Llama 2 and is available in three models:
- Code Llama
- Code Llama Python
- Code Llama Instruct
With each model download you'll receive:
- All Code Llama models
- README (User Guide)
- Responsible Use Guide
- License
- Acceptable Use Policy
- Model Card
How Code Llama works
Code Llama supports many of the most popular programming languages in use today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, and Bash.
Inside the model
Code Llama comes in four sizes (7B, 13B, 34B, and 70B parameters) to address different serving and latency requirements. The 7B model, for example, can be served on a single GPU. The 34B and 70B models return the best results and allow for better coding assistance, but the smaller 7B and 13B models are faster and more suitable for tasks that require low latency, like real-time code completion.
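The size/latency trade-off above can be sketched as a simple selection rule. This is an illustrative helper, not part of any official tooling; the model identifiers follow Hugging Face naming conventions and are assumptions here.

```python
def pick_code_llama(latency_sensitive: bool, single_gpu_only: bool) -> str:
    """Pick a Code Llama size for a deployment scenario (illustrative only)."""
    if latency_sensitive:
        # 7B and 13B respond fastest -- suited to real-time completion.
        return "codellama/CodeLlama-7b-hf"
    if single_gpu_only:
        # The 7B model fits on a single GPU when memory is the constraint.
        return "codellama/CodeLlama-7b-hf"
    # 34B/70B give the best results when latency is less critical.
    return "codellama/CodeLlama-70b-hf"
```

In practice the choice also depends on available hardware and whether the workload is interactive or batch.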
Code Llama
Aside from being a prerequisite for generating longer programs, having longer input sequences unlocks exciting new use cases for a code LLM. For example, users can provide the model with more context from their codebase to make the generations more relevant. It also helps in debugging scenarios in larger codebases, where staying on top of all code related to a concrete issue can be challenging for developers. When debugging a large chunk of code, developers can pass the entire snippet into the model.
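The long-context use case above amounts to packing relevant source files into one prompt. Here is a minimal sketch, assuming a token budget and approximating tokens with a rough chars-per-token heuristic; the function name and budget are illustrative, not part of any Code Llama API.

```python
def build_repo_prompt(files: dict, question: str, max_tokens: int = 16_000) -> str:
    """Concatenate as many source files as fit, then append the question."""
    budget = max_tokens * 4  # rough chars-per-token estimate (assumption)
    parts = []
    for path, source in files.items():
        chunk = f"# File: {path}\n{source}\n"
        if budget - len(chunk) < len(question):
            break  # stop before overflowing the context window
        parts.append(chunk)
        budget -= len(chunk)
    parts.append(question)
    return "\n".join(parts)
```

A real implementation would use the model's tokenizer for an exact count and rank files by relevance rather than taking them in order.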
Code Llama Python
Code Llama Python is a language-specialized variation of Code Llama, further fine-tuned on 100B tokens of Python code. Because Python is the most benchmarked language for code generation, a specialized model can provide additional utility for Python-heavy workflows.
Code Llama Instruct
Code Llama Instruct is an instruction fine-tuned and aligned variation of Code Llama. Instruction tuning continues the training process, but with a different objective. The model is fed a “natural language instruction” input and the expected output. This makes it better at understanding what humans expect out of their prompts. We recommend using Code Llama Instruct variants whenever using Code Llama for code generation since Code Llama Instruct has been fine-tuned to generate helpful and safe answers in natural language.
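The instruction-following behavior described above depends on wrapping prompts in the expected template. Below is a minimal sketch, assuming Code Llama Instruct follows the Llama 2 chat convention of `[INST] ... [/INST]` tags with an optional `<<SYS>>` system block; check the model card for the exact format before relying on it.

```python
def format_instruct_prompt(instruction: str, system: str = None) -> str:
    """Wrap a natural language instruction in Llama-2-style chat tags."""
    if system:
        # Optional system message steers tone and constraints.
        return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"
    return f"[INST] {instruction} [/INST]"
```

For example, `format_instruct_prompt("Write a function that reverses a string.")` produces a prompt the Instruct variant is tuned to answer helpfully.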
Evaluating Code Llama’s performance
Our benchmark testing showed that Code Llama performed better than open-source, code-specific LLMs and outperformed Llama 2. Code Llama 70B Instruct, for example, scored 67.8% on HumanEval and 62.2% on MBPP, the highest scores among state-of-the-art open solutions and on par with ChatGPT.
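Benchmarks like HumanEval and MBPP measure functional correctness: a completion counts as passing if it executes the task's unit tests without error. The sketch below illustrates that idea in miniature; real harnesses sandbox execution and sample multiple completions per task.

```python
def passes_tests(candidate_code: str, test_code: str) -> bool:
    """Run the benchmark's assertions against a candidate solution."""
    namespace = {}
    try:
        exec(candidate_code, namespace)   # define the candidate solution
        exec(test_code, namespace)        # run the task's unit tests
        return True
    except Exception:
        return False

def pass_at_1(samples: list) -> float:
    """Fraction of (candidate, tests) pairs that pass: pass@1 with n=1."""
    return sum(passes_tests(c, t) for c, t in samples) / len(samples)
```

Executing untrusted model output with `exec` is unsafe outside an isolated environment; this is purely to show the shape of the metric.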
Details about our red teaming efforts from domain experts in responsible AI, offensive security engineering, malware development, and software engineering are available in our research paper.
Releasing Code Llama
Programmers are already using LLMs to assist in a variety of tasks, ranging from writing new software to debugging existing code. The goal is to make developer workflows more efficient, so they can focus on the most human-centric aspects of their jobs, rather than repetitive tasks.
At Meta, we believe that AI models, and LLMs for coding in particular, benefit most from an open approach, both in terms of innovation and safety. Publicly available, code-specific models can facilitate the development of new technologies that improve people's lives. By releasing code models like Code Llama, the entire community can evaluate their capabilities, identify issues, and fix vulnerabilities.
Code Llama’s training recipes are available in our GitHub repository, and model weights are also available.
Responsible use
We’ve also updated our Responsible Use Guide, which provides guidance on developing downstream models responsibly, including:
- Defining content policies and mitigations.
- Preparing data.
- Fine-tuning the model.
- Evaluating and improving performance.
- Addressing input- and output-level risks.
- Building transparency and reporting mechanisms in user interactions.
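One of the mitigations listed above, addressing input- and output-level risks, can be as simple as screening generated text before it reaches users. This is a toy sketch; the deny-list patterns are placeholders, not a real content policy, and production systems use far more sophisticated classifiers.

```python
import re

DENYLIST = [r"rm\s+-rf\s+/", r"DROP\s+TABLE"]  # illustrative patterns only

def screen_output(generated: str):
    """Return (allowed, text); block output matching a deny-list pattern."""
    for pattern in DENYLIST:
        if re.search(pattern, generated, flags=re.IGNORECASE):
            return False, "[blocked by content policy]"
    return True, generated
```

Pairing a filter like this with logging supports the transparency and reporting mechanisms the guide recommends.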
The future of generative AI for coding
We hope that Code Llama will inspire others to leverage Llama 2 to create new innovative tools for research and commercial products.
Resources
Explore more on Code Llama
Discover more about Code Llama here: visit our resources, including our research paper, getting started guide, and more.