This page describes the prompt format for Llama 3.1 with an emphasis on new features in that release.
Please leverage this guidance in order to take full advantage of the new Llama models. Although prompts designed for Llama 3 should work unchanged in Llama 3.1 and Llama 3.2, we recommend that you update your prompts to the new format to obtain the best results.
Tokens
Description
<|begin_of_text|>
Specifies the start of the prompt.
<|end_of_text|>
Model will cease to generate more tokens. This token is generated only by the base models.
<|finetune_right_pad_id|>
This token is used for padding text sequences to the same length in a batch.
<|start_header_id|>
<|end_header_id|>
[system, user, assistant, and ipython]
<|eom_id|>
Environment: ipython
instruction is used in the system prompt, or if the model calls for a built-in tool.<|eot_id|>
End of turn. Represents when the model has determined that it has finished interacting with the user message that initiated its response. This is used in two scenarios:
This token signals to the executor that the model has finished generating a response.
<|python_tag|>
Special tag used in the model’s response to signify a tool call.
There are 4 different roles that are supported by Llama text models:
system
: Sets the context in which to interact with the AI model. It typically includes rules, guidelines, or necessary information that help the model respond effectively.user
: Represents the human interacting with the model. It includes the inputs, commands, and questions to the model.ipython
: A new role introduced in Llama 3.1. Semantically, this role means "tool". This role is used to mark messages with the output of a tool call when sent back to the model from the executor.assistant
: Represents the response generated by the AI model based on the context provided in the system
, ipython
and user
prompts.[system, assistant, user, ipython]
This is the simplest way to prompt llama-based models, starting with text and allowing the model to generate new content based on the user-provided input.
<|begin_of_text|>{{ user_message}}
The format for a regular multi-turn conversation between a user and the model has not changed between Llama 3 and Llama 3.1. Here is a simple example of how it's formatted.
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>
What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
text_prompt_format.md
in the meta-llama
GitHub repository.Environment: ipython
is included in the system
prompt.<|python_tag|>
and terminated with an <|eom_id|>
tag.<|eom_id|>
indicates a continued multi-step reasoning. That is, the model is expecting a continuation message with the output of the tool call. Sample Input:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Environment: ipython<|eot_id|><|start_header_id|>user<|end_header_id|>
Write code to check if number is prime, use that to see if the number 7 is prime<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Response:
<|python_tag|>def is_prime(n):
if n <= 1:
return False
for i in range(2, int(n**0.5) + 1):
if n % i == 0:
return False
return True
print(is_prime(7)) # Output: True<|eom_id|>
llama-stack-apps
, the results of the code are passed back to the model for further processing.Tool-calling with the larger models can be done in three ways:
Defining multiple tools in the system and/or user prompt can cause the model to generate multiple tool calls as a result of the user query. Developers must iterate on the prompts required to get the desired result and parse the result accordingly.
To better illustrate the flows below, this diagram shows the flow of information described:
The format for a regular multi-turn conversation between a user and the model has not changed between Llama 3 and Llama 3.1. Here is a simple example of how it's formatted.
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>
What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The models are trained to use two built-in tools:
These can be turned on using the system prompt:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
You are a helpful assistant<|eot_id|>
<|start_header_id|>user<|end_header_id|>
What is the current weather in Menlo Park, California?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
A few things to note:
# for Search
<|python_tag|>
brave_search.call(query="...")
<|eom_id|>
# for Wolfram
<|python_tag|>
wolfram_alpha.call(query="...")
<|eom_id|>
wolfram_alpha
in-built tool. (Please note: The model does not execute the call to the tool; you need to use our llama-agentic-system, or other similar framework, to leverage the tool-calling capabilities.) <|begin_of_text|><|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
Can you help me solve this equation: x^3 - 4x^2 + 6x - 24 = 0<|eot_id|><|start_header_id|>assistant<|end_header_id|>
<|python_tag|>wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")<|eom_id|>
{
"queryresult": {
"success": true,
"inputstring": "solve x^3 - 4x^2 + 6x - 24 = 0",
"pods": [
{
"title": "Input interpretation",
"subpods": [
{
"title": "",
"plaintext": "solve x^3 - 4 x^2 + 6 x - 24 = 0"
}
]
},
{
"title": "Results",
"primary": true,
"subpods": [
{
"title": "",
"plaintext": "x = 4"
},
{
"title": "",
"plaintext": "x = ± (i sqrt(6))"
}
]
},
...
]
}
}
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
Can you help me solve this equation: x^3 - 4x^2 + 6x - 24 = 0<|eot_id|><|start_header_id|>assistant<|end_header_id|>
<|python_tag|>wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")<|eom_id|><|start_header_id|>ipython<|end_header_id|>
{"queryresult": {"success": true, "inputstring": "solve x^3 - 4x^2 + 6x - 24 = 0", "pods": [{"title": "Input interpretation", "subpods": [{"title": "", "plaintext": "solve x^3 - 4 x^2 + 6 x - 24 = 0"}]}, {"title": "Results", "primary": true, "subpods": [{"title": "", "plaintext": "x = 4"}, {"title": "", "plaintext": "x = \u00b1 (i sqrt(6))"}]}, ... ]}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The solutions to the equation x^3 - 4x^2 + 6x - 24 = 0 are x = 4 and x = ±(i√6).<|eot_id|>
eom_id
, the following instruction needs to be added to the system prompt: Environment: ipython
. Else, it should output eot_id
.<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
When you receive a tool call response, use the output to format an answer to the orginal user question.
You are a helpful assistant with tool calling capabilities.<|eot_id|><|start_header_id|>user<|end_header_id|>
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{
"type": "function",
"function": {
"name": "get_current_conditions",
"description": "Get the current weather conditions for a specific location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g., San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["Celsius", "Fahrenheit"],
"description": "The temperature unit to use. Infer this from the user's location."
}
},
"required": ["location", "unit"]
}
}
}
Question: what is the weather like in San Fransisco?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Environment: ipython
instruction in the system prompt, the model ends generation with the eot_id
token, not the eom_id
token{"name": "get_current_conditions", "parameters": {"location": "San Francisco, CA", "unit": "Fahrenheit"}}<eot_id>
ipython
, in the header.<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the orginal use question.<|eot_id|><|start_header_id|>user<|end_header_id|>
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{
"type": "function",
"function": {
"name": "get_current_conditions",
"description": "Get the current weather conditions for a specific location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g., San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["Celsius", "Fahrenheit"],
"description": "The temperature unit to use. Infer this from the user's location."
}
},
"required": ["location", "unit"]
}
}
}
Question: what is the weather like in San Fransisco?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{"name": "get_current_conditions", "parameters": {"location": "San Francisco, CA", "unit": "Fahrenheit"}}<|eot_id|><|start_header_id|>ipython<|end_header_id|>
{"output": "Clouds giving way to sun Hi: 76° Tonight: Mainly clear early, then areas of low clouds forming Lo: 56°"}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The weather in Menlo Park is currently cloudy with a high of 76° and a low of 56°, with clear skies expected tonight.<eot_id>
The following prompts provide an example of how to define your own format in the system prompt and make use of it while interacting with the model. This format will use a new and custom <function> tag to enclose a JSON object with the function parameters.
eom_id
token because of the Environment: ipython
instruction.<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available else fallback to brave_search
You have access to the following functions:
Use the function 'spotify_trending_songs' to: Get top trending songs on Spotify
{
"name": "spotify_trending_songs",
"description": "Get top trending songs on Spotify",
"parameters": {
"n": {
"param_type": "int",
"description": "Number of trending songs to get",
"required": true
}
}
}
If a you choose to call a function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where
start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`
Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query
You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
Can you check the top 5 trending songs on spotify?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Now the model responds in accordance with the format defined in the system prompt, with the function name in the start tag, and the parameters, as a JSON object, in between the start and end tags.
Environment: ipython
instruction in the system prompt, the model ends generation with the eom_id
token.<function=spotify_trending_songs>{"n": "5"}</function><eom_id>
ipython
, in the header.<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024
# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available else fallback to brave_search
You have access to the following functions:
Use the function 'spotify_trending_songs' to: Get top trending songs on Spotify
{"name": "spotify_trending_songs", "description": "Get top trending songs on Spotify", "parameters": {"n": {"param_type": "int", "description": "Number of trending songs to get", "required": true}}}
If a you choose to call a function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where
start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`
Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query
You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
Can you check the top 5 trending songs on spotify?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
<function=spotify_trending_songs>{"n": "5"}</function><|eom_id|><|start_header_id|>ipython<|end_header_id|>
["1. BIRDS OF A FEATHER by Billie Eilish", "2. Espresso by Sabrina Carpenter", "3. Please Please Please by Sabrina Carpenter", "4. Not Like Us by Kendrick Lamar", "5. Gata Only by FloyyMenor, Cris Mj"]<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The top 5 trending songs on Spotify are:
1. BIRDS OF A FEATHER by Billie Eilish
2. Espresso by Sabrina Carpenter
3. Please Please Please by Sabrina Carpenter
4. Not Like Us by Kendrick Lamar
5. Gata Only by FloyyMenor, Cris Mj<|eot_id|>