Llama 3.1 is Meta's most capable model to date, with enhanced reasoning and coding capabilities, multilingual support, and an all-new reference system. Check out the following videos to see some of these new capabilities in action.
To prompt each Llama model correctly, please closely follow the formats described in the following sections. Keep in mind that, where indicated, newlines must be present in the prompt string sent to the tokenizer for encoding. For details on implementing code that builds correctly formatted prompts, please refer to the linked file for each model version.
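As an illustration of why the newlines matter, the sketch below assembles a Llama 3.x instruct-style prompt by hand. The special tokens (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`) follow the documented Llama 3 chat format; the helper function itself is illustrative, not part of any official API.

```python
def format_prompt(system: str, user: str) -> str:
    """Build a Llama 3.x instruct prompt.

    Note the two newlines required after each <|end_header_id|> token --
    omitting them changes the token sequence the model was trained on.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_prompt("You are a helpful assistant.", "What is the capital of France?")
print(prompt)
```

In practice you would let the tokenizer's chat template produce this string rather than formatting it yourself, but writing it out makes the required structure explicit.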
For many cases where an application is using a Hugging Face (HF) variant of the Llama 3 model, the upgrade path to Llama 3.1 should be straightforward.
Running the script without any arguments performs inference with the Llama 3 8B Instruct model. Passing the following parameter switches it to Llama 3.1:
--model-id "meta-llama/Meta-Llama-3.1-8B-Instruct"
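A script with this upgrade path might expose the checkpoint as a single argument that defaults to the Llama 3 model id. The argument handling below is a minimal sketch of that pattern; the parser setup is an assumption for illustration, not the script's actual code.

```python
import argparse

# Hypothetical argument handling: with no arguments the script targets
# Llama 3 8B Instruct, and --model-id switches it to another checkpoint.
parser = argparse.ArgumentParser(description="Run inference with a Llama instruct model.")
parser.add_argument(
    "--model-id",
    default="meta-llama/Meta-Llama-3-8B-Instruct",  # Llama 3 default
    help="Hugging Face model id to load",
)

# Simulate the upgrade: pass the Llama 3.1 model id on the command line.
args = parser.parse_args(["--model-id", "meta-llama/Meta-Llama-3.1-8B-Instruct"])
print(args.model_id)
```

Because only the model id changes, the rest of the inference code can stay the same for both model versions.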
Looking at the code, you can see that the tokenizer's chat template handles all the formatting changes needed to run the new model.