The following diagram shows how each of the Code Llama models is trained:
(Fig: The Code Llama specialization pipeline. The different stages of fine-tuning are annotated with the number of tokens seen during training.)
This extension can be used with ollama, allowing for easy local-only execution. Additionally, it provides a simple interface to 1/ chat with the model directly from within VS Code and 2/ select specific files and sections to edit or explain.
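To give an idea of what local-only execution involves, here is a minimal sketch (not part of the extension itself) of querying an ollama server over its local HTTP API. It assumes ollama is running on its default port and that the `codellama` model has already been pulled:

```python
# Minimal sketch: query a local ollama server over its HTTP API.
# Assumes ollama is running on the default port (11434) and that the
# codellama model has been pulled beforehand with `ollama pull codellama`.
import requests

def complete(prompt: str, model: str = "codellama") -> str:
    """Request a single non-streaming completion from the local ollama API."""
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]

print(complete("Write a Python function that checks whether a number is prime."))
```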
This extension is an effective way to evaluate Code Llama because it provides simple, useful features. It also helps developers build trust by creating a diff for each proposed change, showing exactly what will be modified before the file is saved. Managing the context passed to the LLM is straightforward and relies heavily on keyboard shortcuts. Note that all interactions with the extension are recorded in JSONL format; the goal is to collect data from real-world usage that can later be used to fine-tune the models.
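The exact schema of the recorded interactions is specific to the extension, but JSONL itself is simply one JSON object per line, which makes the logs easy to consume in a fine-tuning pipeline. A hypothetical sketch (the file name and any fields are assumptions, not the extension's actual schema):

```python
# Hypothetical sketch of loading a JSONL interaction log: each non-empty
# line of the file is parsed as an independent JSON object.
import json

def load_interactions(path: str) -> list[dict]:
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records

# "interactions.jsonl" is a placeholder name for illustration only.
for record in load_interactions("interactions.jsonl"):
    print(record)
```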
It currently supports local models only when a TGI (Text Generation Inference) instance is running locally. Adding ollama support to this extension would be a great improvement, as it would speed up inference with the smaller models by avoiding the network round-trip.
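For reference, here is a rough sketch of what querying a locally running TGI server looks like from Python, assuming it is serving a Code Llama model on localhost:8080 (the port and generation parameters are illustrative assumptions; adjust them to your deployment):

```python
# Sketch: call the /generate endpoint of a locally running TGI server.
# Assumes TGI is serving a Code Llama model on localhost:8080; the port
# and generation parameters here are illustrative assumptions.
import requests

def tgi_generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Call TGI's /generate endpoint and return the generated text."""
    response = requests.post(
        "http://localhost:8080/generate",
        json={"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["generated_text"]

print(tgi_generate("def fibonacci(n):"))
```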