First you have to install a backend. The list of available backends is extensive; for simplicity I chose https://ollama.com/. Installers are available for macOS, Windows, and Linux.
curl -fsSL https://ollama.com/install.sh | sh
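To verify the install succeeded, check the client version (assuming the ollama binary landed on your PATH):
ollama --version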
After the install, start the server:
ollama serve &
Check whether the server is running properly by opening:
http://127.0.0.1:11434/
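You can also check from the terminal; the root endpoint answers with a short "Ollama is running" message:
curl http://127.0.0.1:11434/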
Some useful commands for managing the service on Linux (the install script registers Ollama as a systemd service):
sudo systemctl status ollama
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo systemctl restart ollama
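To follow the server logs, you can use journalctl (assuming the unit is named ollama, as set up by the install script):
sudo journalctl -u ollama -f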
Now install the preferred model(s). The easiest way is to browse Hugging Face, pick a GGUF model, and copy its repository path. Then add the model to Ollama:
ollama run hf.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF:Q2_K
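Once the download finishes, the model shows up in the local list, and you can send it a quick test prompt via the REST API. A minimal sketch (the model tag simply mirrors the one pulled above):
ollama list
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "hf.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF:Q2_K",
  "prompt": "Write a Python function that reverses a string.",
  "stream": false
}'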
Next, VSCode needs to use the model. To do this, install the ‘Continue’ extension and connect it to Ollama.
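As a minimal sketch, the connection can be configured in Continue's config file (~/.continue/config.json in older releases; the exact format may differ in newer versions). The title is just a display name, and the model tag mirrors the one pulled above:
{
  "models": [
    {
      "title": "DeepSeek Coder 6.7B",
      "provider": "ollama",
      "model": "hf.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF:Q2_K"
    }
  ]
}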
Afterwards, you can chat with the model and collaboratively edit code, with flexible context spanning multiple files.
Have fun 🙂