Modelfiles

This repository contains a personal collection of Modelfiles that I have used during my LLM experimentation.
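For readers new to the format, here is a minimal sketch of what a Modelfile looks like and how it can be turned into a model. The base model `llama3.2`, the model name `review-helper`, and the system prompt are illustrative assumptions, not files from this repository.

```shell
# Write a small example Modelfile into a scratch directory.
workdir="$(mktemp -d)"
cat > "$workdir/Modelfile" <<'EOF'
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM You are a concise assistant for code review.
EOF

# Build and run the model only if the ollama CLI is installed.
# "review-helper" is a hypothetical name chosen for this example.
if command -v ollama >/dev/null 2>&1; then
    ollama create review-helper -f "$workdir/Modelfile"
    ollama run review-helper "Summarize what a Modelfile is."
fi
```

`FROM` selects the base model, `PARAMETER` sets a runtime option, and `SYSTEM` sets the system prompt; swap in any Modelfile from this repository in place of the generated one.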

Ollama

I create and run the models with Ollama. To make the Ollama server reachable from other tools and machines, open the Ollama systemd unit file:

sudo vim /etc/systemd/system/ollama.service

Then add the following line under the [Service] section:

Environment="OLLAMA_HOST=0.0.0.0:11434"

Then reload systemd and restart Ollama:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Make sure that your firewall accepts requests on port 11434:

# E.g. for Ubuntu/Debian-based distributions
sudo ufw allow 11434
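To confirm that the server is reachable after these changes, a quick check from another machine can help. The sketch below assumes `curl` is installed and uses Ollama's `/api/tags` endpoint, which lists the locally available models; the function name `check_ollama` and the example address are hypothetical.

```shell
# Report whether an Ollama server answers at the given base URL.
# Defaults to the local server if no URL is passed.
check_ollama() {
    base_url="${1:-http://localhost:11434}"
    if curl -fsS --max-time 5 "$base_url/api/tags" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# Usage (replace the address with your server's IP or hostname):
#   check_ollama http://192.168.1.10:11434
```

If the result is "unreachable" from a remote machine but "reachable" on the server itself, the OLLAMA_HOST setting or the firewall rule is the likely cause.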

Open WebUI

Open WebUI is also used to provide a graphical interface for interacting with the models.

Installation using Docker - CPU utilization

docker run -d \
	-p 3000:8080 \
	--add-host=host.docker.internal:host-gateway \
	-v open-webui:/app/backend/data \
	--restart always \
	ghcr.io/open-webui/open-webui:main
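The command above assumes Ollama runs on the same host as the container. If Ollama runs on a different server, Open WebUI can, as far as I understand its documentation, be pointed at it through the `OLLAMA_BASE_URL` environment variable. The helper below is only a sketch: `webui_run_cmd` is a hypothetical name, and it prints the command for review instead of running it.

```shell
# Print a docker run command for Open WebUI that targets a remote Ollama server.
#   $1: base URL of the Ollama server (assumed reachable from the container)
#   $2: host port for the web interface (defaults to 3000)
webui_run_cmd() {
    ollama_url="$1"
    host_port="${2:-3000}"
    echo "docker run -d -p ${host_port}:8080" \
         "-e OLLAMA_BASE_URL=${ollama_url}" \
         "-v open-webui:/app/backend/data" \
         "--restart always" \
         "ghcr.io/open-webui/open-webui:main"
}

# Usage:
#   webui_run_cmd http://192.168.1.10:11434
```

Printing the command first makes it easy to check the URL and port before starting the container.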

Installation using Docker - (NVIDIA) GPU utilization

Warning

GPU support requires the NVIDIA Container Toolkit on the host. Install and configure it as follows:

Configure the repository:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
&& sudo apt-get update

Install the NVIDIA Container Toolkit packages:

sudo apt-get install -y nvidia-container-toolkit

Configure the container runtime by using the nvidia-ctk command:

sudo nvidia-ctk runtime configure --runtime=docker

Restart the Docker daemon:

sudo systemctl restart docker
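Before starting Open WebUI, it can be worth checking that containers actually see the GPU. The sketch below runs `nvidia-smi` inside a throwaway `ubuntu` container, which is the sample workload suggested in NVIDIA's toolkit documentation; the function name `check_gpu_container` is hypothetical, and the check assumes your user is allowed to run `docker`.

```shell
# Report whether Docker containers can access the NVIDIA GPU.
check_gpu_container() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found"
        return 1
    fi
    # nvidia-smi is injected into the container by the NVIDIA runtime.
    if docker run --rm --gpus all ubuntu nvidia-smi >/dev/null 2>&1; then
        echo "GPU visible in containers"
    else
        echo "GPU not visible in containers"
        return 1
    fi
}

# Usage:
#   check_gpu_container
```

If the check fails, re-running the `nvidia-ctk` configuration step and restarting Docker is a reasonable first thing to try.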

Now we are ready to start the container:

docker run -d \
	-p 3001:8080 \
	--gpus all \
	--add-host=host.docker.internal:host-gateway \
	-v open-webui:/app/backend/data \
	--restart always \
	ghcr.io/open-webui/open-webui:cuda

`Continue` a VS-Code extension

Finally, since VS Code is one of my go-to editors for development, I also use the Continue VS Code extension to interact with the models from there.

Goal

This setup keeps LLM deployment local, on servers you control, which may be a mandatory security constraint for some applications or for tool development.
