
Deploy DeepSeek-R1 on a Cloud Server Using an Ollama Docker Container: A Step-by-Step Guide


We are installing the DeepSeek-R1 model on a GCP n1-standard-4 instance with one NVIDIA T4 GPU. Let's walk through the process step by step.

Step 1: Install NVIDIA GPU Drivers & CUDA on GCP VM

Since we selected a T4 GPU, we must install NVIDIA drivers and CUDA for GPU acceleration.

Update the system:

sudo apt update && sudo apt upgrade -y

Install the NVIDIA driver:

sudo apt install -y nvidia-driver-535
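
A reboot is usually required before the new driver loads (on some GCP images the kernel module loads without one, so treat this as a precaution):

sudo reboot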

Check if the GPU is detected:

nvidia-smi


The output should show the NVIDIA T4 GPU along with its memory details.

  • Install CUDA

Install the CUDA Toolkit 12.2:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt install -y cuda

Set CUDA paths:

echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

Verify installation:

nvcc --version


The output reports the installed CUDA version, e.g. release 12.2.

Step 2: Install Docker & NVIDIA Container Toolkit

  • Install Docker
sudo apt install -y docker.io
sudo systemctl start docker
sudo systemctl enable docker

Verify installation:

docker --version
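
Optionally, add your user to the docker group so you can run docker without sudo (log out and back in for the change to take effect):

sudo usermod -aG docker $USER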


  • Install NVIDIA Container Toolkit

Enable GPU support in Docker:

distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
  && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
  && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt install -y nvidia-docker2
sudo systemctl restart docker

Verify GPU support in Docker:

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi


If the output shows the NVIDIA T4 GPU, we are ready to proceed.


Step 3: Install Ollama in Docker

Now, install Ollama inside a Docker container.

  • Pull Ollama Docker Image
docker pull ollama/ollama


Start Ollama with GPU support, mounting a named volume so downloaded models persist across container restarts:

docker run -d --gpus all -v ollama:/root/.ollama --name ollama -p 11434:11434 ollama/ollama


Verify it's running:

docker ps
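
As an extra sanity check, you can query the Ollama API directly; the /api/tags endpoint lists the models available locally:

curl http://localhost:11434/api/tags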


Step 4: Pull & Run DeepSeek-R1 8B

Once Ollama is running, pull the DeepSeek-R1 8B model:

docker exec -it ollama ollama pull deepseek-r1:8b


Wait for it to download.
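
Once the download finishes, you can confirm the model is available inside the container:

docker exec -it ollama ollama list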

Run DeepSeek-R1 (this opens an interactive chat prompt; type /bye to exit):

docker exec -it ollama ollama run deepseek-r1:8b


Test it:

curl -X POST "http://localhost:11434/api/generate" -d '{
  "model": "deepseek-r1:8b",
  "prompt": "What is the capital of France?",
  "stream": false
}'

The response should contain "Paris"; note that DeepSeek-R1 also emits its reasoning inside <think> tags before the final answer.
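
If you prefer a chat-style request, Ollama also exposes an /api/chat endpoint that takes a list of messages instead of a single prompt:

curl -X POST "http://localhost:11434/api/chat" -d '{
  "model": "deepseek-r1:8b",
  "messages": [{"role": "user", "content": "What is the capital of France?"}],
  "stream": false
}'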

Step 5: Install & Start the Ollama GUI (Open WebUI)

  • Pull the Open WebUI Docker image
docker pull ghcr.io/open-webui/open-webui:main


Run it. Two details matter here: Open WebUI listens on port 8080 inside the container, and on Linux host.docker.internal must be mapped to the host gateway:

docker run -d --name ollama-gui -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  ghcr.io/open-webui/open-webui:main

This starts the Ollama GUI on port 3000 of the VM.
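
GCP blocks inbound traffic on non-default ports unless a firewall rule allows it, so open port 3000 before trying to reach the GUI from outside. A rule like the following works (the rule name is arbitrary; in production, restrict --source-ranges to your own IP):

gcloud compute firewall-rules create allow-ollama-gui \
  --allow tcp:3000 \
  --source-ranges 0.0.0.0/0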


Step 6: Access the Ollama GUI & Use DeepSeek-R1

Open http://YOUR_VM_EXTERNAL_IP:3000 in your browser.

You should see the Ollama GUI dashboard; on first load you will be prompted to create an admin account.

Click Models and you should see deepseek-r1:8b.

Click Chat, select the model and start chatting.


Final Summary

  •  Installed NVIDIA GPU drivers and CUDA
  •  Set up Docker & NVIDIA container support
  •  Ran Ollama in Docker with GPU acceleration
  •  Pulled & tested the DeepSeek-R1 8B model
  •  Installed the Ollama GUI and accessed it in a browser

The GCP VM with a T4 GPU can now run DeepSeek-R1 8B smoothly.
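
Since the Ollama container stores models in a named volume, you can stop and later restart both containers without re-downloading anything:

docker stop ollama ollama-gui
docker start ollama ollama-gui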


Ready to transform your business with our technology solutions? Contact us today to leverage our AI/ML expertise.
