We will install the DeepSeek-R1 model on a GCP n1-standard-4 instance with one NVIDIA T4 GPU. Let's walk through the process step by step.
Since we selected a T4 GPU, we must install NVIDIA drivers and CUDA for GPU acceleration.
Update System
sudo apt update && sudo apt upgrade -y
Install NVIDIA Driver
sudo apt install -y nvidia-driver-535
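The driver typically isn't loaded until the next boot, so reboot the VM and reconnect before checking:
sudo reboot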
Check if the GPU is detected:
nvidia-smi
The output will show NVIDIA T4 with its memory details.
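For a compact check of just the fields that matter here, nvidia-smi also supports a query mode:
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv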
Install CUDA Toolkit 12.2
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt install -y cuda
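One caveat: the cuda metapackage also pulls in NVIDIA's own bundled driver, which can conflict with the nvidia-driver-535 package installed earlier. If apt reports a driver conflict, installing just the toolkit is a common workaround:
sudo apt install -y cuda-toolkit-12-2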
Set CUDA paths:
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
Verify installation:
nvcc --version
The output should report the CUDA release, 12.2 in this case.
Install Docker
sudo apt install -y docker.io
sudo systemctl start docker
sudo systemctl enable docker
Verify installation:
docker --version
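By default Docker requires sudo. Optionally, add your user to the docker group so the remaining commands work without it (log out and back in for the change to take effect):
sudo usermod -aG docker $USER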
Enable GPU support in Docker:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list \
| sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt install -y nvidia-docker2
sudo systemctl restart docker
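Note that nvidia-docker2 is deprecated on newer distributions in favor of the NVIDIA Container Toolkit. If the package above is unavailable, the toolkit route achieves the same result (assuming NVIDIA's current apt repository is configured):
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker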
Verify GPU support in Docker:
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
(CUDA 12.x images on Docker Hub are tagged with an OS suffix, so plain 12.2.0-base will not be found.)
If it shows the NVIDIA T4 GPU, then we are ready to proceed.
Now, install Ollama inside a Docker container.
docker pull ollama/ollama
Start Ollama with GPU:
docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Dropping --rm and mounting the ollama volume means the container, and the models it downloads, survive a stop or restart.
Verify it's running:
docker ps
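You can also confirm the API itself is up; Ollama answers plain HTTP on its root path:
curl http://localhost:11434/
This should print "Ollama is running".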
Once Ollama is running, pull the DeepSeek-R1 8B model:
docker exec -it ollama ollama pull deepseek-r1:8b
Wait for it to download.
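The 8B model is a multi-gigabyte download. Once it finishes, confirm it is available:
docker exec -it ollama ollama list
deepseek-r1:8b should appear in the list.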
Run DeepSeek-R1:
docker exec -it ollama ollama run deepseek-r1:8b
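This opens an interactive chat prompt inside the container. To leave it without stopping the server, use Ollama's exit command:
/bye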
Test it:
curl -X POST "http://localhost:11434/api/generate" -d '{
"model": "deepseek-r1:8b",
"prompt": "What is the capital of France?",
"stream": false
}'
The JSON response should contain "Paris". Note that DeepSeek-R1 is a reasoning model, so the response field also includes its chain of thought wrapped in <think> tags before the final answer.
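For multi-turn conversations, Ollama also exposes a chat endpoint that accepts a message history on the same server, with no extra setup:
curl -X POST "http://localhost:11434/api/chat" -d '{
"model": "deepseek-r1:8b",
"messages": [{"role": "user", "content": "What is the capital of France?"}],
"stream": false
}'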
Install the Ollama GUI (Open WebUI)
docker pull ghcr.io/open-webui/open-webui:main
Run it:
docker run -d --name ollama-gui -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
ghcr.io/open-webui/open-webui:main
Open WebUI listens on port 8080 inside the container, so we map host port 3000 to it. OLLAMA_BASE_URL is the variable the current image reads, and --add-host makes host.docker.internal resolve on Linux.
This starts the Ollama GUI (Open WebUI) on port 3000.
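Port 3000 must also be reachable from outside the VM. If your VPC does not already allow it, a GCP firewall rule opens it (the rule name allow-webui-3000 here is just an example):
gcloud compute firewall-rules create allow-webui-3000 --allow=tcp:3000 --source-ranges=0.0.0.0/0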
Open http://YOUR_VM_EXTERNAL_IP:3000 in your browser.
You should see the Ollama GUI dashboard.
Click Models and you should see deepseek-r1:8b.
Click Chat, select the model and start chatting.
Final Summary
A GCP VM with a single T4 GPU (16 GB of VRAM) can now serve the DeepSeek-R1 8B model through Ollama, both via the REST API and the Open WebUI front end.