DeepSeek has captured public attention for months. I decided to deploy it on our local server and open it up for remote use.
There are several UI tools that can talk to the deployed model; ChatBox and the VSCode extension Continue are covered below.
Open http://101.6.122.12:11434 on your remote device. You should see "Ollama is running"; otherwise something is wrong.
ChatBox (REF) provides both a desktop client and a web version. Download the client and install it.
In the settings, fill in API Host with "http://101.6.122.12:11434" and, for Model, choose "deepseek-r1:32b".
That's all. Chat with DeepSeek. You may begin with a "Hello, DeepSeek".
The first response may be slow, since it takes some time to wake the model up (the difference between a cold start and a warm start).
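If you prefer to check the server without any GUI, a plain API call works as well. Below is a minimal sketch against Ollama's /api/chat endpoint, reusing the IP address and model tag from above.
# Send a single chat message to the remote server (non-streaming).
curl http://101.6.122.12:11434/api/chat -d '{
  "model": "deepseek-r1:32b",
  "messages": [{ "role": "user", "content": "Hello, DeepSeek" }],
  "stream": false
}'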
VSCode, together with the Continue extension, also supports local LLM usage. However, the experience is not great.
I put the configuration file config.json below for convenience.
{
  "models": [
    {
      "title": "deepseek-r1:32b",
      "model": "deepseek-r1:32b",
      "provider": "ollama",
      "apiBase": "http://101.6.122.12:11434/"
    }
  ],
  "tabAutocompleteModel": {
    "title": "deepseek-r1:32b",
    "model": "deepseek-r1:32b",
    "provider": "ollama",
    "apiBase": "http://101.6.122.12:11434/"
  }
}
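Depending on the Continue version, this file is usually located at ~/.continue/config.json; newer releases of the extension may expect a different configuration format, so check its documentation if the settings above are not picked up.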
Broadly speaking, the procedure is: use the software ollama to download and run the DeepSeek model, then open a port to make it accessible externally.
For Linux, download the file install.sh and run it (REF). The software will be installed to /usr/bin.
curl -fsSL https://ollama.com/install.sh | sh
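A quick sanity check that the installation succeeded (the install script also registers ollama as a systemd service, so both commands below should work):
ollama --version           # prints the installed version
systemctl status ollama    # the service should be active (running)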
By default, models are stored under /usr/share (REF), but our convention is to keep software under /home/usr/share. The next step is therefore to move ollama's storage path.
Make a new directory and change its ownership and permissions:
mkdir /home/usr/share/ollama_models
sudo chown -R ollama:ollama /home/usr/share/ollama_models
sudo chmod -R 775 /home/usr/share/ollama_models
Change the default configuration of ollama:
sudo vi /etc/systemd/system/ollama.service
# add the next line to the [Service] section
Environment="OLLAMA_MODELS=/home/usr/share/ollama_models"
Refresh the ollama service:
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
sudo systemctl status ollama
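To confirm that the new storage path took effect, one option is to inspect the environment the service was started with (this assumes systemd picked up the edited unit file):
systemctl show ollama --property=Environment
# the output should contain OLLAMA_MODELS=/home/usr/share/ollama_models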
The "real" deepseek-r1 is the model with 671 billion parameters. There are also several distilled variants of different sizes available through ollama (REF). For a single RTX 4080 or 3090, deepseek-r1:32b is the most suitable choice: smaller variants perform noticeably worse, while bigger ones run too slowly.
This step is easy yet tricky. It is easy because a single command, ollama run deepseek-r1:32b, does the job. It is tricky because the download may restart automatically many, many times (REF). There are two tricks:
Download the smallest variant first to make sure the pipeline works.
ollama run deepseek-r1:1.5b
Use a script to download the model (REF); see the usage note after the script below.
#!/bin/bash
# Keep retrying until ollama pull exits successfully.
while true; do
    echo "Attempting to download model..."
    ollama pull deepseek-r1:32b &
    process_pid=$!
    sleep 10
    # wait returns the exit status of the background pull.
    if wait $process_pid; then
        echo "Model downloaded successfully!"
        break
    else
        echo "Download failed. Retrying..."
        kill -9 $process_pid 2>/dev/null
    fi
    sleep 2
done
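To use it, save the script (for example as pull_model.sh, a name used here only for illustration), make it executable and run it; afterwards ollama list should show the downloaded model and its size on disk.
chmod +x pull_model.sh
./pull_model.sh
ollama list    # lists the downloaded models and their sizes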
To make the ollama service available externally, you need to set two more environment variables, OLLAMA_HOST and OLLAMA_ORIGINS (REF).
Change the default configuration:
sudo vi /etc/systemd/system/ollama.service
# add the next two lines to the [Service] section
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
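After both edits (the storage path from earlier plus the two variables above), the [Service] section of /etc/systemd/system/ollama.service should look roughly as follows; the ExecStart, User, Group and Restart lines are the defaults written by the install script and may differ slightly on your machine.
[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
Environment="OLLAMA_MODELS=/home/usr/share/ollama_models"
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"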
Refresh the ollama service:
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
sudo systemctl status ollama
Open the port in the firewall:
sudo firewall-cmd --zone=public --add-port=11434/tcp --permanent       # open the port
# sudo firewall-cmd --zone=public --remove-port=11434/tcp --permanent  # close the port
sudo firewall-cmd --reload                                             # accept the changes
sudo firewall-cmd --zone=public --list-ports                           # show open ports
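On the server itself you can also confirm that ollama is now listening on all interfaces rather than only on localhost; ss is assumed to be available, as it is on most modern distributions.
sudo ss -tlnp | grep 11434    # should show a LISTEN entry on 0.0.0.0:11434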
Test the configuration:
Open http://IP_ADDRESS:11434 on your remote device. You should see "Ollama is running".
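The same check can be done from a terminal on the remote device, and /api/tags additionally confirms that the downloaded model is visible; replace IP_ADDRESS with your server's address.
curl http://IP_ADDRESS:11434             # prints "Ollama is running"
curl http://IP_ADDRESS:11434/api/tags    # lists the installed models as JSON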