[DeepSeek](https://www.deepseek.com/) has been in the public eye for months. I decided to deploy it on our local server and open it up for remote use.

# 1. Usage (For users)

There are several UI tools to choose from.

## 1.1 Initial test

Open `http://101.6.122.12:11434` in a browser on your remote device. You should see "Ollama is running"; otherwise something is wrong.

## 1.2 Use ChatBox

[REF](https://www.banzhuti.com/deepseek-localhost-install.html)

[ChatBox](https://chatboxai.app/zh) provides both a desktop client and a web version. Download the client and install it.

- Choose "Use My Own API key / Local Model"
- Choose "Ollama API"
- Fill `API Host` with "http://101.6.122.12:11434"
- For `Model`, choose "deepseek-r1:32b"
- Click "SAVE"

That's all. Chat with `deepseek`; you may begin with a "Hello, DeepSeek". The first response may be slow, since it takes some time to wake the model up (the difference between a cold start and a warm start).

- [***WARNING!!!***]
- You may see several different models and be curious to compare their performance. Don't wake up more than one model at a time! GPU memory stays occupied once a model has been loaded.

## 1.3 VSCode

VSCode, together with the extension `Continue`, supports local LLM usage. However, the experience is poor. I put the configuration file `config.json` below for convenience.

```
{
  "models": [
    {
      "title": "deepseek-r1:32b",
      "model": "deepseek-r1:32b",
      "provider": "ollama",
      "apiBase": "http://101.6.122.12:11434/"
    }
  ],
  "tabAutocompleteModel": {
    "title": "deepseek-r1:32b",
    "model": "deepseek-r1:32b",
    "provider": "ollama",
    "apiBase": "http://101.6.122.12:11434/"
  }
}
```

# 2. Installation (For developers only)

Broadly speaking, use the software `ollama` to download and run the `deepseek` model, then open a port to make it accessible externally.

## 2.1 Install ollama

[REF](https://ollama.com/download/linux)

For Linux, download the `install.sh` script and run it. The software will be installed to `/usr/bin`.

```
curl -fsSL https://ollama.com/install.sh | sh
```

## 2.2 Change the storage path

[REF](https://blog.csdn.net/yyh2508298730/article/details/138288553)

The default storage path is `/usr/share`, but our convention is to put software under `/home/usr/share`. Thus the next step is to move ollama's storage path.

- Make a new directory and change the file permissions

```
mkdir /home/usr/share/ollama_models
sudo chown -R ollama:ollama /home/usr/share/ollama_models
sudo chmod -R 775 /home/usr/share/ollama_models
```

- Change the default configuration of ollama

```
sudo vi /etc/systemd/system/ollama.service
# add the next line to [Service]
Environment="OLLAMA_MODELS=/home/usr/share/ollama_models"
```

- Reload and restart the ollama service

```
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
sudo systemctl status ollama
```

## 2.3 Download deepseek's models

[REF](https://ollama.com/library/deepseek-r1:1.5b)

The "real" deepseek-r1 is the model with 671 billion parameters. Several distilled variants of different sizes are also available through Ollama. For a 4080 or 3090, `deepseek-r1:32b` is the best fit: smaller variants perform worse, while bigger ones run too slowly.

This is an easy yet tricky step. It is easy because a single command, `ollama run deepseek-r1:32b`, does the job. It is tricky because the download may restart automatically many times ([REF](https://www.bilibili.com/opus/1014673340657303555)). There are two tricks:

- Download the smallest one first to ensure that the pipeline works.

```
ollama run deepseek-r1:1.5b
```

- Use a script to download the larger model; a quick sanity check and the retry script itself are given below.
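As an optional sanity check (not from the referenced pages), you can confirm which models have actually finished downloading before and after the big pull; this assumes the commands are run on the server itself, where ollama listens on its default local port:

```
# List the models that are fully downloaded and available locally
ollama list

# The same information over the HTTP API (default port 11434)
curl http://localhost:11434/api/tags
```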
[REF](https://github.com/ollama/ollama/issues/8687)

```
#!/bin/bash
# Keep retrying `ollama pull` until the model downloads successfully
while true; do
    echo "Attempting to download model..."
    ollama pull deepseek-r1:32b &
    process_pid=$!
    sleep 10
    if wait $process_pid; then
        echo "Model downloaded successfully!"
        break
    else
        echo "Download failed. Retrying..."
        kill -9 $process_pid 2>/dev/null
    fi
    sleep 2
done
```

## 2.4 Make it open to remote access

[REF](https://chatboxai.app/en/help-center/connect-chatbox-remote-ollama-service-guide)

To make the ollama service available externally, you need to set two environment variables: `OLLAMA_HOST` and `OLLAMA_ORIGINS`.

- Change the default configuration

```
sudo vi /etc/systemd/system/ollama.service
# add the two lines below to [Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
```

- Reload and restart the ollama service

```
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
sudo systemctl status ollama
```

- Open the port in the firewall

```
sudo firewall-cmd --zone=public --add-port=11434/tcp --permanent    # open the port
# sudo firewall-cmd --zone=public --remove-port=11434/tcp --permanent    # close the port
sudo firewall-cmd --reload                        # apply the changes
sudo firewall-cmd --zone=public --list-ports      # show open ports
```

- Test the configuration: open `http://IP_ADDRESS:11434` on your remote device. You should see "Ollama is running".
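Beyond the browser test, you can send a real request from the remote device. This is a minimal sketch rather than part of the referenced guide; it assumes the port opened above, that `deepseek-r1:32b` has already been pulled, and that `IP_ADDRESS` is replaced with the server's address (the first call may be slow while the model warms up):

```
# Ask the remote Ollama instance for a short, non-streaming completion
curl http://IP_ADDRESS:11434/api/generate \
  -d '{"model": "deepseek-r1:32b", "prompt": "Hello, DeepSeek", "stream": false}'
```

A JSON reply containing a `response` field means the model, the service, and the firewall rule are all working end to end.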