Converting a safetensors model to GGUF and importing it into Ollama

The models that Ollama pulls by default are all quantized.

To use a non-quantized model, you have to import it yourself.
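
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python of how much space the weights alone take at different precisions. The parameter count of about 7.6 billion is an assumption for a 7B-class model, not a measured figure, and the estimate ignores quantization metadata and runtime overhead such as the KV cache.

```python
def weight_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the raw weights in GiB (weights only, no overhead)."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumed parameter count for a 7B-class model (illustrative, not measured).
N_PARAMS = 7.6e9

for label, bits in [("f16 (non-quantized)", 16), ("nominal 4-bit", 4)]:
    print(f"{label}: ~{weight_size_gib(N_PARAMS, bits):.1f} GiB")
```

The point is only the ratio: f16 weights need roughly four times the space of a nominal 4-bit quantization, so check disk and memory before going down this route.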

Hugging Face, however, cannot be opened here without a proxy.

So let's find a mirror site to download the model from:

For example, this one: https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

First, set up Git LFS (the git-lfs package itself must already be installed):

git lfs install

Then download the model:

git clone https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
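
If Git LFS was not active during the clone, the .safetensors files end up as tiny text pointers instead of real weights, and the conversion later fails in confusing ways. The following is a minimal Python sketch for spotting that case. The function name find_lfs_pointers is made up for this example; the prefix it looks for is the first line of a Git LFS pointer file.

```python
from pathlib import Path

# First line of a Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir: str, pattern: str = "*.safetensors") -> list[str]:
    """Return names of files that are still LFS pointers rather than real weights."""
    pointers = []
    for path in sorted(Path(model_dir).glob(pattern)):
        # Only the first few bytes are needed to recognise a pointer file.
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path.name)
    return pointers
```

Calling find_lfs_pointers("./DeepSeek-R1-Distill-Qwen-7B") should return an empty list after a healthy clone. If it does not, running git lfs pull inside the model directory should fetch the real files.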

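Ollama (older versions, at least) seems to be able to import only GGUF-format models directly.

So we will use llama.cpp to do the conversion.

First, download llama.cpp: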

git clone https://github.com/ggerganov/llama.cpp.git

Install its Python dependencies:

cd llama.cpp
pip install -r requirements.txt

Take a look at the options of the conversion script:

python convert_hf_to_gguf.py -h

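Then run the conversion (the paths below assume the model directory sits inside the current directory; adjust them if you cloned it somewhere else):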

python convert_hf_to_gguf.py ./DeepSeek-R1-Distill-Qwen-7B --outfile ./DeepSeek-R1-Distill-Qwen-7B/DeepSeek-R1-Distill-Qwen-7B.gguf --outtype f16
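
Before importing, it can be worth a quick sanity check that the output really is a GGUF file. The following Python sketch reads the fixed-size header. It assumes a little-endian file in GGUF version 2 or later, where the two counts are 64-bit; the function name read_gguf_header is made up for this example.

```python
import struct


def read_gguf_header(path: str) -> tuple[int, int, int]:
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file header."""
    with open(path, "rb") as fh:
        magic = fh.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file, magic bytes are {magic!r}")
        # Assumed layout (GGUF v2+, little-endian):
        # uint32 version, uint64 tensor count, uint64 metadata key/value count.
        version, n_tensors, n_kv = struct.unpack("<IQQ", fh.read(20))
    return version, n_tensors, n_kv
```

A ValueError here, or a tensor count of zero, would suggest the conversion did not finish properly.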

Once the conversion finishes, the model has to be imported into Ollama.

First create a file named Modelfile with the following content:

FROM ./DeepSeek-R1-Distill-Qwen-7B/DeepSeek-R1-Distill-Qwen-7B.gguf

Then create the model from it (the name deepseek-r1-qwen:7b is the one used with ollama show further down):

ollama create deepseek-r1-qwen:7b -f Modelfile

List the models:

ollama list

Show the model details:

ollama show deepseek-r1-qwen:7b

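There is one problem, though.

When you run the model, you will find that it talks gibberish…

The imported GGUF comes without a chat template, so we need to edit the Modelfile to add the template and stop tokens, and then run ollama create again: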

FROM ./DeepSeek-R1-Distill-Qwen-7B/DeepSeek-R1-Distill-Qwen-7B.gguf
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<｜User｜>{{ .Content }}
{{- else if eq .Role "assistant" }}<｜Assistant｜>{{ .Content }}{{- if not $last }}<｜end▁of▁sentence｜>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<｜Assistant｜>{{- end }}
{{- end }}"""
PARAMETER stop "<｜begin▁of▁sentence｜>"
PARAMETER stop "<｜end▁of▁sentence｜>"
PARAMETER stop "<｜User｜>"
PARAMETER stop "<｜Assistant｜>"
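
To see what the template above actually sends to the model, here is a small Python sketch that reproduces its logic. It follows my reading of the Go template and is not Ollama's own renderer; the function name render_prompt is made up for this example. Note that the special tokens use the full-width bar ｜ and the ▁ character, not the ASCII | and _.

```python
USER = "<｜User｜>"
ASSISTANT = "<｜Assistant｜>"
EOS = "<｜end▁of▁sentence｜>"


def render_prompt(messages: list[dict], system: str = "") -> str:
    """Build the prompt string the way the TEMPLATE block appears to."""
    parts = [system] if system else []
    for i, msg in enumerate(messages):
        last = i == len(messages) - 1
        if msg["role"] == "user":
            parts.append(USER + msg["content"])
        elif msg["role"] == "assistant":
            parts.append(ASSISTANT + msg["content"])
            # Earlier assistant turns are closed with the end-of-sentence token.
            if not last:
                parts.append(EOS)
        # If the conversation ends on a non-assistant turn, open the reply.
        if last and msg["role"] != "assistant":
            parts.append(ASSISTANT)
    # Every newline in the template is trimmed by "{{-", so nothing is added between parts.
    return "".join(parts)


print(render_prompt([{"role": "user", "content": "Hi"}]))
```

Without the template the model only sees raw text with none of these markers, which is the likely reason for the gibberish; the stop parameters then cut generation off when the model emits one of the markers itself.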