
      Deploying the BOBAI Service on K8S (NVIDIA Edition)


      I. Installing the NVIDIA Driver and CUDA on GPU Nodes

      Official installation documentation

      1. Prerequisites

      Check that the machine has a CUDA-capable NVIDIA GPU:

       lspci | grep -i nvidia
      

      Check whether your operating system is supported:

      CUDA Installation Guide for Linux (system requirements): https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements
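      To gather the facts the requirements page asks about (architecture, distribution, kernel version), the following generic commands help; they are not part of the original guide:

      # print CPU architecture, distribution info, and running kernel version
      uname -m && cat /etc/os-release
      uname -r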

      Verify that a GCC build environment is present:

      gcc -v
      

      Verify that the matching kernel headers and development packages are installed:

      sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
      

      2. Installation

      Disable nouveau

      nouveau is a third-party, open-source NVIDIA driver that most Linux distributions install by default. It conflicts with NVIDIA's official driver, so it should be disabled before installing the NVIDIA driver and CUDA.

      # Check whether the system is currently using nouveau
      lsmod | grep nouveau
      
      # If this prints anything, disable it. The steps below are for CentOS 7
      
      # Create a new modprobe configuration file
      sudo vim /etc/modprobe.d/blacklist-nouveau.conf
      # Add the following lines
      blacklist nouveau
      options nouveau modeset=0
      # Save and quit
      :wq
      # Back up the current initramfs image
      sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
      # Rebuild the initramfs
      sudo dracut /boot/initramfs-$(uname -r).img $(uname -r)
      # Reboot
      sudo reboot
      # After the reboot, run the first command again to verify; it should print nothing
      lsmod | grep nouveau
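      
      # Alternative sketch (not among the original steps; assumes a RHEL-family system
      # with grubby): blacklist nouveau via kernel boot arguments instead
      sudo grubby --update-kernel=ALL --args="rd.driver.blacklist=nouveau nouveau.modeset=0"
      sudo reboot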
      
      

      Install the NVIDIA Driver (this step can also be skipped; go straight to installing CUDA, which can install the driver for you)

      Note: make sure the NVIDIA Driver version is suitable for your GPU.


      Before downloading, confirm on the driver search page that the driver supports your card.


      • Install the NVIDIA Driver

        rpm -ivh nvidia-driver-local-repo-rhel9-580.82.07-1.0-1.x86_64.rpm
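        
        # NOTE (added sketch): the rpm above only registers a local package repository;
        # the driver itself still needs to be installed from it. On RHEL 9 something like
        # the following is typical -- the module/stream name is an assumption, check
        # `dnf module list nvidia-driver` on your system first:
        sudo dnf clean all
        sudo dnf -y module install nvidia-driver:latest-dkms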
        
      • Verify that the driver installed successfully

        # Run the following command
        root@GPU1:~# nvidia-smi
        +---------------------------------------------------------------------------------------+
        | NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
        |-----------------------------------------+----------------------+----------------------+
        | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
        |                                         |                      |               MIG M. |
        |=========================================+======================+======================|
        |   0  Tesla T4                       On  | 00000000:3B:00.0 Off |                    0 |
        | N/A   51C    P0              29W /  70W |  12233MiB / 15360MiB |      0%      Default |
        |                                         |                      |                  N/A |
        +-----------------------------------------+----------------------+----------------------+
        |   1  Tesla T4                       On  | 00000000:86:00.0 Off |                    0 |
        | N/A   49C    P0              30W /  70W |   6017MiB / 15360MiB |      0%      Default |
        |                                         |                      |                  N/A |
        +-----------------------------------------+----------------------+----------------------+
        
        +---------------------------------------------------------------------------------------+
        | Processes:                                                                            |
        |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
        |        ID   ID                                                             Usage      |
        |=======================================================================================|
        +---------------------------------------------------------------------------------------+
        

        At this point the GPU driver is installed and the system can see the GPUs. Note that the CUDA Version shown here is the highest CUDA version this driver supports, not an installed toolkit.

      Install the CUDA Toolkit


      On the download page, select your operating system and version.


      The .run installer is recommended.

      • Install the CUDA Toolkit

        wget https://developer.download.nvidia.com/compute/cuda/13.0.0/local_installers/cuda_13.0.0_580.65.06_linux.run
        sudo sh cuda_13.0.0_580.65.06_linux.run
        
        # Example log of a successful installation
        ===========
        = Summary =
        ===========
        
        Driver:   Installed
        Toolkit:  Installed in /usr/local/cuda-13.0/
        
        Please make sure that
         -   PATH includes /usr/local/cuda-13.0/bin
         -   LD_LIBRARY_PATH includes /usr/local/cuda-13.0/lib64, or, add /usr/local/cuda-13.0/lib64 to /etc/ld.so.conf and run ldconfig as root
        
        To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-13.0/bin
        To uninstall the NVIDIA Driver, run nvidia-uninstall
        Logfile is /var/log/cuda-installer.log
        
        

        Just download and run it. If you skipped the NVIDIA Driver step above, this installer will also install a driver that matches the toolkit; this one, for example, installs driver version 580.65.06.

      • Configure environment variables

        vim /etc/profile.d/cuda.sh
        
        # Create a new file with the following content:
        
        # Add CUDA 13.0 to PATH
        export PATH=/usr/local/cuda-13.0/bin:$PATH
        
        # Add CUDA 13.0's lib64 to LD_LIBRARY_PATH
        export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64:$LD_LIBRARY_PATH
        

        Save the file, then reload it:

        source /etc/profile.d/cuda.sh
        
        Check that the toolkit works:
        
        # If a version number is printed, the installation succeeded
        (base) root@Colourdata-GPU:~# nvcc -V
        nvcc: NVIDIA (R) Cuda compiler driver
        Copyright (c) 2005-2025 NVIDIA Corporation
        Built on Fri_Feb_21_20:23:50_PST_2025
        Cuda compilation tools, release 12.8, V12.8.93
        Build cuda_12.8.r12.8/compiler.35583870_0
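        
        Optionally, compile and run one of NVIDIA's samples as a smoke test. A sketch using the cuda-samples repository; the path matches recent releases, but the build system has changed over time (newer releases use CMake), so treat the make step as an assumption:
        
        # build and run deviceQuery against the local GPU
        git clone https://github.com/NVIDIA/cuda-samples.git
        cd cuda-samples/Samples/1_Utilities/deviceQuery
        make && ./deviceQuery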
        
        
        

      II. Container Runtime (Docker or Containerd)

      Official documentation

      1. Install nvidia-container-toolkit

      Overview:
      
      The NVIDIA Container Toolkit's main job is to mount NVIDIA GPU devices into containers. It is compatible with Docker, containerd, CRI-O, and other runtimes.

      With dnf: RHEL/CentOS, Fedora, Amazon Linux

      # Configure the production repository
      curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
        sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
      
      # Install the NVIDIA Container Toolkit packages
      export NVIDIA_CONTAINER_TOOLKIT_VERSION=1.17.8-1
      sudo dnf install -y \
          nvidia-container-toolkit-${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
          nvidia-container-toolkit-base-${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
          libnvidia-container-tools-${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
          libnvidia-container1-${NVIDIA_CONTAINER_TOOLKIT_VERSION}
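      
      To confirm the packages landed, a quick sanity check (both CLIs ship with the packages above):
      
      # both commands should print a version string
      nvidia-ctk --version
      nvidia-container-cli --version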
      
      

      2. Configure NVIDIA as the Runtime

      Docker

      # Configure the Docker runtime
      sudo nvidia-ctk runtime configure --runtime=docker
      
      # It is recommended to double-check /etc/docker/daemon.json and also set the default runtime to nvidia
      (base) root@Colourdata-GPU:~# vim  /etc/docker/daemon.json
      {
        "registry-mirrors": [
          "https://ihsxva0f.mirror.aliyuncs.com",
          "https://docker.m.daocloud.io",
          "https://registry.docker-cn.com"
        ],
        "exec-opts": ["native.cgroupdriver=systemd"],
        "log-driver": "json-file",
        "log-opts": {
          "max-size": "10m",
          "max-file": "3"
        },
        "storage-driver": "overlay2",
        "default-runtime": "nvidia",
        "runtimes": {
          "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
          }
        }
      }
      
      systemctl daemon-reload
      systemctl restart docker
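      
      A standard way to verify the Docker integration is to run nvidia-smi inside a CUDA base container (the image tag is an example; pick one compatible with your driver):
      
      # if this prints the same GPU table as on the host, Docker can see the GPUs
      sudo docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi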
      
      
      
      

      Containerd

      # Configure the containerd runtime
      sudo nvidia-ctk runtime configure --runtime=containerd
      
      
      # It is recommended to double-check /etc/containerd/config.toml and set the default runtime to nvidia as well
      # Restart containerd after making the changes
      sudo systemctl restart containerd
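      
      An analogous check for containerd using ctr (a sketch; the image tag is an example):
      
      # pull a CUDA base image and run nvidia-smi with GPU 0 attached
      sudo ctr image pull docker.io/nvidia/cuda:12.2.0-base-ubuntu22.04
      sudo ctr run --rm --gpus 0 docker.io/nvidia/cuda:12.2.0-base-ubuntu22.04 gpu-test nvidia-smi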
      
      
      

      With the above in place, you can configure K8S to use the GPUs.


      III. Using GPUs from K8S

      Overview:
      
      The device plugin is provided by NVIDIA; see the official documentation.

      Deploy the Plugin

      • It is recommended to label the GPU nodes with gpu=true first (a sample command follows this list)

      • Deploy the plugin

        # Download URL; a recent release is recommended
        https://github.com/NVIDIA/k8s-device-plugin/blob/main/deployments/static/nvidia-device-plugin.yml
        
        # Deploy it
        root@test:~# kubectl apply -f nvidia-device-plugin.yml
        
        root@test:~# kubectl get po -l app=gpu -n bobai
        NAME                                   READY   STATUS    RESTARTS   AGE
        nvidia-device-plugin-daemonset-7nkjw   1/1     Running   0          10m
        
      • Check that the plugin is working

        # If nvidia.com/gpu appears in the node's resources, the plugin is working
        root@test:~# kubectl describe node GPU | grep nvidia.com/gpu
          nvidia.com/gpu:     2
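        
        The label from the first bullet can be applied like this (the node name is a placeholder):
        
        # label the GPU node, then confirm the label took effect
        kubectl label node <gpu-node-name> gpu=true
        kubectl get nodes -l gpu=true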
        
        

      Once this is done, your K8S cluster can schedule workloads onto the GPUs.

      IV. Deploying the Services

      1. Deploy DeepSeek-V3

      A sample YAML file follows, for reference only:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: deepseek-v3
        namespace: bobai
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: deepseek-v3
        template:
          metadata:
            labels:
              app: deepseek-v3
          spec:
            containers:
            - command:
              - sh
              - -c
              - vllm serve  --port 8000 --trust-remote-code --served-model-name deepseek-v3 --dtype=fp8  --max-model-len 65536 --gpu-memory-utilization 0.95 /models/DeepSeek-V3  
              name: deepseek-v3
              image: registry.cn-shanghai.aliyuncs.com/colourdata/bobai-dependency:vllm
              imagePullPolicy: IfNotPresent
              ports:
              - containerPort: 8000
              volumeMounts:
              - name: model-volume
                mountPath: /models
              resources:
                requests:
                  nvidia.com/gpu: 8
                  memory: "16Gi"
                  cpu: "8"
                limits:
                  nvidia.com/gpu: 8
                  memory: "32Gi"
                  cpu: "16"
              livenessProbe:
                tcpSocket:
                  port: 8000
                initialDelaySeconds: 300
                periodSeconds: 10
                failureThreshold: 3
              readinessProbe:
                tcpSocket:
                  port: 8000
                initialDelaySeconds: 300
                periodSeconds: 10
                failureThreshold: 3
            volumes:
            - name: model-volume
              hostPath:
                path: /models
                type: Directory
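      
      Once the pod reports Ready, you can sanity-check the model through vLLM's OpenAI-compatible API. A sketch using kubectl port-forward (the endpoint paths are vLLM's standard ones; the payload is illustrative):
      
      # forward the pod's port and query the OpenAI-compatible endpoints
      kubectl -n bobai port-forward deploy/deepseek-v3 8000:8000 &
      curl -s http://127.0.0.1:8000/v1/models
      curl -s http://127.0.0.1:8000/v1/chat/completions \
        -H 'Content-Type: application/json' \
        -d '{"model": "deepseek-v3", "messages": [{"role": "user", "content": "hello"}]}'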
      

      2. Deploy the qwen-embedding Model

      A sample YAML file follows, for reference only:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vllm-embedding
        namespace: bobai
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vllm-embedding
        template:
          metadata:
            labels:
              app: vllm-embedding
          spec:
            containers:
            - command:
              - sh
              - -c
              - vllm serve  --port 8000 --trust-remote-code --served-model-name vllm-embedding  --max-model-len 4096 --gpu-memory-utilization 0.85 /models/Qwen3-Embedding-0.6B
              name: vllm-embedding
              image: registry.cn-shanghai.aliyuncs.com/colourdata/bobai-dependency:vllm
              imagePullPolicy: IfNotPresent
              ports:
              - containerPort: 8000
      
              volumeMounts:
              - name: model-volume
                mountPath: /models
              resources:
                requests:
                  memory: "8Gi"
                  cpu: "4"
                limits:
                  nvidia.com/gpu: 1
                  memory: "16Gi"
                  cpu: "8"
              livenessProbe:
                tcpSocket:
                  port: 8000
                initialDelaySeconds: 30
                periodSeconds: 10
                failureThreshold: 3
              readinessProbe:
                tcpSocket:
                  port: 8000
                initialDelaySeconds: 10
                periodSeconds: 5
                failureThreshold: 3
            volumes:
            - name: model-volume
              nfs:
                server: 192.168.2.250
                path: /data/bobai/models
            restartPolicy: Always
          
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: vllm-embedding
        namespace: bobai
      spec:
        type: ClusterIP
        ports:
        - port: 8000
          protocol: TCP
          targetPort: 8000
        selector:
          app: vllm-embedding
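      
      A quick in-cluster check against the ClusterIP service, using vLLM's standard /v1/embeddings endpoint (the curl image is an example):
      
      # run a throwaway curl pod and request an embedding
      kubectl -n bobai run curl-test --rm -it --image=curlimages/curl --restart=Never -- \
        curl -s http://vllm-embedding:8000/v1/embeddings \
        -H 'Content-Type: application/json' \
        -d '{"model": "vllm-embedding", "input": "hello world"}'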
           
      
      

      3. Deploy Tika

      A sample YAML file follows, for reference only:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: tika
        namespace: bobai
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: tika
        template:
          metadata:
            labels:
              app: tika
          spec:
            containers:
            - name: tika
              image: tika-ocr-cn:v1
              imagePullPolicy: IfNotPresent
              ports:
              - containerPort: 9998
              resources:
                requests:
                  memory: "1Gi"
                  cpu: "500m"
                limits:
                  memory: "2Gi"
                  cpu: "1"
              livenessProbe:
                tcpSocket:
                  port: 9998
                initialDelaySeconds: 30
                periodSeconds: 10 
                failureThreshold: 3
              readinessProbe:
                tcpSocket:        
                  port: 9998
                initialDelaySeconds: 10
                periodSeconds: 5
                failureThreshold: 3
            restartPolicy: Always
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: tika-service
        namespace: bobai
      spec:
        type: ClusterIP
        selector:
          app: tika
        ports:
        - protocol: TCP
          port: 9998
          targetPort: 9998
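      
      Tika Server answers on /tika: a GET returns a greeting when the server is up, and a PUT with a document body returns the extracted text. A quick in-cluster sketch (the curl image is an example):
      
      # run a throwaway curl pod against the tika-service ClusterIP
      kubectl -n bobai run tika-test --rm -it --image=curlimages/curl --restart=Never -- \
        curl -s http://tika-service:9998/tika
      # to extract text from a local file, port-forward and PUT it:
      #   kubectl -n bobai port-forward svc/tika-service 9998:9998 &
      #   curl -s -T sample.pdf http://127.0.0.1:9998/tika -H 'Accept: text/plain'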
            
      
      

      4. Deploy TTS (openai-edge-tts)

      A sample YAML file follows, for reference only:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: openai-edge-tts
        namespace: bobai
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: openai-edge-tts
        template:
          metadata:
            labels:
              app: openai-edge-tts
          spec:
            containers:
            - name: openai-edge-tts
              image: travisvn/openai-edge-tts:latest
              imagePullPolicy: IfNotPresent
              ports:
              - containerPort: 5050
              env:
              - name: API_KEY
                value: "Colourdata1234@"
              - name: PORT
                value: "5050"
              - name: DEFAULT_VOICE
                value: "en-US-AvaNeural"
              - name: DEFAULT_RESPONSE_FORMAT
                value: "mp3"
              - name: DEFAULT_SPEED
                value: "1.0"
              - name: DEFAULT_LANGUAGE
                value: "en-US"
              - name: REQUIRE_API_KEY
                value: "True"
              - name: REMOVE_FILTER
                value: "False"
              - name: EXPAND_API
                value: "True"
              resources:
                requests:
                  memory: "512Mi"
                  cpu: "500m"
                limits:
                  memory: "1Gi"
                  cpu: "1"
              livenessProbe:
                tcpSocket:
                  port: 5050
                initialDelaySeconds: 30 
                periodSeconds: 10 
                failureThreshold: 3
              readinessProbe:
                tcpSocket:
                  port: 5050
                initialDelaySeconds: 10 
                periodSeconds: 5  
                failureThreshold: 3     
            restartPolicy: Always
            
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: openai-edge-tts-service
        namespace: bobai
      spec:
        type: ClusterIP
        selector:
          app: openai-edge-tts
        ports:
        - protocol: TCP
          port: 5050
          targetPort: 5050      
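      
      openai-edge-tts exposes an OpenAI-compatible /v1/audio/speech endpoint, so a request shaped like OpenAI's TTS API should return audio. A sketch (the API key matches the deployment above; the model and voice values are illustrative):
      
      # request speech synthesis and report the HTTP status code
      kubectl -n bobai run tts-test --rm -it --image=curlimages/curl --restart=Never -- \
        curl -s http://openai-edge-tts-service:5050/v1/audio/speech \
        -H 'Authorization: Bearer Colourdata1234@' \
        -H 'Content-Type: application/json' \
        -d '{"model": "tts-1", "input": "hello", "voice": "en-US-AvaNeural"}' \
        -o /dev/null -w '%{http_code}\n'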
      
      
