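Deploying a monitoring system in a Kubernetes (k8s) cluster is a key step toward keeping the cluster healthy, its performance stable, and faults quick to locate. The following is a complete monitoring deployment based on Prometheus + Grafana (using …

1. Core monitoring components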

2. Deployment steps (using Helm)

This example uses k8s version 1.28.0.

2.1 Install the Helm tool
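
The original gives no commands for this step. Install Helm by whichever method suits your platform (the Helm documentation lists package managers and an installer script). As a minimal sketch, the check below confirms that a usable Helm is on the PATH before continuing. The function names `helm_major` and `check_helm` are hypothetical helpers, and the "major version 3 or newer" threshold is an assumption based on the chart's stated Helm 3+ prerequisite.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of Helm.

# Extract the major version from a string such as "v3.14.2+gc309b6f".
helm_major() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+)\..*$/\1/'
}

# Succeed only if helm is on PATH and reports major version 3 or newer.
check_helm() {
  command -v helm >/dev/null 2>&1 || { echo "helm not found on PATH" >&2; return 1; }
  local major
  major="$(helm_major "$(helm version --short)")"
  # A non-numeric value makes the comparison fail, which is treated as "not OK".
  [ "$major" -ge 3 ] 2>/dev/null
}

# Example (not executed here):
#   check_helm && echo "Helm is ready"
```

Running `check_helm` before the steps in 2.2 catches a missing or very old Helm early, instead of failing halfway through the install.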

2.2 Add the Helm repo

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update

If network access is a problem, download the Helm chart and its dependencies offline.

    # Download page for historical chart versions
    # https://github.com/prometheus-community/helm-charts/tags
    # Reference for choosing a version
    # https://github.com/prometheus-operator/kube-prometheus
    # The download yields kube-prometheus-stack-70.0.0.tgz
    # Unpack kube-prometheus-stack
    tar -zxvf kube-prometheus-stack-70.0.0.tgz
    # Check which images are required (they can be uploaded to the nodes offline)

Download the images offline

    # Pull on a machine with internet access
    docker pull registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.5.1
    # Package the image
    docker save -o certgen.tar registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.5.1
    # Import the image on the target node
    sudo ctr -n k8s.io image import certgen.tar

2.3 Install kube-prometheus-stack

    helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
      --namespace monitoring \
      --create-namespace \
      --version <CHART_VERSION>  # replace with the target version
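
For the offline path in 2.2, the original says to check which images are required but does not show how. One possible approach, sketched here under assumptions, is to render the unpacked chart with `helm template` and collect the `image:` fields. The helpers `list_images` and `image_to_tar` are hypothetical names introduced for illustration. Images that the chart passes only as command-line arguments would not be matched by this filter, so treat the result as a starting list rather than a complete one.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative.

# Read YAML on stdin and print the unique image references, one per line.
list_images() {
  # Match lines such as `image: "repo/name:tag"` or `- image: repo/name:tag`,
  # strip the key, drop quotes and trailing whitespace, then de-duplicate.
  grep -E '^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]' \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//' \
    | tr -d "\"'" \
    | sed -E 's/[[:space:]]+$//' \
    | sort -u
}

# Turn an image reference into a file name suitable for `docker save -o`.
image_to_tar() {
  printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:@' '___')"
}

# Example (not executed here), run on a machine with internet access:
#   helm template kube-prometheus-stack ./kube-prometheus-stack | list_images |
#   while read -r img; do
#     docker pull "$img" && docker save -o "$(image_to_tar "$img")" "$img"
#   done
# Then, on each target node:
#   sudo ctr -n k8s.io image import <file>.tar
```

Deriving the tar file name from the image reference keeps one archive per image, which makes it easier to see which images have already been copied to a node.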

    # Offline install
    helm install kube-prometheus-stack ./kube-prometheus-stack \
      --namespace monitoring \
      --create-namespace \
      --debug

3. Verify the deployment

3.1 Check Pod status

    kubectl get pods -n monitoring

3.2 Access the Web UI
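
The original does not show how the Web UI is reached. A common option for a quick check is `kubectl port-forward`, sketched below. The Service names `kube-prometheus-stack-grafana` (port 80) and `kube-prometheus-stack-prometheus` (port 9090) are assumptions derived from the release name used above; confirm them with `kubectl get svc -n monitoring`. The `run` wrapper and the `DRY_RUN` variable are hypothetical conveniences that print a command instead of executing it.

```shell
#!/usr/bin/env bash
# Sketch only: Service names and local ports are assumptions, verify them first.
NAMESPACE="${NAMESPACE:-monitoring}"
RELEASE="${RELEASE:-kube-prometheus-stack}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Forward local port 3000 to the Grafana Service (assumed: <release>-grafana, port 80).
forward_grafana() {
  run kubectl port-forward -n "$NAMESPACE" "svc/${RELEASE}-grafana" 3000:80
}

# Forward local port 9090 to the Prometheus Service (assumed: <release>-prometheus).
forward_prometheus() {
  run kubectl port-forward -n "$NAMESPACE" "svc/${RELEASE}-prometheus" 9090:9090
}

# Example (not executed here):
#   forward_grafana      # then open http://localhost:3000
#   forward_prometheus   # then open http://localhost:9090
```

Port-forwarding suits verification from a workstation; for long-lived access, a NodePort Service or an Ingress is the more usual choice.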

4. Key configuration tuning

4.1 Data persistence
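
The original leaves this section empty. As a hedged sketch, persistence can be enabled by passing a values file that sets a volume claim template for Prometheus and turns on persistence for Grafana. The helper name `write_persistence_values`, the file name, the storage class `local-path`, and the sizes are all assumptions chosen for illustration; use a StorageClass that exists in your cluster (`kubectl get storageclass`).

```shell
#!/usr/bin/env bash
# Sketch only: storage class and sizes are placeholders, not recommendations.

# Usage: write_persistence_values FILE STORAGE_CLASS PROMETHEUS_SIZE GRAFANA_SIZE
write_persistence_values() {
  local file="$1" storage_class="$2" prom_size="$3" grafana_size="$4"
  # Write a Helm values file that requests persistent volumes.
  cat > "$file" <<EOF
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: ${storage_class}
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: ${prom_size}
grafana:
  persistence:
    enabled: true
    storageClassName: ${storage_class}
    size: ${grafana_size}
EOF
}

# Example (not executed here):
#   write_persistence_values values-persistence.yaml local-path 50Gi 10Gi
#   helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --version <CHART_VERSION> -f values-persistence.yaml
```

Keeping these settings in a separate values file makes it possible to apply the same persistence configuration at install time with `-f` as well.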

5. Remove kube-prometheus-stack

5.1 Delete the release

    # Syntax: helm uninstall [RELEASE_NAME]
    helm uninstall kube-prometheus-stack -n monitoring

Delete the CRDs

helm uninstall does not remove the CRDs created by the chart, so delete them manually:

    kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
    kubectl delete crd alertmanagers.monitoring.coreos.com
    kubectl delete crd podmonitors.monitoring.coreos.com
    kubectl delete crd probes.monitoring.coreos.com
    kubectl delete crd prometheusagents.monitoring.coreos.com
    kubectl delete crd prometheuses.monitoring.coreos.com
    kubectl delete crd prometheusrules.monitoring.coreos.com
    kubectl delete crd scrapeconfigs.monitoring.coreos.com
    kubectl delete crd servicemonitors.monitoring.coreos.com
    kubectl delete crd thanosrulers.monitoring.coreos.com
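
The ten CRD deletions above can also be written as a loop. This is a minimal sketch: the `run` wrapper, the `DRY_RUN` variable, and the function name `delete_monitoring_crds` are hypothetical, and `--ignore-not-found` is added so that a rerun does not fail on CRDs that are already gone. Deleting a CRD also deletes every custom resource of that kind, so run it only when the whole stack is being removed.

```shell
#!/usr/bin/env bash
# Sketch only: the CRD list is copied from the commands above.
CRDS="alertmanagerconfigs alertmanagers podmonitors probes prometheusagents
prometheuses prometheusrules scrapeconfigs servicemonitors thanosrulers"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Delete each monitoring.coreos.com CRD in turn.
delete_monitoring_crds() {
  local name
  for name in $CRDS; do
    run kubectl delete crd --ignore-not-found "${name}.monitoring.coreos.com"
  done
}

# Example (not executed here):
#   DRY_RUN=1 delete_monitoring_crds   # preview the commands
#   delete_monitoring_crds             # actually delete
```

Previewing with `DRY_RUN=1` first is a cheap safeguard before an operation that cannot be undone.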