[Amazon Web Services] Deploying a Prometheus + Grafana Monitoring Platform on Amazon EKS with Helm 3

Kubernetes
Amazon EKS Anywhere
> Author: 云矩阵

#### 1. Create a Kubernetes namespace

First, create the Kubernetes namespace `monitoring`, into which `helm` will later deploy the release named `stable`:

```
$ kubectl create namespace monitoring
```

Sample session:

```
[ec2-user@ip-172-31-37-104 ~]$ kubectl create namespace monitoring
namespace/monitoring created
[ec2-user@ip-172-31-37-104 ~]$ kubectl get ns
NAME              STATUS   AGE
default           Active   153m
kube-node-lease   Active   153m
kube-public       Active   153m
kube-system       Active   153m
monitoring        Active   86m
```

#### 2. Add the Prometheus community Helm chart repository

Next, add the Prometheus community Helm chart repository, refresh the local index, and search for the operator chart:

```
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm repo list
$ helm search repo stable/prometheus-operator
$ helm search repo prometheus-operator
```

Sample session:

```
[ec2-user@ip-172-31-37-104 ~]$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" already exists with the same configuration, skipping
[ec2-user@ip-172-31-37-104 ~]$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "grafana" chart repository
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈Happy Helming!⎈
[ec2-user@ip-172-31-37-104 ~]$ helm repo list
NAME                  URL
prometheus-community  https://prometheus-community.github.io/helm-charts
grafana               https://grafana.github.io/helm-charts
stable                https://charts.helm.sh/stable
[ec2-user@ip-172-31-37-104 ~]$ helm search repo prometheus-community/prometheus-operator
NAME                        CHART VERSION  APP VERSION  DESCRIPTION
stable/prometheus-operator  9.3.2          0.38.1       DEPRECATED Provides easy monitoring definitions...
```

The `stable/prometheus-operator` chart is marked DEPRECATED. Its successor is `prometheus-community/kube-prometheus-stack`, which is the chart installed in the next step.

#### 3. Install Prometheus

Then install the stack with `helm install`, using `stable` as the release name and `monitoring` as the target namespace:

```
helm install stable prometheus-community/kube-prometheus-stack --namespace monitoring --debug
```

The installation completes normally and ends with output like the following:

```
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace default get pods -l "release=stable"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
```

The captured output names the `default` namespace, presumably because it was recorded from a run without the `--namespace` flag. When the release is installed into `monitoring`, as the following steps assume, the status command reads `--namespace monitoring` instead.

#### 4. Check the Prometheus Pods

Finally, check that the Prometheus Pods were deployed and are running:

```
[ec2-user@ip-172-31-37-104 ~]$ kubectl get pods -n monitoring
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-stable-kube-prometheus-sta-alertmanager-0   2/2     Running   0          131m
prometheus-stable-kube-prometheus-sta-prometheus-0       2/2     Running   0          131m
stable-grafana-58b76cd9d7-tgd8r                          3/3     Running   0          131m
stable-kube-prometheus-sta-operator-7699d6bfb8-zx8jn     1/1     Running   0          131m
stable-kube-state-metrics-65f45c47c9-5zmj5               1/1     Running   0          131m
stable-prometheus-node-exporter-2q98d                    1/1     Running   0          126m
stable-prometheus-node-exporter-98cf4                    1/1     Running   0          131m
stable-prometheus-node-exporter-d6jnm                    1/1     Running   0          128m
stable-prometheus-node-exporter-gp5dh                    1/1     Running   0          131m
stable-prometheus-node-exporter-gqqls                    1/1     Running   0          128m
stable-prometheus-node-exporter-sqg6x                    1/1     Running   0          126m
[ec2-user@ip-172-31-37-104 ~]$
```

#### 5. Check the Prometheus Services

List the deployed Services with `kubectl get svc`:

```
$ kubectl get svc -n monitoring
```

> The Pod listing from the previous step shows that `node-exporter` runs on every node and that Prometheus and Grafana are both up.

#### 6. Change the Service type

> 🛑 Background: by default, the prometheus and grafana Services use ClusterIP and are reachable only from inside the cluster. To reach them from outside, switch them to NodePort.
>
> ✅ Action: edit the `stable-kube-prometheus-sta-prometheus` and `stable-grafana` Services and change `type` from `ClusterIP` to `NodePort`.

```
$ kubectl edit svc stable-grafana -n monitoring
```

![image.png](https://dev-media.amazoncloud.cn/769d6f4ad670443296d4cb29813e11c0_image.png "image.png")

```
$ kubectl edit svc stable-kube-prometheus-sta-prometheus -n monitoring
```

![image.png](https://dev-media.amazoncloud.cn/02405b35c3274cba91aa4890dd4af7b6_image.png "image.png")

> Run `kubectl get svc -n monitoring` to see the external port assigned to each Service.

![image.png](https://dev-media.amazoncloud.cn/4b4b8b3466bc4b8ebcf9d304a892b873_image.png "image.png")

> Note: check that these ports are open in the security group of the corresponding hosts.

#### 7. View Prometheus data collection

> Open Prometheus to check data collection. The address is the IP address of any cluster worker node plus the Prometheus port.

![image.png](https://dev-media.amazoncloud.cn/3ac4a96235d1420d917f4ea341584323_image.png "image.png")

#### 8. Access Grafana

> Open Grafana to observe the data collected from Amazon EKS. The address is the IP address of any cluster worker node plus the Grafana port.
>
> ***
>
> Log in to the Grafana dashboard with the initial account `admin` and password `prom-operator`, and change the password immediately.

![image.png](https://dev-media.amazoncloud.cn/4699158a7722484ab9bdaf441e0b2673_image.png "image.png")

#### 9. Configure the data source

> 1. Set a name and mark it as the default data source.
>
> ***
>
> 2. Prometheus server URL: the Cluster-IP of the Prometheus Service.

![image.png](https://dev-media.amazoncloud.cn/ffc709369eca40d3b12be6e8b0a92464_image.png "image.png")

* Click "Save & test" to save and test the data source. A result like the one below indicates success.

![image.png](https://dev-media.amazoncloud.cn/986bfa037c8a4f59bc450aad5718bbf9_image.png "image.png")

#### 10. View Kubernetes performance dashboards

* Kubernetes networking by workload

![image.png](https://dev-media.amazoncloud.cn/dda80afb44e143b3ad01ddea17064a4a_image.png "image.png")

* Kubernetes Pod network traffic

![image.png](https://dev-media.amazoncloud.cn/0ef04457776841fabc14630b2529cb12_image.png "image.png")

* Kubernetes API server

![image.png](https://dev-media.amazoncloud.cn/4ffcad45005d4a9c909f443e9343d750_image.png "image.png")

* Kubelet data that would otherwise have to be queried with commands

![image.png](https://dev-media.amazoncloud.cn/03373c9283e74f239eb7282855ca25b0_image.png "image.png")

* Kubernetes Proxy

![image.png](https://dev-media.amazoncloud.cn/a037889677fc40e8bb9adce330dca7ba_image.png "image.png")

* Prometheus overview

![image.png](https://dev-media.amazoncloud.cn/8e271f508e574ef782592842aa30d9d6_image.png "image.png")

* System configuration and parameters of the cluster worker nodes

![image.png](https://dev-media.amazoncloud.cn/82e8d7d5a9f740a8b5be5e810e253b6d_image.png "image.png")

![image.png](https://dev-media.amazoncloud.cn/91bfc242e9de4b89b5f475ffce1182a3_image.png "image.png")

![image.png](https://dev-media.amazoncloud.cn/14170998e43b41f4ae88e2849e8a3ae2_image.png "image.png")
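
Steps 1 through 6 can be collected into one script. The following is a minimal sketch under stated assumptions, not the author's own tooling: the `run` helper, the `deploy_monitoring` function, and the `DRY_RUN`, `NAMESPACE`, and `RELEASE` variables are hypothetical names introduced here, while the repository URL, chart, release name `stable`, and namespace `monitoring` come from the walkthrough. It assumes the release is installed into `monitoring`, and it swaps the interactive `kubectl edit` for a non-interactive `kubectl patch`. By default it only prints the commands.

```shell
#!/bin/sh
set -eu

# Hypothetical settings; the defaults mirror the names used in the walkthrough.
NAMESPACE="${NAMESPACE:-monitoring}"
RELEASE="${RELEASE:-stable}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy_monitoring() {
  # Step 1: namespace (fails if it already exists when DRY_RUN=0).
  run kubectl create namespace "$NAMESPACE"

  # Step 2: chart repository.
  run helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  run helm repo update

  # Step 3: install the stack into the namespace created above.
  run helm install "$RELEASE" prometheus-community/kube-prometheus-stack --namespace "$NAMESPACE"

  # Step 4: check the Pods.
  run kubectl get pods --namespace "$NAMESPACE"

  # Step 6: switch Grafana and Prometheus from ClusterIP to NodePort.
  for svc in "$RELEASE-grafana" "$RELEASE-kube-prometheus-sta-prometheus"; do
    run kubectl patch svc "$svc" --namespace "$NAMESPACE" -p '{"spec":{"type":"NodePort"}}'
  done

  # Step 5 and the follow-up to step 6: list Services and their node ports.
  run kubectl get svc --namespace "$NAMESPACE"
}

deploy_monitoring
```

Running the script as-is lists the commands it would issue. Setting `DRY_RUN=0` executes them against the cluster that `kubectl` and `helm` are currently configured for.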