zookeeper3.4 Prometheus monitoring
ZooKeeper 3.6.0 and later natively expose a metrics endpoint that Prometheus can scrape.

zoo.cfg 
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000
metricsProvider.exportJvmInfo=true
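
To check that the endpoint configured above is actually serving data, something like the following could be used. This is only a minimal sketch: the /metrics path, the 127.0.0.1 host, and the helper names parse_metrics and fetch_metrics are assumptions for illustration; only port 7000 comes from the zoo.cfg lines above.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition lines into {name_with_labels: float}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: no trailing timestamp, so the value is the last field.
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            # Ignore lines that do not end in a numeric sample value.
            continue
    return metrics


def fetch_metrics(url="http://127.0.0.1:7000/metrics", timeout=5):
    """Fetch and parse the metrics page (URL is an assumed default)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling fetch_metrics() against a running server should return a dictionary keyed by metric name, with any labels kept as part of the key.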



For versions below 3.6.0, collect the metrics with zookeeper-exporter:
https://github.com/jiankunking/zookeeper_exporter
https://github.com/carlpett/zookeeper_exporter/releases/download/v1.0.2/zookeeper_exporter
The exporter works with zookeeper3.4+.
Grafana dashboard "Zookeeper Exporter Overview": https://grafana.com/dashboards/9236

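The exporter collects metrics through the four-letter-word commands. If the exporter fails with the following error:
mntr is not executed because it is not in the whitelist

the same error can be reproduced with telnet:
telnet 127.0.0.1 2181
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
mntr
mntr is not executed because it is not in the whitelist.

Allow the commands in zoo.cfg, then restart the server:
zoo.cfg
4lw.commands.whitelist=*
./zkServer.sh restart
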
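
The telnet check above can also be scripted. The following is a rough sketch, assuming the server replies to mntr with tab-separated key/value lines and closes the connection afterwards; the function names send_four_letter_word and parse_mntr are hypothetical helpers, not part of any exporter.

```python
import socket


def send_four_letter_word(host, port, command="mntr", timeout=5.0):
    """Send a four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(reply):
    """Turn a mntr reply into a dict of strings; raise if not whitelisted."""
    if "is not in the whitelist" in reply:
        raise PermissionError(reply.strip())
    stats = {}
    for line in reply.splitlines():
        # Assumption: each line is "key<TAB>value".
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats
```

For example, parse_mntr(send_four_letter_word("127.0.0.1", 2181)) raises PermissionError until mntr is allowed by 4lw.commands.whitelist, and returns the zk_* values as strings once it is.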
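Key metrics to watch
zk_outstanding_requests          number of queued (outstanding) requests
zk_pending_syncs                 sync operations still pending
zk_avg_latency                   average response latency
zk_open_file_descriptor_count    number of open file descriptors
zk_max_file_descriptor_count     maximum number of file descriptors
zk_up                            1 when the server is up
zk_server_state                  leader/follower state
zk_num_alive_connections         number of alive connections
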
Alert rule settings (for reference)
groups:
- name: zookeeperStatsAlert
  rules:
  - alert: Too many outstanding requests
    expr: avg(zk_outstanding_requests) by (instance) > 10
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} "
      description: "Too many outstanding requests"
  - alert: Too many pending syncs
    expr: avg(zk_pending_syncs) by (instance) > 10
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} "
      description: "Too many pending syncs"
  - alert: Average response latency too high
    expr: avg(zk_avg_latency) by (instance) > 10
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} "
      description: 'Average response latency too high'
  - alert: Open file descriptors above 85% of the system limit
    expr: zk_open_file_descriptor_count > zk_max_file_descriptor_count * 0.85
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} "
      description: 'Open file descriptors above 85% of the system limit'
  - alert: ZooKeeper server down
    expr: zk_up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} "
      description: 'ZooKeeper server down'
  - alert: ZooKeeper leader missing
    # absent() returns 1 only when no series with state="leader" exists
    expr: absent(zk_server_state{state="leader"}) == 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} "
      description: 'ZooKeeper leader missing'
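
The numeric thresholds in these rules can be tried out against a single mntr sample before loading them into Prometheus. The sketch below is an illustration only: check_thresholds and its result labels are hypothetical, the input is assumed to be a dict of zk_* values such as one parsed from mntr output, and the `for: 1m` hold time is not modelled.

```python
def check_thresholds(stats, limit=10, fd_ratio=0.85):
    """Return the names of the conditions that a single sample violates."""

    def num(key):
        # Values may arrive as strings; a missing key is treated as 0.
        return float(stats.get(key, 0))

    fired = []
    if num("zk_outstanding_requests") > limit:
        fired.append("outstanding_requests")
    if num("zk_pending_syncs") > limit:
        fired.append("pending_syncs")
    if num("zk_avg_latency") > limit:
        fired.append("avg_latency")
    # Same comparison as the file descriptor rule: open > max * 0.85.
    max_fds = num("zk_max_file_descriptor_count")
    if max_fds > 0 and num("zk_open_file_descriptor_count") > max_fds * fd_ratio:
        fired.append("file_descriptors")
    return fired
```

An empty list means the sample stays under every threshold; the limit and fd_ratio defaults mirror the values 10 and 0.85 used in the rules.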
