Monitoring a Personal Website in Practice
prometheus grafana node-exporter
prometheus scrapes and stores the metrics
grafana queries prometheus and visualizes the data
node-exporter monitors the host and exposes host metrics for prometheus to scrape
Download and install
https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
https://github.com/prometheus/prometheus/releases/download/v2.5.0/prometheus-2.5.0.linux-amd64.tar.gz
https://dl.grafana.com/oss/release/grafana-6.6.2.linux-amd64.tar.gz
Programs written in Go are easy to deploy: unpack the archive and start the binary directly
node_exporter
nohup ./node_exporter &
http://127.0.0.1:9100
prometheus
nohup ./prometheus --web.listen-address="0.0.0.0:8010" --web.enable-lifecycle &
http://127.0.0.1:8010
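If the stored data is corrupted, prometheus may fail to start; deleting the data directory and starting again works, at the cost of losing the collected history
prometheus scrape configuration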
prometheus.yml
https://gitee.com/dyyx/hellocode/blob/master/web/code/mymonitor/prometheus.yml
# metrics_path defaults to '/metrics'
# metrics_path: '/actuator/prometheus'
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:8010']
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'myweb'
    static_configs:
      - targets: ['localhost']
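After changing the configuration, reload it (the endpoint is available because prometheus was started with --web.enable-lifecycle)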
curl -X POST http://localhost:8010/-/reload
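The same reload can be triggered from Java. This is only a sketch: it assumes Java 11+ for java.net.http, and the class name ReloadPrometheus and the method reloadRequest are made up for illustration. The base URL is the one used in this article.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReloadPrometheus {
    // Builds the same request as: curl -X POST <baseUrl>/-/reload
    static HttpRequest reloadRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/-/reload"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes prometheus listens on localhost:8010 as configured above.
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(reloadRequest("http://localhost:8010"),
                      HttpResponse.BodyHandlers.discarding());
        // Status 200 means the new configuration was loaded.
        System.out.println("reload status: " + response.statusCode());
    }
}
```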
View the scrape targets at /targets
grafana
nohup ./bin/grafana-server web &
http://127.0.0.1:3000
Default login admin/admin
Reset the password
./grafana-cli admin reset-admin-password 12345678
Dashboard configuration
Custom host-monitoring dashboard
https://gitee.com/dyyx/hellocode/blob/master/web/tech/grafana/grafana-host-monitor.json
Good ready-made dashboards that can be imported directly
Node Exporter for Prometheus Dashboard CN v20191102
https://grafana.com/grafana/dashboards/8919
Node Exporter Full
https://grafana.com/grafana/dashboards/1860
grafana includes a tool for running queries
Explore
It runs a query and shows the result as a graph or a table
Click the Run Query button in the upper right
page_pv
page_pv{job="myweb",page="article_detail"}
rate(page_pv{job="myweb",page="article_detail"}[1m])
irate(page_pv{job="myweb",page="article_detail"}[1m])
irate(total_pv[1m])
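The range window must not be shorter than the scrape interval, because rate() and irate() need at least two samples inside the window
Otherwise grafana shows: No datapoints found.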
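Why a window that is too short yields no data can be sketched in a few lines of Java. This is a simplified model of an instant rate, not the prometheus implementation; the class IrateWindow, the method irate and the 15 s scrape interval are assumptions made for the example.

```java
import java.util.OptionalDouble;

public class IrateWindow {
    // Simplified instant rate: uses the last two samples inside (now - window, now].
    // Returns empty when the window holds fewer than two samples.
    static OptionalDouble irate(long[] ts, double[] values, long now, long window) {
        int prev = -1, last = -1;
        for (int i = 0; i < ts.length; i++) {
            if (ts[i] > now - window && ts[i] <= now) {
                prev = last;
                last = i;
            }
        }
        if (prev < 0) {
            return OptionalDouble.empty();
        }
        double delta = values[last] - values[prev];
        if (delta < 0) {
            delta = values[last]; // counter reset: count from zero again
        }
        return OptionalDouble.of(delta / (ts[last] - ts[prev]));
    }

    public static void main(String[] args) {
        // Assumed scrape interval: 15 s (timestamps in seconds).
        long[] ts = {0, 15, 30};
        double[] values = {100, 130, 190};
        System.out.println(irate(ts, values, 30, 10)); // window shorter than the interval
        System.out.println(irate(ts, values, 30, 60)); // window spans several scrapes
    }
}
```

With a 10 s window only one sample falls inside, so no rate can be computed; with a 60 s window the last two samples give a per-second rate.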
irate(gcinfo_time{name="Copy"}[5m]) / irate(gcinfo_count{name="Copy"}[5m])
total_pv
page_pv
irate(total_pv[1m])
irate(page_pv[1m])
Number of requests whose response time exceeded 200 ms
sum(dataServiceHttpService_200_total + dataServiceHttpService_500_total + dataServiceHttpService_1000_total)
Total TPS
sum(irate(dataServiceHttpService_seconds_count[1m]))
sum(jvm_threads_states_threads{state="runnable",job='xxx'})
sum(jvm_threads_states_threads{state="blocked",job='xxx'})
sum(jvm_threads_states_threads{state="waiting",job='xxx'})
sum(jvm_threads_states_threads{state="timed-waiting",job='xxx'})
Average response time across all instances of the cluster (ms)
1000 * sum(irate(dataServiceHttpService_seconds_sum[1m])) / sum(irate(dataServiceHttpService_seconds_count[1m]))
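The arithmetic behind this query can be checked with a small Java example. The numbers are invented, and the class and method names are hypothetical. It shows that dividing the summed rates weights each instance by its traffic, unlike a plain mean of per-instance averages.

```java
public class ClusterAvgLatency {
    static double sum(double[] xs) {
        double total = 0;
        for (double x : xs) {
            total += x;
        }
        return total;
    }

    // Mirrors: 1000 * sum(rate of _seconds_sum) / sum(rate of _seconds_count)
    static double clusterAvgMillis(double[] secondsSumRate, double[] countRate) {
        return 1000 * sum(secondsSumRate) / sum(countRate);
    }

    // Unweighted mean of the per-instance averages, shown only for contrast.
    static double meanOfAveragesMillis(double[] secondsSumRate, double[] countRate) {
        double total = 0;
        for (int i = 0; i < countRate.length; i++) {
            total += 1000 * secondsSumRate[i] / countRate[i];
        }
        return total / countRate.length;
    }

    public static void main(String[] args) {
        double[] secondsSumRate = {0.5, 1.0}; // seconds of latency accumulated per second
        double[] countRate = {50, 10};        // requests per second
        System.out.println(clusterAvgMillis(secondsSumRate, countRate));
        System.out.println(meanOfAveragesMillis(secondsSumRate, countRate));
    }
}
```

Here the busy instance averages 10 ms and the quiet one 100 ms. The traffic-weighted result is 25 ms, while the unweighted mean would report 55 ms.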
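When irate() is combined with an aggregation operator (for example sum) or with a function that aggregates over time (any function ending in _over_time), always apply irate() first and aggregate afterwards.
Otherwise irate() cannot detect counter resets when a target restarts.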
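The effect of the ordering can be reproduced with two invented counters, one of which restarts. This is a simplified model of reset handling, not prometheus code; the class RateThenSum and its methods are hypothetical.

```java
public class RateThenSum {
    // Increase of a counter over consecutive samples; any drop is treated as a reset to zero.
    static double increase(double[] samples) {
        double total = 0;
        for (int i = 1; i < samples.length; i++) {
            double delta = samples[i] - samples[i - 1];
            total += delta >= 0 ? delta : samples[i];
        }
        return total;
    }

    // Element-wise sum of two equally long series (aggregating before taking the rate).
    static double[] sum(double[] a, double[] b) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] instanceA = {1000, 1010, 1020}; // steady counter
        double[] instanceB = {500, 510, 5};      // restarted, then served 5 requests
        System.out.println("rate then sum: " + (increase(instanceA) + increase(instanceB)));
        System.out.println("sum then rate: " + increase(sum(instanceA, instanceB)));
    }
}
```

Per series, the restart of instanceB is recognised and the combined increase is 35. After summing first, the drop from 1520 to 1025 looks like a reset of the whole sum and the increase is badly overstated.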
mempool{subtype="used"}
gcinfo_count
= exactly equal
!= not equal
=~ matches the regular expression
!~ does not match the regular expression
{__name__=~"totalLoadedClassCount|loadedClassCount|unloadedClassCount"}
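To match any one of several values, combine =~ with | ; = does not work here because it compares the whole string literally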
page_pv{page=~"about|hq"}
The grafana UI supports multi-select for such values
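The four matchers can be modelled in Java to see why = fails for "about|hq". This is an illustrative sketch with hypothetical names. Prometheus uses RE2 syntax, which differs from java.util.regex in places, but both treat a simple alternation like this the same way and both require the whole value to match.

```java
import java.util.regex.Pattern;

public class LabelMatchers {
    // Evaluates one label matcher. Regex matchers are fully anchored,
    // which is also how Pattern.matches behaves (the whole value must match).
    static boolean matches(String op, String pattern, String value) {
        switch (op) {
            case "=":
                return value.equals(pattern);
            case "!=":
                return !value.equals(pattern);
            case "=~":
                return Pattern.matches(pattern, value);
            case "!~":
                return !Pattern.matches(pattern, value);
            default:
                throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("=~", "about|hq", "about"));    // alternation
        System.out.println(matches("=", "about|hq", "about"));     // literal comparison
        System.out.println(matches("=~", "about|hq", "about_us")); // anchored match
    }
}
```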
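Open the grafana port
In the cloud console: Cloud Products > Cloud Server > Security Groups
Add a rule to the security group
Source 0.0.0.0/0
Protocol and port TCP:3000
Policy Allow
Note grafana
Resulting rules (source, protocol and port, policy, note)
0.0.0.0/0 TCP:3000 Allow grafana
0.0.0.0/0 TCP:80,443 Allow Web service ports
0.0.0.0/0 TCP:22 Allow Linux SSH login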
Expose the metrics in the prometheus text format
http://codefun007.xyz/metrics
Response header
Content-Type: text/plain;charset=utf-8
MetricsServlet
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    response.setCharacterEncoding("utf-8");
    response.setContentType("text/plain; charset=utf-8");
    String str = MetricsUtil.getMetrics();
    response.getWriter().write(str);
}
Total PV
PV per page
JVM metrics
total_pv 490537
page_pv{page="about"} 52
page_pv{page="article_detail"} 7359
page_pv{page="article_hot_list"} 64
page_pv{page="article_list"} 2897
page_pv{page="category_articles"} 638
page_pv{page="category_list"} 31
page_pv{page="dzlist"} 987
page_pv{page="fintechhome"} 26
page_pv{page="home"} 1854
peakThreadCount 22
threadCount 21
totalStartedThreadCount 36
daemonThreadCount 20
totalLoadedClassCount 3927
loadedClassCount 3927
unloadedClassCount 0
gcinfo_count{name="Copy"} 16
gcinfo_time{name="Copy"} 189
gcinfo_count{name="MarkSweepCompact"} 1
gcinfo_time{name="MarkSweepCompact"} 44
mempool{name="Code Cache",subtype="used",type="NON_HEAP"} 19647360
mempool{name="Code Cache",subtype="init",type="NON_HEAP"} 2555904
mempool{name="Code Cache",subtype="committed",type="NON_HEAP"} 20447232
mempool{name="Code Cache",subtype="max",type="NON_HEAP"} 251658240
mempool{name="Metaspace",subtype="used",type="NON_HEAP"} 29447496
mempool{name="Metaspace",subtype="init",type="NON_HEAP"} 0
mempool{name="Metaspace",subtype="committed",type="NON_HEAP"} 30277632
mempool{name="Metaspace",subtype="max",type="NON_HEAP"} -1
mempool{name="Compressed Class Space",subtype="used",type="NON_HEAP"} 2800720
mempool{name="Compressed Class Space",subtype="init",type="NON_HEAP"} 0
mempool{name="Compressed Class Space",subtype="committed",type="NON_HEAP"} 3014656
mempool{name="Compressed Class Space",subtype="max",type="NON_HEAP"} 1073741824
mempool{name="Eden Space",subtype="used",type="HEAP"} 31875488
mempool{name="Eden Space",subtype="init",type="HEAP"} 71630848
mempool{name="Eden Space",subtype="committed",type="HEAP"} 71696384
mempool{name="Eden Space",subtype="max",type="HEAP"} 195756032
mempool{name="Survivor Space",subtype="used",type="HEAP"} 4353232
mempool{name="Survivor Space",subtype="init",type="HEAP"} 8912896
mempool{name="Survivor Space",subtype="committed",type="HEAP"} 8912896
mempool{name="Survivor Space",subtype="max",type="HEAP"} 24444928
mempool{name="Tenured Gen",subtype="used",type="HEAP"} 47241976
mempool{name="Tenured Gen",subtype="init",type="HEAP"} 178978816
mempool{name="Tenured Gen",subtype="committed",type="HEAP"} 178978816
mempool{name="Tenured Gen",subtype="max",type="HEAP"} 489357312
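The source of MetricsUtil is not shown in this article, so the following is only a guess at how the total_pv and page_pv lines above could be rendered. The class MetricsTextSketch and its methods are hypothetical. The escaping rule for label values (backslash, double quote, newline) comes from the prometheus text exposition format.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetricsTextSketch {
    // Renders a total counter plus one labelled sample per page, one sample per line.
    static String render(long totalPv, Map<String, Long> pagePv) {
        StringBuilder sb = new StringBuilder();
        sb.append("total_pv ").append(totalPv).append('\n');
        // A TreeMap copy gives a stable, sorted order of pages.
        for (Map.Entry<String, Long> e : new TreeMap<>(pagePv).entrySet()) {
            sb.append("page_pv{page=\"").append(escape(e.getKey())).append("\"} ")
              .append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Label values must escape backslash, double quote and newline.
    static String escape(String v) {
        return v.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        Map<String, Long> pages = new TreeMap<>();
        pages.put("about", 52L);
        pages.put("home", 1854L);
        System.out.print(render(490537L, pages));
    }
}
```

In a real servlet the counters would have to be thread-safe (for example AtomicLong or LongAdder), which this sketch leaves out.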