Prometheus FAQ: Key Points

https://prometheus.io/docs/introduction/faq/

Comparison with other monitoring systems
https://prometheus.io/docs/introduction/comparison/

The main Prometheus server runs standalone and has no external dependencies.

It runs standalone and needs no additional external dependencies.

Most Prometheus components are written in Go.
In short: Go is the implementation language for most components.

Why do you pull rather than push?
Pulling over HTTP offers a number of advantages:

You can run your monitoring on your laptop when developing changes.
You can more easily tell if a target is down.
You can manually go to a target and inspect its health with a web browser.
Overall, we believe that pulling is slightly better than pushing, but it should not be considered a major point when considering a monitoring system.

For cases where you must push, we offer the Pushgateway.

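Pull mode over HTTP and its advantages:
Developer friendly
Easy to tell whether a monitored target is down
Easy to inspect a target's health from a browser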

Overall, pull is slightly better than push, but it should not be treated as a major factor when evaluating a monitoring system.

Push mode is supported through the Pushgateway.
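
To make the pull model concrete, here is a minimal sketch of a scrape target built only on the Python standard library. The metric name `demo_requests_total`, the `COUNTERS` dictionary, and port 8000 are assumptions made for illustration; a real service would normally use one of the official client libraries instead of rendering the text format by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; not part of any Prometheus library.
COUNTERS = {"demo_requests_total": 0}


def render_metrics(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves /metrics so a Prometheus server (or a browser) can pull it."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        COUNTERS["demo_requests_total"] += 1
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on 127.0.0.1:<port>."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8000/metrics` in a browser shows the same text a scrape would receive, which is the "inspect a target by hand" advantage listed above.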

How to feed logs into Prometheus?
Short answer: Don't! Use something like the ELK stack instead.

For log collection, use the ELK stack.

Longer answer: Prometheus is a system to collect and process metrics, not an event logging system.
The Raintank blog post Logs and Metrics and Graphs, Oh My! provides more details about the differences between logs and metrics.

https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/

Can I create dashboards?
Yes, we recommend Grafana for production usage. There are also Console templates.

Can I change the timezone? Why is everything in UTC?
To avoid any kind of timezone confusion, especially when the so-called daylight saving time is involved,
we decided to exclusively use Unix time internally and UTC for display purposes in all components of Prometheus.

To avoid time zone confusion, especially where daylight saving time is involved, Prometheus uses Unix time internally and UTC for display in all of its components.
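
Because timestamps are plain Unix time, converting them for local display is something a client can do on its own side. The sketch below shows one way to do that with the standard library; the helper names `to_utc` and `to_fixed_offset` and the sample timestamp are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def to_utc(unix_seconds):
    """Interpret a Unix timestamp as an aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def to_fixed_offset(unix_seconds, hours):
    """Show the same instant in a fixed UTC offset (no DST rules applied)."""
    return to_utc(unix_seconds).astimezone(timezone(timedelta(hours=hours)))


if __name__ == "__main__":
    sample = 1700000000  # arbitrary sample timestamp
    print(to_utc(sample).isoformat())
    print(to_fixed_offset(sample, 8).isoformat())
```

Both values describe the same instant; only the rendering differs, which is why storing Unix time and displaying UTC avoids ambiguity.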

There are a number of client libraries for instrumenting your services with Prometheus metrics.

https://prometheus.io/docs/instrumenting/clientlibs/

Monitoring machines, network devices, and batch jobs

Pushgateway

Can I monitor machines?
Yes, the Node Exporter exposes an extensive set of machine-level metrics on Linux and other Unix systems such as CPU usage, memory, disk utilization, filesystem fullness, and network bandwidth.

Can I monitor network devices?
Yes, the SNMP Exporter allows monitoring of devices that support SNMP.

Can I monitor batch jobs?
Yes, using the Pushgateway. See also the best practices for monitoring batch jobs.
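
As a rough illustration of how a batch job might hand its results to a Pushgateway, the sketch below builds (but does not send) an HTTP `PUT` to the gateway's `/metrics/job/<job>` path. The gateway address `http://localhost:9091`, the job name, and the metric names are assumptions for this example; the official client libraries provide ready-made push helpers that are usually a better choice.

```python
import urllib.parse
import urllib.request


def build_push_request(gateway_url, job, metrics):
    """Build a PUT request that replaces the metrics stored for one job."""
    # One "name value" line per metric, in the text exposition format.
    body = "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))
    job_path = urllib.parse.quote(job, safe="")
    url = f"{gateway_url.rstrip('/')}/metrics/job/{job_path}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


if __name__ == "__main__":
    # Hypothetical job and metric names.
    request = build_push_request(
        "http://localhost:9091",
        "nightly_backup",
        {"backup_last_success_timestamp_seconds": 1700000000},
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the push; Prometheus then scrapes the Pushgateway like any other target.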

Can I monitor JVM applications via JMX?
Yes, for applications that you cannot instrument directly with the Java client,
you can use the JMX Exporter either standalone or as a Java Agent.

https://github.com/prometheus/jmx_exporter

JVM monitoring options:
Instrumenting the code directly
JMX Exporter
Java Agent

See the list of exporters and integrations:
https://prometheus.io/docs/instrumenting/exporters/

https://prometheus.io/docs/instrumenting/clientlibs/

https://github.com/prometheus/client_java

What is the performance impact of instrumentation?
Performance across client libraries and languages may vary.
For Java, benchmarks indicate that incrementing a counter/gauge with the Java client will take 12-17ns, depending on contention.
This is negligible for all but the most latency-critical code.

Performance impact of instrumentation:
negligible except in the most latency-critical code.
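
One way to get a feel for instrumentation overhead is to time a counter increment directly. The sketch below does this for a hand-written, lock-protected counter in Python; the `Counter` class and `ns_per_inc` helper are stand-ins written for this example, not the Prometheus client, and the numbers it produces are not comparable to the Java figures quoted above.

```python
import threading
import timeit


class Counter:
    """Minimal thread-safe counter used only for this measurement."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        with self._lock:
            self._value += amount

    @property
    def value(self):
        return self._value


def ns_per_inc(iterations=100_000):
    """Average cost of one uncontended inc() call, in nanoseconds."""
    counter = Counter()
    seconds = timeit.timeit(counter.inc, number=iterations)
    return seconds / iterations * 1e9


if __name__ == "__main__":
    print(f"{ns_per_inc():.0f} ns per increment")
```

Results vary by machine and interpreter, so treat the output as an order-of-magnitude check rather than a benchmark.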

Why are all sample values 64-bit floats?
We restrained ourselves to 64-bit floats to simplify the design.
64-bit floats are used to keep the design simple.
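
A practical consequence of using 64-bit floats for every sample is that integers are only represented exactly up to 2^53. The short check below illustrates this with Python's `float`, which is an IEEE 754 double; the helper name `survives_float64` is made up for this example.

```python
import struct

# Largest power of two up to which every integer fits exactly in a double.
MAX_EXACT_INT = 2 ** 53


def survives_float64(n):
    """Return True if integer n round-trips through a 64-bit float unchanged."""
    return int(float(n)) == n


if __name__ == "__main__":
    print(len(struct.pack("<d", 1.0)))          # size of a double in bytes
    print(survives_float64(MAX_EXACT_INT))      # True
    print(survives_float64(MAX_EXACT_INT + 1))  # False
```

Counters rarely reach that range, but very large identifiers or nanosecond timestamps stored as sample values can lose precision.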

Why don't the Prometheus server components support TLS or authentication?
Per this FAQ entry, the server components do not support TLS or authentication themselves.

Adding Basic Auth to Prometheus with Nginx.
https://www.robustperception.io/adding-basic-auth-to-prometheus-with-nginx
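
Once a reverse proxy such as Nginx enforces Basic Auth in front of Prometheus, clients have to send an `Authorization` header with each request. The sketch below builds such a request against the `/api/v1/query` endpoint without sending it; the host `prom.example.com` and the credentials are placeholders invented for this example.

```python
import base64
import urllib.parse
import urllib.request


def basic_auth_header(user, password):
    """Encode credentials as an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_query_request(base_url, promql, user, password):
    """Build a GET request for an instant query through the authenticating proxy."""
    query = urllib.parse.urlencode({"query": promql})
    request = urllib.request.Request(f"{base_url.rstrip('/')}/api/v1/query?{query}")
    request.add_header("Authorization", basic_auth_header(user, password))
    return request


if __name__ == "__main__":
    # Placeholder host and credentials.
    req = build_query_request("https://prom.example.com", "up", "user", "password")
    print(req.full_url)
```

Basic Auth only encodes the credentials, so the proxy should also terminate HTTPS.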

Troubleshooting

After an unclean shutdown, Prometheus starts in recovery mode, so startup is somewhat slower.
