Flink backpressure monitoring
Scenario: Flink CDC 3.0 data synchronization from MySQL to Doris
https://nightlies.apache.org/flink/flink-docs-master/zh/docs/ops/monitoring/back_pressure/
A backpressure warning on a task means that the task is producing data faster than its downstream task can consume it.
(flink_taskmanager_job_task_isBackPressured and on(job_id) (increase(flink_jobmanager_job_runningTime[5m]) >0)) >0
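To evaluate this expression outside the Prometheus UI, it can be sent to the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`). A minimal sketch in Python, assuming a Prometheus server at the hypothetical address `http://localhost:9090`; the helper name `build_query_url` is illustrative and not part of the original setup:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# The alert expression from above: tasks reporting backpressure,
# restricted to jobs whose runningTime increased in the last 5 minutes.
BACKPRESSURE_EXPR = (
    "(flink_taskmanager_job_task_isBackPressured "
    "and on(job_id) (increase(flink_jobmanager_job_runningTime[5m]) >0)) >0"
)


def build_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": expr})


if __name__ == "__main__":
    # http://localhost:9090 is an assumed address; replace it with your server.
    print(build_query_url("http://localhost:9090", BACKPRESSURE_EXPR))
```

The resulting URL can be fetched with any HTTP client; the expression is URL-encoded so that brackets and spaces survive the request.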
When backpressure appears, the usual cause is that writes to Doris have become slow or are timing out:
org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - stream load error with 192.168.0.5:8040, to retry, cause by
java.net.SocketException: Connection timed out (Read failed)
Clicking the source operator and opening the BackPressure tab shows:
Back Pressure Status: HIGH
flink_taskmanager_job_task_numRecordsIn{job_id='45e5c8208c518b8a4e004ceb37ccc573'}
flink_taskmanager_job_task_numRecordsIn{app="pushgateway", host="192_168_0_10", job="flink25e38edac6f4d98a8f84d16c398ac2cd", job_id="45e5c8208c518b8a4e004ceb37ccc573", job_name="sync_001", subtask_index="0", task_attempt_id="0dac5d0a80958eca9de2f590655e9fa0_d40592faea9b13cc59503ebfb2b12986_0_13", task_attempt_num="13", task_id="d40592faea9b13cc59503ebfb2b12986", task_name="PostPartition____Sink_Writer:_Flink_CDC_Event_Sink:_doris", tm_id="192_168_0_10:43589_688cb3"} 3221
flink_taskmanager_job_task_numRecordsOut{job_id='45e5c8208c518b8a4e004ceb37ccc573'}
flink_taskmanager_job_task_numRecordsOut{app="pushgateway", host="192_168_0_10", job="flink25e38edac6f4d98a8f84d16c398ac2cd", job_id="45e5c8208c518b8a4e004ceb37ccc573", job_name="sync_001", subtask_index="0", task_attempt_id="0dac5d0a80958eca9de2f590655e9fa0_cbc357ccb763df2852fee8c4fc7d55f2_0_13", task_attempt_num="13", task_id="cbc357ccb763df2852fee8c4fc7d55f2", task_name="Source:_Flink_CDC_Event_Source:_mysql____Route____SchemaOperator____PrePartition", tm_id="192_168_0_10:43589_688cb3"} 3740
Name | Status | RecordsReceived | RecordsSent
Source: Flink CDC Event Source: mysql -> Route -> SchemaOperator -> PrePartition | RUNNING | 0 | 3740
PostPartition -> Sink Writer: Flink CDC Event Sink: doris | RUNNING | 3221 | 0
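The two counters give a rough sense of how far the sink lags behind the source. A small sketch of the arithmetic using the values shown above; the function name `records_in_flight` is illustrative, and the result is only an approximation because it reflects whatever was buffered between the two tasks at sampling time:

```python
def records_in_flight(source_records_out: int, sink_records_in: int) -> int:
    """Records emitted by the source task but not yet received by the sink task."""
    return source_records_out - sink_records_in


# Values taken from the metrics above (numRecordsOut of the source task,
# numRecordsIn of the sink task).
gap = records_in_flight(source_records_out=3740, sink_records_in=3221)
print(gap)  # 519
```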
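task_attempt_num="13" is the task attempt number, which reflects how many times the task has been restarted. The job-level restart count is available as: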
flink_jobmanager_job_numRestarts
flink_jobmanager_job_numRestarts and on(job,job_id) (increase(flink_jobmanager_job_runningTime[5m]) >0)
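The `and on(job,job_id)` clause keeps a restart-count sample only when a sample with the same `job` and `job_id` labels exists on the right-hand side, that is, only for jobs whose running time grew over the last five minutes. The following Python sketch mimics that matching on hand-written data; the samples and the helper `and_on` are hypothetical and only illustrate the semantics, they are not Prometheus code:

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]  # (labels, value)


def and_on(left: List[Sample], right: List[Sample], keys: List[str]) -> List[Sample]:
    """Keep left samples whose values for `keys` match some right sample."""
    right_keys = {tuple(labels.get(k, "") for k in keys) for labels, _ in right}
    return [
        (labels, value)
        for labels, value in left
        if tuple(labels.get(k, "") for k in keys) in right_keys
    ]


# Hypothetical data: two jobs with restart counts, only one still running.
restarts = [
    ({"job": "flink-a", "job_id": "1"}, 13.0),
    ({"job": "flink-b", "job_id": "2"}, 4.0),
]
running_time_increase = [
    ({"job": "flink-a", "job_id": "1"}, 300000.0),
    ({"job": "flink-b", "job_id": "2"}, 0.0),
]
# Equivalent of `increase(...) > 0`: drop samples that are not greater than zero.
still_running = [s for s in running_time_increase if s[1] > 0]

result = and_on(restarts, still_running, ["job", "job_id"])
print(result)
```

Only the sample for the job that is still running survives, so restart counts of finished or cancelled jobs no longer show up in the result.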