
Flink backpressure monitoring
Flink CDC 3.0 data synchronization from MySQL to Doris

https://nightlies.apache.org/flink/flink-docs-master/zh/docs/ops/monitoring/back_pressure/


A backpressure warning on a task means that the task produces data faster than its downstream task can consume it.


(flink_taskmanager_job_task_isBackPressured  and on(job_id)  (increase(flink_jobmanager_job_runningTime[5m]) >0)) >0
When backpressure appears, the usual cause is that writes to Doris have become slow and are timing out:
org.apache.doris.flink.sink.batch.DorisBatchStreamLoad       [] - stream load error with 192.168.0.5:8040, to retry, cause by
java.net.SocketException: Connection timed out (Read failed)
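
The isBackPressured expression above can also be evaluated from a script instead of the Prometheus UI. The following is a minimal sketch that sends it to the Prometheus HTTP API instant-query endpoint (/api/v1/query) and lists the tasks that come back. The Prometheus address passed to fetch_backpressured_tasks and the helper names are assumptions for illustration, not part of the original setup.

```python
import json
import urllib.parse
import urllib.request

# The PromQL expression from the text: tasks reporting backpressure,
# restricted to jobs whose runningTime increased in the last 5 minutes.
BACKPRESSURE_QUERY = (
    "(flink_taskmanager_job_task_isBackPressured and on(job_id) "
    "(increase(flink_jobmanager_job_runningTime[5m]) > 0)) > 0"
)


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def backpressured_tasks(response):
    """Extract (job_name, task_name, subtask_index) from a parsed query response."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % response.get("error"))
    tasks = set()
    for sample in response["data"]["result"]:
        labels = sample["metric"]
        tasks.add((
            labels.get("job_name", ""),
            labels.get("task_name", ""),
            labels.get("subtask_index", ""),
        ))
    return sorted(tasks)


def fetch_backpressured_tasks(base_url):
    """Run the query against a Prometheus server (base_url is an assumption)."""
    url = build_query_url(base_url, BACKPRESSURE_QUERY)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return backpressured_tasks(json.load(resp))
```

For example, fetch_backpressured_tasks("http://localhost:9090") would return the backpressured tasks if a Prometheus server were listening at that hypothetical address.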


Clicking the source operator and opening the BackPressure tab shows:
Back Pressure Status: HIGH



flink_taskmanager_job_task_numRecordsIn{job_id='45e5c8208c518b8a4e004ceb37ccc573'}
flink_taskmanager_job_task_numRecordsIn{app="pushgateway", host="192_168_0_10", job="flink25e38edac6f4d98a8f84d16c398ac2cd", job_id="45e5c8208c518b8a4e004ceb37ccc573", job_name="sync_001", subtask_index="0", task_attempt_id="0dac5d0a80958eca9de2f590655e9fa0_d40592faea9b13cc59503ebfb2b12986_0_13", task_attempt_num="13", task_id="d40592faea9b13cc59503ebfb2b12986", task_name="PostPartition____Sink_Writer:_Flink_CDC_Event_Sink:_doris", tm_id="192_168_0_10:43589_688cb3"}  	3221


flink_taskmanager_job_task_numRecordsOut{job_id='45e5c8208c518b8a4e004ceb37ccc573'}
flink_taskmanager_job_task_numRecordsOut{app="pushgateway", host="192_168_0_10", job="flink25e38edac6f4d98a8f84d16c398ac2cd", job_id="45e5c8208c518b8a4e004ceb37ccc573", job_name="sync_001", subtask_index="0", task_attempt_id="0dac5d0a80958eca9de2f590655e9fa0_cbc357ccb763df2852fee8c4fc7d55f2_0_13", task_attempt_num="13", task_id="cbc357ccb763df2852fee8c4fc7d55f2", task_name="Source:_Flink_CDC_Event_Source:_mysql____Route____SchemaOperator____PrePartition", tm_id="192_168_0_10:43589_688cb3"}   3740


Name	Status	RecordsReceived	RecordsSent
Source: Flink CDC Event Source: mysql -> Route -> SchemaOperator -> PrePartition	RUNNING	0	3740
PostPartition -> Sink Writer: Flink CDC Event Sink: doris	RUNNING	3221	0
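
One way to read these two counters together is to subtract them. The sketch below does this with the values shown above, under the assumption that both counters were sampled at roughly the same moment. The function name is hypothetical, and the result is only a rough indication of how many records the source has emitted that the sink has not yet received.

```python
def records_not_yet_received(records_sent_upstream, records_received_downstream):
    """Difference between what the upstream task sent and what the downstream task received."""
    return records_sent_upstream - records_received_downstream


# Values taken from the numRecordsOut (source) and numRecordsIn (sink) samples.
gap = records_not_yet_received(3740, 3221)
print(gap)  # 519
```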


task_attempt_num="13" is the number of times the task has been restarted.
flink_jobmanager_job_numRestarts
flink_jobmanager_job_numRestarts  and on(job,job_id)  (increase(flink_jobmanager_job_runningTime[5m]) >0)
