主要记录Flink使用过程中的一些问题,不定时更新
软件版本
Apache Flink 1.13.2
Apache Hadoop 3.2.2
Apache Hive 3.1.2
Apache Hudi GitHub-master(2021-11-03)
Apache Kafka 2.4.1
准生产环境,均采用完全分布式部署,Flink使用standalone集群
问题记录
关键报错:org.apache.kafka.common.errors.UnsupportedVersionException: Attempted to write a non-default producerId at version 2
问题描述:使用mysql cdc同步数据至Kafka,第一次提交任务可以正常运行,当使用flink stop命令停掉任务后,从savepoint启动任务,任务能够提交成功,但是无法启动,taskmanager日志有如下报错:
2021-11-03 11:04:28.060 [Source: Custom Source -> Sink: sink_Kafka (1/1)#914] WARN org.apache.flink.runtime.taskmanager.Task - Source: Custom Source -> Sink: sink_Kafka (1/1)#914 (d174f4c6f58d2ad3349a2583c28458a6) switched from INITIALIZING to FAILED with failure cause: org.apache.kafka.common.errors.UnsupportedVersionException: Attempted to write a non-default producerId at version 2 Suppressed: java.lang.NullPointerException at com.ververica.cdc.debezium.DebeziumSourceFunction.cancel(DebeziumSourceFunction.java:492) at org.apache.flink.streaming.api.operators.StreamSource.cancel(StreamSource.java:160) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cancelTask(SourceStreamTask.java:210) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cancelTask(SourceStreamTask.java:191) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUponFail(StreamTask.java:653) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at com.ververica.cdc.debezium.DebeziumSourceFunction.cancel(DebeziumSourceFunction.java:492) at com.ververica.cdc.debezium.DebeziumSourceFunction.close(DebeziumSourceFunction.java:497) at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:41) at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117) at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:861) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:840) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:753) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cleanUpInvoke(SourceStreamTask.java:186) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUponFail(StreamTask.java:659) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) at java.lang.Thread.run(Thread.java:748)
解决办法:经过排查,该报错属于kafka-clients包,检查项目的pom文件,发现kafka-clients包的依赖为provided,而Flink的lib目录下没有kafka-clients包,由于Flink集群暂时无法重启,因此修改项目pom文件,注释掉provided使用默认依赖范围,重新打包并从savepoint启动任务,任务可以正常启动



