
Flink CDC 2.1.0 Release Test

Dependency:

<dependency>
    <groupId>com.ververica</groupId>
    <artifactId>flink-sql-connector-mysql-cdc</artifactId>
    <version>2.1.0</version>
    <scope>provided</scope>
</dependency>
1. The simplest example:

package com.ververica.cdc.connectors.mysql.source;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import com.ververica.cdc.connectors.mysql.testutils.UniqueDatabase;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.junit.Ignore;
import org.junit.Test;


public class MySqlSourceExampleTest extends MySqlSourceTestBase {

  

    @Test
    @Ignore("Test ignored because it won't stop and is used for manual test")
    public void testConsumingAllEvents() throws Exception {
        inventoryDatabase.createAndInitialize();
        MySqlSource<String> mySqlSource =
                MySqlSource.<String>builder()
                        .hostname(MYSQL_CONTAINER.getHost())
                        .port(MYSQL_CONTAINER.getDatabasePort())
                        .databaseList(inventoryDatabase.getDatabaseName())
                        .tableList(inventoryDatabase.getDatabaseName() + ".products")
                        .username(inventoryDatabase.getUsername())
                        .password(inventoryDatabase.getPassword())
                        .serverId("5401-5404")
                        .deserializer(new JsonDebeziumDeserializationSchema())
                        .includeSchemaChanges(true) // output the schema changes as well
                        .build();

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // enable checkpoint
        env.enableCheckpointing(3000);
        // set the source parallelism to 4
        env.fromSource(mySqlSource, WatermarkStrategy.noWatermarks(), "MySqlParallelSource")
                .setParallelism(4)
                .print()
                .setParallelism(1);

        env.execute("Print MySQL Snapshot + Binlog");
    }
}

includeSchemaChanges(true) turns on capturing of table schema changes.

Here I will only look at some of the MySQL CDC upgrades:

  • Support for all MySQL data types

    Including complex types such as enums, arrays, and geospatial types.

  • Support for metadata columns

    In Flink DDL, users can access metadata such as the database name (database_name), table name (table_name), and change timestamp (op_ts) via db_name STRING METADATA FROM 'database_name'. This is very useful for data integration in sharded database/table scenarios.

  • DataStream API with parallel reading

    In version 2.0, the lock-free algorithm and parallel reading were only exposed through the SQL API, not the DataStream API. Version 2.1 adds DataStream API support: sources are created via MySqlSourceBuilder. Users can capture data from multiple tables at once and use this to build whole-database synchronization pipelines. Schema changes can also be captured via MySqlSourceBuilder#includeSchemaChanges.

  • New monitoring metrics: currentFetchEventTimeLag, currentEmitEventTimeLag, sourceIdleTime

    These metrics follow the FLIP-33 [1] connector metrics specification; see FLIP-33 for the meaning of each metric. Among them, currentEmitEventTimeLag records the difference between the moment the Source emits a record downstream and the moment that record was generated in the database, measuring the latency from the database to leaving the Source node. Users can use this metric to tell whether the source has entered the binlog reading phase:

    • When the metric is 0, the source is still reading the full historical snapshot;

    • When it is greater than 0, the source has entered the binlog reading phase.
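As a sketch of the metadata-column syntax mentioned above: the table name bi_dev.test and its id/name2 columns come from this post, while all connection options below are placeholder values.

```sql
-- Sketch only: expose MySQL CDC metadata alongside the captured columns.
CREATE TABLE test (
    db_name STRING METADATA FROM 'database_name' VIRTUAL,
    tbl_name STRING METADATA FROM 'table_name' VIRTUAL,
    op_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
    id INT,
    name2 STRING,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'connector' = 'mysql-cdc',
    'hostname' = 'localhost',      -- placeholder
    'port' = '3306',               -- placeholder
    'username' = 'flink',          -- placeholder
    'password' = 'flink',          -- placeholder
    'database-name' = 'bi_dev',
    'table-name' = 'test'
);
```

The three METADATA columns make the originating database, table, and change time available per row, which is what enables merging shards of the same logical table into one downstream table.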

Normal data reading:

Debugging after a schema change:

Column renamed: name1 -> name2

 

The complete Struct:

Struct{source=Struct{version=1.5.4.Final,connector=mysql,name=mysql_binlog_source,ts_ms=1637117644411,db=bi_dev,table=test,server_id=1921684100,gtid=ef6f9e15-1218-11ec-997f-968db1336f14:2840388,file=mysql-bin.000056,pos=509499737,row=0},historyRecord={"source":{"file":"mysql-bin.000056","pos":509499737,"server_id":1921684100},"position":{"transaction_id":null,"ts_sec":1637117644,"file":"mysql-bin.000056","pos":509499965,"gtids":"ef6f9e15-1218-11ec-997f-968db1336f14:1-2840387","server_id":1921684100},"databaseName":"bi_dev","ddl":"ALTER TABLE `bi_dev`.`test` \r\nCHANGE COLUMN `name1` `name2` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL AFTER `id`","tableChanges":[{"type":"ALTER","id":"\"bi_dev\".\"test\"","table":{"defaultCharsetName":"utf8mb4","primaryKeyColumnNames":["id"],"columns":[{"name":"id","jdbcType":4,"typeName":"INT","typeExpression":"INT","charsetName":null,"length":11,"position":1,"optional":false,"autoIncremented":true,"generated":true},{"name":"name2","jdbcType":12,"typeName":"VARCHAR","typeExpression":"VARCHAR","charsetName":"utf8mb4","length":255,"position":2,"optional":true,"autoIncremented":false,"generated":false},{"name":"date4","jdbcType":91,"typeName":"DATE","typeExpression":"DATE","charsetName":null,"position":3,"optional":true,"autoIncremented":false,"generated":false},{"name":"datetime1","jdbcType":93,"typeName":"DATETIME","typeExpression":"DATETIME","charsetName":null,"position":4,"optional":true,"autoIncremented":false,"generated":false},{"name":"timestamp1","jdbcType":2014,"typeName":"TIMESTAMP","typeExpression":"TIMESTAMP","charsetName":null,"position":5,"optional":true,"autoIncremented":true,"generated":true}]}}]}}

The DDL statement it carries is:

"databaseName":"bi_dev","ddl":"ALTER TABLE `bi_dev`.`test`  CHANGE COLUMN `name1` `name2` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL AFTER `id`"

We run this statement again with a small change, renaming the current name2 to name3:

Check the result:

Details after the change:

 [
    {
        "type":"ALTER",
        "id":""bi_dev"."test"",
        "table":{
            "defaultCharsetName":"utf8mb4",
            "primaryKeyColumnNames":[
                "id"
            ],
            "columns":[
                {
                    "name":"id",
                    "jdbcType":4,
                    "typeName":"INT",
                    "typeexpression":"INT",
                    "charsetName":null,
                    "length":11,
                    "position":1,
                    "optional":false,
                    "autoIncremented":true,
                    "generated":true
                },
                {
                    "name":"name2",
                    "jdbcType":12,
                    "typeName":"VARCHAR",
                    "typeexpression":"VARCHAR",
                    "charsetName":"utf8mb4",
                    "length":255,
                    "position":2,
                    "optional":true,
                    "autoIncremented":false,
                    "generated":false
                },
                {
                    "name":"date4",
                    "jdbcType":91,
                    "typeName":"DATE",
                    "typeexpression":"DATE",
                    "charsetName":null,
                    "position":3,
                    "optional":true,
                    "autoIncremented":false,
                    "generated":false
                },
                {
                    "name":"datetime1",
                    "jdbcType":93,
                    "typeName":"DATETIME",
                    "typeexpression":"DATETIME",
                    "charsetName":null,
                    "position":4,
                    "optional":true,
                    "autoIncremented":false,
                    "generated":false
                },
                {
                    "name":"timestamp1",
                    "jdbcType":2014,
                    "typeName":"TIMESTAMP",
                    "typeexpression":"TIMESTAMP",
                    "charsetName":null,
                    "position":5,
                    "optional":true,
                    "autoIncremented":true,
                    "generated":true
                }
            ]
        }
    }
]

So schema-change operations are now definitely captured, and their records are structured differently from ordinary data-change records, so the consumer needs logic to tell the two apart. A follow-up post will add the complete parsing code; once a schema-change operation is detected, the corresponding change will be applied to the downstream Doris table. To be continued.
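A rough sketch of that distinguishing step, under stated assumptions: the class and method names below are my own invention, and a regex is used only to keep the sketch dependency-free. A production consumer would parse the historyRecord JSON with a real JSON library (e.g. Jackson, which Flink already ships).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pull the DDL statement out of the historyRecord JSON
// that a schema-change record carries, so downstream code can react to it.
public class SchemaChangeDdlExtractor {

    // Matches "ddl":"..." while allowing escaped characters inside the JSON string.
    private static final Pattern DDL_PATTERN =
            Pattern.compile("\"ddl\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the raw DDL string, or null when the record has no DDL field. */
    public static String extractDdl(String historyRecord) {
        Matcher m = DDL_PATTERN.matcher(historyRecord);
        if (!m.find()) {
            return null;
        }
        // Flatten the escaped line breaks MySQL keeps inside the statement text.
        return m.group(1).replace("\\r\\n", " ").replace("\\n", " ").trim();
    }

    /** True when the record describes an ALTER TABLE schema change. */
    public static boolean isAlterTable(String historyRecord) {
        String ddl = extractDdl(historyRecord);
        return ddl != null && ddl.toUpperCase().startsWith("ALTER TABLE");
    }

    public static void main(String[] args) {
        String record = "{\"databaseName\":\"bi_dev\",\"ddl\":\"ALTER TABLE `bi_dev`.`test` "
                + "CHANGE COLUMN `name1` `name2` varchar(255) NULL DEFAULT NULL AFTER `id`\"}";
        System.out.println(extractDdl(record));
        System.out.println(isAlterTable(record)); // prints true
    }
}
```

With the DDL in hand, the Doris-side step would translate it into the matching ALTER on the downstream table; that part is left for the follow-up post.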

Source: www.mshxw.com
Original article: https://www.mshxw.com/it/584311.html