Getting started with seatunnel (formerly waterdrop)

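Introduction

seatunnel is an easy-to-use, high-performance product for processing massive amounts of data. It supports both real-time streaming and offline batch processing, and is built on top of Apache Spark and Apache Flink.

Official documentation: https://interestinglab.github.io/seatunnel-docs/#/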

Installation

Installation is straightforward; follow the official documentation.

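Configuration

config.conf: the configuration below extracts data from Hive and inserts it into ClickHouse. The data source is a single Hive table. A seatunnel plugin splits the rows by the id field, and each slice is written to a different shard of the ClickHouse cluster.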

spark {
    # Use the Hive metastore as the Spark SQL catalog
    spark.sql.catalogImplementation = "hive"
    spark.app.name = "hive2clickhouse"
    spark.executor.instances = 30
    spark.executor.cores = 1
    spark.executor.memory = "2g"
    spark.ui.port = 13000
}

input {
    hive {
        # Query run against Hive; the result is registered as table_tmp
        pre_sql = "select id,name,create_time from table"
        table_name = "table_tmp"
    }
}

filter {
    # Note: data_source is not among the columns selected by pre_sql above;
    # point source_field at a column that actually exists in the input.
    convert {
        source_field = "data_source"
        new_type = "UInt8"
    }

    # Slice 0 of 2: rows destined for the first ClickHouse shard
    org.interestinglab.waterdrop.filter.Slice {
        source_table_name = "table_tmp"
        source_field = "id"
        slice_num = 2
        slice_code = 0
        result_table_name = "table_8123"
    }
    # Slice 1 of 2: rows destined for the second ClickHouse shard
    org.interestinglab.waterdrop.filter.Slice {
        source_table_name = "table_tmp"
        source_field = "id"
        slice_num = 2
        slice_code = 1
        result_table_name = "table_8124"
    }
}

output {
    # First shard: receives the rows of slice 0
    clickhouse {
        source_table_name = "table_8123"
        host = "ip1:8123"
        database = "db_name"
        username = "username"
        password = "pwd"
        table = "model_score_local"
        fields = ["id","name","create_time"]
        clickhouse.socket_timeout = 50000
        retry_codes = [209, 210]
        retry = 3
        bulk_size = 500000
    }
    # Second shard: receives the rows of slice 1
    clickhouse {
        source_table_name = "table_8124"
        host = "ip2:8123"
        database = "db_name"
        username = "username"
        password = "pwd"
        table = "model_score_local"
        fields = ["id","name","create_time"]
        clickhouse.socket_timeout = 50000
        retry_codes = [209, 210]
        retry = 3
        bulk_size = 500000
    }
}
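
The source of the Slice plugin is not shown here, so its exact rule is unknown. A minimal Python sketch of what the two Slice blocks above plausibly do, assuming integer ids and the rule "keep a row when id % slice_num == slice_code" (the function slice_rows and the sample rows are hypothetical, for illustration only):

```python
from typing import Dict, Iterable, List


def slice_rows(rows: Iterable[Dict], source_field: str,
               slice_num: int, slice_code: int) -> List[Dict]:
    """Keep the rows that belong to one slice.

    Assumed rule (not confirmed by the plugin source):
    int(row[source_field]) % slice_num == slice_code.
    """
    if not 0 <= slice_code < slice_num:
        raise ValueError("slice_code must be in [0, slice_num)")
    return [row for row in rows
            if int(row[source_field]) % slice_num == slice_code]


# Hypothetical sample data standing in for table_tmp.
rows = [{"id": i, "name": f"user{i}"} for i in range(1, 7)]

# Mirrors the two Slice blocks: same source, same slice_num, different slice_code.
table_8123 = slice_rows(rows, "id", slice_num=2, slice_code=0)
table_8124 = slice_rows(rows, "id", slice_num=2, slice_code=1)

print([r["id"] for r in table_8123])
print([r["id"] for r in table_8124])
```

Under this assumption the slices are disjoint and together cover every input row, which is what allows each clickhouse output to write to its own shard without duplicating data.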

Startup

../bin/start-waterdrop.sh --master local[4] --deploy-mode client --config ./config.conf
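
In the configuration above, every additional ClickHouse shard needs one more Slice block and one more clickhouse block. With more than a few shards it may be easier to generate those sections before starting the job. A rough sketch, where render_shard_sections, the table_slice_N naming scheme, and the host ip3:8123 are all hypothetical, and credentials and retry settings are left out for brevity:

```python
from typing import List


def render_shard_sections(hosts: List[str], source_table: str = "table_tmp",
                          source_field: str = "id") -> str:
    """Render filter/output sections with one Slice and one clickhouse block per host."""
    slice_num = len(hosts)
    filter_lines = ["filter {"]
    output_lines = ["output {"]
    for slice_code, host in enumerate(hosts):
        # Hypothetical naming scheme for the intermediate tables.
        result_table = f"table_slice_{slice_code}"
        filter_lines += [
            "    org.interestinglab.waterdrop.filter.Slice {",
            f'        source_table_name = "{source_table}"',
            f'        source_field = "{source_field}"',
            f"        slice_num = {slice_num}",
            f"        slice_code = {slice_code}",
            f'        result_table_name = "{result_table}"',
            "    }",
        ]
        output_lines += [
            "    clickhouse {",
            f'        source_table_name = "{result_table}"',
            f'        host = "{host}"',
            '        database = "db_name"',
            '        table = "model_score_local"',
            '        fields = ["id","name","create_time"]',
            "    }",
        ]
    filter_lines.append("}")
    output_lines.append("}")
    return "\n".join(filter_lines + [""] + output_lines)


# Three shards instead of two; the third host is made up for the example.
sections = render_shard_sections(["ip1:8123", "ip2:8123", "ip3:8123"])
print(sections)
```

The generated text only covers the filter and output sections; the spark and input sections, the username and password, and the retry options from the hand-written config would still need to be added before the file is passed to start-waterdrop.sh.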