
Extracting ES data into Hive with DataX

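I received a requirement: extract data from an ES (Elasticsearch) cluster into the big data platform.
First, create a corresponding table in Hive: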

create table if not exists ods.pr_es_test_orc(
    clueId STRING,
    brandId STRING,
    clueEstype STRING
)
row format delimited FIELDS TERMINATED BY '|'
STORED AS orc;

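The main points that need to be configured:
"endpoint": the address (IP and port) of the ES cluster,
"accessId": the user name,
"accessKey": the password,
"index": the index name prefix followed by * (the * is a wildcard that matches every index with that prefix),
"scroll": how long the data for each read is kept cached, e.g. 3m.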
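
As a quick sanity check for the points listed above, the following is a minimal Python sketch that reports which of those keys are missing from a job definition. The helper name `missing_reader_keys` and the sample values in `example_job` are made up for illustration and are not part of DataX.

```python
# Keys that the notes above call out as the main things to configure.
REQUIRED_READER_KEYS = ("endpoint", "accessId", "accessKey", "index", "scroll")


def missing_reader_keys(job: dict) -> list:
    """Return the required reader keys that are absent or empty in a job dict."""
    parameter = job["job"]["content"][0]["reader"]["parameter"]
    return [key for key in REQUIRED_READER_KEYS if not parameter.get(key)]


# Hypothetical minimal job with placeholder values, for illustration only.
example_job = {
    "job": {
        "content": [{
            "reader": {
                "name": "elasticsearchreader",
                "parameter": {
                    "endpoint": "http://es.example.invalid:9200",
                    "accessId": "user",
                    "accessKey": "secret",
                    "index": "myindex-*",
                },
            }
        }]
    }
}

print(missing_reader_keys(example_job))  # ['scroll']
```

In practice the dict would come from `json.load` on the job file, which also confirms that the file is valid JSON before it is handed to DataX.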

{
	"job": {
		"setting": {
			"speed": {
				"channel": 7
			}
		},
		"content": [{
			"reader": {
				"name": "elasticsearchreader",
				"parameter": {
					"endpoint": "http://XXX.XXX.XXX.XXX:9200",
					"accessId": "XXXXXXX*",
					"accessKey": "XXXXXXXXXXX",
					"index": "XXXXXX-*",
					"type": "_doc",
					"scroll": "3m",
					"headers": {
					},
					"search": [{
						"query": {
							"bool": {
								"filter": [{
									"range": {
										"createdTime": {
											"boost": 1,
											"from": "${st}",
											"include_lower": true,
											"include_upper": true,
											"to": "${et}"
										}
									}
								}]
							}
						},
						"size": 10
					}],
					"table": {
						"column": [
							{"name": "clueId"},
							{"name": "brandId"},
							{"name": "clueEstype"}
						]
					}
				}
			},
			"writer": {
				"name": "hdfswriter",
				"parameter": {
					"defaultFS": "hdfs://${hdfs}",
					"fileType": "ORC",
					"path": "/user/hive/warehouse/ods.db/pr_es_test_orc",
					"fileName": "aaaaaa",
					"column": [
						{"name": "clueId", "type": "STRING"},
						{"name": "brandId", "type": "STRING"},
						{"name": "clueEstype", "type": "STRING"}
					],
					"writeMode": "append",
					"fieldDelimiter": "|",
					"compress": "NONE"
				}
			}
		}]
	}
}
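
The job file uses the placeholders ${st}, ${et} and ${hdfs}, which have to be supplied when the job is launched. DataX's `datax.py` launcher accepts them through `-p "-Dname=value ..."`. The following is a rough Python sketch that builds such a command for a one-day window. The install path `/opt/datax/bin/datax.py`, the file name `es_to_hive.json`, the NameNode address, and the ISO-style time format for `createdTime` are all assumptions and should be adapted to the real environment.

```python
import re
from datetime import date, datetime, timedelta

# Matches ${name} placeholders such as ${st}, ${et} and ${hdfs}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def find_placeholders(job_text: str) -> list:
    """Return the distinct placeholder names in the job text, sorted."""
    return sorted(set(PLACEHOLDER.findall(job_text)))


def day_window(day: date) -> tuple:
    """Return (start, end) timestamps covering one whole day.

    Assumption: createdTime accepts ISO-style strings without spaces.
    """
    start = datetime.combine(day, datetime.min.time())
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return start.isoformat(), end.isoformat()


def build_command(job_path, job_text, values,
                  datax_py="/opt/datax/bin/datax.py"):  # assumed install path
    """Build the argument list for launching DataX with -D parameters."""
    names = find_placeholders(job_text)
    missing = [name for name in names if name not in values]
    if missing:
        raise ValueError("no value supplied for: " + ", ".join(missing))
    params = " ".join("-D{}={}".format(name, values[name]) for name in names)
    return ["python", datax_py, "-p", params, job_path]


# Shortened stand-in for the job file shown above.
job_text = '{"from": "${st}", "to": "${et}", "defaultFS": "hdfs://${hdfs}"}'
st, et = day_window(date(2021, 10, 20))
command = build_command(
    "es_to_hive.json",  # hypothetical job file name
    job_text,
    {"st": st, "et": et, "hdfs": "namenode.example.invalid:8020"},
)
print(command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` from a scheduler. Because the range filter sets `include_upper` to true, the window ends at 23:59:59 rather than at the next midnight, so consecutive daily runs do not read the same boundary record twice.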