栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 前沿技术 > 大数据 > 大数据系统

2021-10-10

2021-10-10

hive动态分区表、分桶表、既分区又分桶表的应用

hive 动态分区表
partitioned by(colName Type)

create table tableName(
.......
.......
)
partitioned by (colName colType [comment '...'],...)

例如

create table dy_part1(
sid int,
name string,
gender string,
age int,
academy string
)
partitioned by (dt string)
row format delimited fields terminated by ','
;

动态分区表导入数据要建立一个临时表,不然没有分区的效果
创建临时表

create table temp_part1(
sid int,
name string,
gender string,
age int,
academy string,
dt string
)
row format delimited 
fields terminated by ','
;
加载数据
load data local inpath '路径' into table temp_part1;

动态导入数据

insert into dy_part1 partition(dt) select sid,name,gender,age,academy,dt from temp_part1;

hive 分桶表

drop table student;
create table student(
sno int,
name string,
sex string,
age int,
academy string
)
clustered by (sno) sorted by (age desc) into 4 buckets
row format delimited 
fields terminated by ','
;

创建临时表

create table temp_student(
sno int,
name string,
sex string,
age int,
academy string
)
clustered by (sno) sorted by (age desc) into 4 buckets
row format delimited 
fields terminated by ','
;
导入数据
load data local inpath '路径' into table temp_student

从临时表中导入数据

insert into table student
select * from temp_student
distribute by(sno) 
sort by (age desc)
;
或者
insert overwrite table student
select * from temp_student
distribute by(sno) 
sort by (age desc)
;

hive既分区又分桶

创建分区分桶表

create table student(
sid int,
name string,
gender string,
age int,
chinese int,
math int,
english int
)
partitioned by (academy string,dt date)
clustered by(sid) into 4 buckets
row format delimited fields terminated by 't'
注意
clustered by 要写在partitioned 的后面

导入数据建立临时表

create table student_temp(
sid int,
name string,
gender string,
age int,
academy string,
dt date,
chinese int,
math int,
english int
)row format delimited fields terminated by 't'

导入数据
load data local inpath './data/students_test' into table students_test;

从临时表中导入数据

insert into student partition (academy,dt) select sid,name,gender,age,chinese,math,english,academy,dt from student_temp

注意 分区又分桶的时候导入数据不能把分桶的字段写上如:(cluster by (sid)),会自动分桶

转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/313011.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号