Insert Data Into Hive

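Preface

While setting up Hive earlier, including integrating MyBatis with Hive, we ran into a problem: the traditional insert statement cannot be executed.

So this post summarizes several ways of getting data into Hive. These statements can also be used from MyBatis.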

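Hive does not handle single-row INSERT INTO, UPDATE, and DELETE the way a traditional relational database does. Since Hive 0.14, INSERT ... VALUES is accepted, and UPDATE and DELETE work only on transactional (ACID) tables. cuiyaonan2000@163.com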

Load

Load a local file into Hive

-- Create the Hive table
hive> create table db_hive.student(cui string, yao string, nan string) row format delimited fields terminated by '\t';


-- Load a local file into the student table of the Hive database db_hive
hive> load data local inpath '/soft/hive/apache-hive-3.1.2-bin/bin/datatest' into table db_hive.student;

The syntax is:

LOAD  DATA  [LOCAL] INPATH 'filepath'  [OVERWRITE]  INTO  TABLE  tablename
 [PARTITION (partcol1=val1, partcol2=val2 ...)]

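Notes:

local: optional. Loads from the local file system instead of from HDFS.
overwrite: optional. Deletes the existing data first, then loads the new data.
partition: all data under inpath is loaded into the named partition. Hive does not check which partition each record actually belongs to, so once a partition is specified, every loaded record lands in it.
Note: when loading from HDFS (without local), the source data under INPATH is gone after the load, because Hive moves it into the table's directory under the warehouse (/user/hive/warehouse by default). With local, the file is copied and the local original stays in place.
Partition load example: load data inpath '/tmp/test.txt' into table score partition (school="school1")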
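
To make the overwrite and partition options concrete, here is a minimal sketch. The column layout of score (student, course, points) is an assumption made for illustration; only the table name, the file path, and the school partition come from the example above.

```sql
-- Hypothetical layout for the score table; only the partition column
-- "school" is taken from the example above.
CREATE TABLE IF NOT EXISTS score (
  student STRING,
  course  STRING,
  points  INT
)
PARTITIONED BY (school STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- OVERWRITE replaces whatever is already in partition school1.
-- Every row in the file lands in that partition, whatever its content.
LOAD DATA INPATH '/tmp/test.txt'
OVERWRITE INTO TABLE score
PARTITION (school = 'school1');

-- Check what was loaded.
SELECT * FROM score WHERE school = 'school1' LIMIT 10;
```

Leaving out OVERWRITE appends the file to the data already in the partition.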

Load From HDFS

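The only difference from loading a local file is that the file already sits on HDFS. The statement is the same with the optional local keyword removed, which tells Hive the file is not being uploaded from the local file system: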

-- Create the Hive table
hive> create table db_hive.student(cui string, yao string, nan string) row format delimited fields terminated by '\t';


-- Load a file from HDFS into the student table of the Hive database db_hive
hive> load data inpath '/soft/hive/apache-hive-3.1.2-bin/bin/datatest' into table db_hive.student;
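
One way to see that an HDFS load moves the file rather than copying it is to list the source directory before and after the load. The sketch below assumes it is run in the Hive CLI, which accepts dfs commands, and that the datatest file really exists at the HDFS path used above.

```sql
-- Before the load: the source file should be listed here.
dfs -ls /soft/hive/apache-hive-3.1.2-bin/bin/;

LOAD DATA INPATH '/soft/hive/apache-hive-3.1.2-bin/bin/datatest'
INTO TABLE db_hive.student;

-- After the load: datatest should no longer appear in the source directory.
dfs -ls /soft/hive/apache-hive-3.1.2-bin/bin/;

-- The rows are now readable through the table.
SELECT * FROM db_hive.student LIMIT 5;
```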

Insert Data

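Note that insert into here differs from an insert in a traditional relational database: it inserts the result set of a query (typically the contents of a whole table) rather than individual rows.

For example: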

-- The following creates partitions dynamically. country and state can also be given static values.

hive> insert into table test_partition_table partition(country,state) select name,age,country,state from t1;


-- Example with a static partition value: the static column (country) is left out of the select list
hive> insert into table test_partition_table partition(country = 'US',state) select name,age,state from t1;
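
A fully dynamic partition insert is often rejected unless dynamic partitioning is enabled and set to nonstrict mode, depending on the Hive version and site configuration. The sketch below shows the session settings together with a table definition that matches the statements above. The column types, and the assumption that t1 has name, age, country, and state columns, are illustrative and not taken from the original setup.

```sql
-- Session settings that allow all partition columns to be dynamic.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical definition of the target table (types are assumptions).
CREATE TABLE IF NOT EXISTS test_partition_table (
  name STRING,
  age  INT
)
PARTITIONED BY (country STRING, state STRING);

-- The dynamic partition columns come last in the select list,
-- in the same order as in the PARTITION clause.
INSERT INTO TABLE test_partition_table PARTITION (country, state)
SELECT name, age, country, state FROM t1;

-- List the partitions that the insert created.
SHOW PARTITIONS test_partition_table;
```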

The syntax is:

INSERT INTO|OVERWRITE TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
SELECT col1, col2, col3 FROM fromtable1, fromtable2 ...
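
As an illustration of the OVERWRITE form of this syntax, the sketch below replaces the contents of a single, fully static partition. It reuses the table and column names from the examples above; the value 'CA' and the filter are assumptions added for illustration.

```sql
-- Replace everything in partition (country='US', state='CA') with the
-- matching rows from t1. Both partition columns are static here, so
-- neither appears in the select list.
INSERT OVERWRITE TABLE test_partition_table
PARTITION (country = 'US', state = 'CA')
SELECT name, age
FROM t1
WHERE country = 'US' AND state = 'CA';
```

With INSERT INTO in place of INSERT OVERWRITE, the rows are appended to the partition instead of replacing it.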

Insert Data When Create Table

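-- Creating the table this way does not copy the partitioning of cuiyaonan2000_table.
hive> create table test_table as select id, name, tel from cuiyaonan2000_table;


-- Creating the table this way does copy the partitioning of cuiyaonan2000_table (the table definition only, no data).
hive> create table test_table like cuiyaonan2000_table;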
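
Because create table ... like produces an empty table, the data still has to be copied in a second step. A minimal sketch, assuming cuiyaonan2000_table is partitioned by a single column named dt (a hypothetical name) and using a hypothetical target table test_table_like:

```sql
-- Copy the definition, including the partition columns, but no rows.
CREATE TABLE test_table_like LIKE cuiyaonan2000_table;

-- Allow the partition value to be taken from the query result.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Copy the rows; dt is the assumed partition column and must come last.
INSERT INTO TABLE test_table_like PARTITION (dt)
SELECT id, name, tel, dt FROM cuiyaonan2000_table;
```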

A Workaround for Insert into

The following examples appear to work:

-- Build an array value with a select statement
select array('cuiyaonan','2000','@163.com') ;

-- Store array data
insert into employee(name,work_place) select 'tom',array('cuiyaonan2000@163.com','sichuan','chengdu');

-- Build a map value with a select statement
select map('bigdata',100);

-- Store map data
insert into employee(name,score) select 'tomas', map('bigdata',100);


-- Build a struct value with a select statement
select struct('male',10);
select named_struct('sex','male','age',10);

-- Store struct data
insert into employee(name,sex_age) select 'tomson',named_struct('sex','male','age',10);


-- Note: for a partitioned table, insert into ... values can be used directly to insert a single row
insert into test_partition_table  partition(provice = 'hebei', city = 'shijiazhuang') values('tomslee',26);
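
The statements above assume an employee table with array, map, and struct columns, which the post does not show. A definition that would fit them could look like the sketch below. The column types are inferred from the inserted values and are an assumption, not the original schema.

```sql
-- Hypothetical definition of the employee table used above.
CREATE TABLE IF NOT EXISTS employee (
  name       STRING,
  work_place ARRAY<STRING>,
  score      MAP<STRING, INT>,
  sex_age    STRUCT<sex:STRING, age:INT>
);

-- Read back the complex-typed columns:
-- array elements by index, map values by key, struct members by name.
SELECT name,
       work_place[0],
       score['bigdata'],
       sex_age.age
FROM employee;
```

Columns that an insert does not list, such as score in the array example, are stored as NULL.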
