栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 前沿技术 > 大数据 > 大数据系统

HiveQL: Data Manipulation

HiveQL: Data Manipulation

Loading Data into managed Tables
create table t1(c1 string) stored as textfile;
LOAD DATA
LOAD DATA [LOCAL] INPATH () [OVERWRITE] INTO TABLE T1;

如果使用 LOCAL INPATH,则从本地拷贝文件到表的存储路径。
如果不用 LOCAL, 则需要数据在HDFS 上,从INPATH 的路径移动到表的存储路径。
使用OVERWRITE 则表原来的数据会删除。
如果不用 OVERWRITE,如果文件名已经在表中存在,则会在新的文件名后加上“copy${seq}”,seq 从 1 开始编号。

Inserting Data into Tables from Queries Single Partition
INSERT OVERWRITE TABLE employees
PARTITION (country = 'US', state = 'OR')
SELECt * FROM staged_employees se
WHERe se.cnty = 'US' AND se.st = 'OR';
Multiple insert
FROM staged_employees se
INSERT OVERWRITE TABLE employees
PARTITION (country = 'US', state = 'OR')
SELECt * WHERe se.cnty = 'US' AND se.st = 'OR'
INSERT OVERWRITE TABLE employees
PARTITION (country = 'US', state = 'CA')
SELECT * WHERe se.cnty = 'US' AND se.st = 'CA'
INSERT OVERWRITE TABLE employees
PARTITION (country = 'US', state = 'IL')
SELECT * WHERe se.cnty = 'US' AND se.st = 'IL';
Dynamic insert
  • Specify a partiton column.
INSERT OVERWRITE TABLE employees
PARTITION (country = 'US', state)
SELECT ..., se.cnty, se.st
FROM staged_employees se
WHERe se.cnty = 'US';
  • All partition columns are dynamic.
INSERT OVERWRITE TABLE employees
PARTITION (country, state)
SELECt ..., se.cnty, se.st
FROM staged_employees se;

Table 5-1. Dynamic partitions properties

NameDefaultDescription
hive.exec.dynamic.partitionfalseSet to true to enable dynamic partitioning.
hive.exec.dynamic.partition.modestrictSet to nonstrict to enable all partitions to be determined dynamically.
hive.exec.max.dynamic.partitions.pernode100The maximum number of dynamic partitions that can be created by each mapper or reducer. Raises a fatal error if one mapper or reducer attempts to create more than the threshold.

hive.exec.max.dynamic.par
titions
+1000 The total number of dynamic partitions that can be created by
one statement with dynamic partitioning. Raises a fatal error if the limit is exceeded.
| hive.exec.max.created.files | 100000 | The maximum total number of files that can be created globally. A Hadoop counter is used to track the number of files created. Raises a fatal error if the limit is exceeded.

Creating Tables and Loading Them in one Query
CREATE TABLE ca_employees
AS SELECt name, salary, address
FROM employees
WHERe se.state = 'CA';
Exporting Data
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/ca_employees'
SELECt name, salary, address
FROM employees
WHERe se.state = 'CA';

with LOCAL, Data insert into the computer hive driver runs on.

转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/422650.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号