一. Hive Installation
1.1 Download Locations
1) Hive official site
http://hive.apache.org/
2) Documentation
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
3) Downloads
http://archive.apache.org/dist/hive/
4) GitHub repository
GitHub - apache/hive: Apache Hive
1.2 Installing and Deploying Hive
1.2.0 Modify the Hadoop configuration
Configure core-site.xml
[atguigu@hadoop102 ~]$ cd $HADOOP_HOME/etc/hadoop
[atguigu@hadoop102 hadoop]$ vim core-site.xml
Add the following configuration
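The configuration snippet itself is not preserved here. A typical addition for this kind of setup — assuming the Hadoop user is atguigu, as in the shell prompts above — enables proxy-user impersonation, which HiveServer2 relies on:

```xml
<!-- Allow the atguigu user (match your own user) to proxy requests
     from any host and on behalf of any group -->
<property>
    <name>hadoop.proxyuser.atguigu.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.atguigu.groups</name>
    <value>*</value>
</property>
```

This is a sketch, not the original snippet; the property names are standard Hadoop keys (`hadoop.proxyuser.$USER.hosts/groups`), but confirm the user name matches your environment.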
Configure yarn-site.xml
(Only fragments of the property descriptions survive here. They describe the memory available for containers: if set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is calculated automatically, in the case of Windows and Linux; in other cases the default is 8192MB.)
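The surviving description text matches YARN's container-memory properties. A plausible reconstruction of the intended settings — the 4096 MB / 512 MB values are assumptions, tune them for your nodes — would be:

```xml
<!-- Total memory this NodeManager makes available to containers (value assumed) -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<!-- Smallest and largest allocation a single container may request (values assumed) -->
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
</property>
```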
After modifying both files, remember to distribute them to every node, then restart the cluster.
1.2.1 Install Hive
1) Upload apache-hive-3.1.2-bin.tar.gz to the /opt/software directory on Linux
2) Extract apache-hive-3.1.2-bin.tar.gz into the /opt/module/ directory
[root@localhost software]$ tar -zxvf /opt/software/apache-hive-3.1.2-bin.tar.gz -C /opt/module/
3) Rename the extracted apache-hive-3.1.2-bin directory to hive
[root@localhost software]$ mv /opt/module/apache-hive-3.1.2-bin/ /opt/module/hive
4) Edit /etc/profile.d/my_env.sh to add the environment variables
[root@localhost software]$ sudo vim /etc/profile.d/my_env.sh
5) Add the following content
#HIVE_HOME
export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin
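The new variables take effect in the current shell only after the profile file is sourced (or after a fresh login). A quick sanity check, replaying the two export lines above:

```shell
# Replay the my_env.sh additions in the current shell
# (in practice: source /etc/profile.d/my_env.sh)
export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin
echo "$HIVE_HOME"
```

If `echo "$HIVE_HOME"` prints /opt/module/hive and `which hive` resolves, the variables are in place.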
6) Resolve the logging Jar conflict (Hive ships an SLF4J binding that duplicates Hadoop's)
[root@localhost software]$ mv $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.jar $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.bak
7) Initialize the metastore database
[root@localhost hive]$ bin/schematool -dbType derby -initSchema
1.2.2 Start and Use Hive
1) Start Hive
[root@localhost hive]$ bin/hive
2) Use Hive
hive> show databases;
hive> show tables;
hive> create table test(id int);
hive> insert into test values(1);
hive> select * from test;
3) Open another Xshell window and start a second Hive session there, while monitoring the hive.log file in the /tmp/atguigu directory:
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /opt/module/hive/metastore_db.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.store.raw.data.baseDataFileFactory.privGetJBMSLockonDB(Unknown Source)
at org.apache.derby.impl.store.raw.data.baseDataFileFactory.run(Unknown Source)
...
The cause: Hive's default metastore database is Derby. Once a Hive session starts, it holds an exclusive lock on the metastore and cannot share it with other clients, so we need to point Hive's metastore at MySQL instead.
- Delete derby.log and metastore_db from the Hive installation directory, and also remove the corresponding directory on HDFS
[root@localhost hive]$ rm -rf derby.log metastore_db
[root@localhost hive]$ hadoop fs -rm -r /user
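Switching the metastore to MySQL is then done in hive-site.xml. A minimal sketch — the hostname hadoop102, database name metastore, and credentials are placeholder assumptions, not values from this text:

```xml
<!-- JDBC connection to a MySQL metastore; host, database, and credentials
     are placeholders to be replaced with your own -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>000000</value>
</property>
```

The MySQL JDBC driver jar must also be copied into $HIVE_HOME/lib, and the schema re-initialized with `schematool -dbType mysql -initSchema`.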



