All the software packages are available in the first blog post; click there if you need them.
一、Hive
While learning Hive I also picked up a little HQL; the link is shared here for anyone who needs it.
1. Download and extract Hive
2. Configuration files
- Global configuration (/etc/profile)
export HIVE_HOME=/software/hive
export PATH="$HIVE_HOME/bin:$PATH"
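After editing /etc/profile, reload it so the variables take effect in the current shell; a quick sanity check (assuming Hive is already unpacked to /software/hive):
[hadoop@master ~]$ source /etc/profile
[hadoop@master ~]$ echo $HIVE_HOME
[hadoop@master ~]$ hive --version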
- hive-site.xml
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>192.168.9.105</value>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/software/hive/tmp/hive</value>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/usr/local/hive/tmp/</value>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/software/hive/tmp/hive</value>
</property>
<property>
  <name>hive.aux.jars.path</name>
  <value>/software/hive/lib,/software/hive/jdbc</value>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://master:9000/user/hive/warehouse</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://master:3306/hivedb?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.cj.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hivedb</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivedb</value> <!-- your MySQL password -->
</property>
<property>
  <name>hive.server2.thrift.client.user</name>
  <value>master</value>
</property>
<property>
  <name>hive.server2.thrift.client.password</name>
  <value>123456</value> <!-- your host user's password -->
</property>
<property>
  <name>hive.server2.webui.host</name>
  <value>master</value>
</property>
If that feels like too much, there is a shorter route:
[hadoop@master hive]$ mkdir tmp
[hadoop@master hive]$ cd conf
[hadoop@master conf]$ cp hive-default.xml.template hive-site.xml
Then, in hive-site.xml, replace:
- ${system:java.io.tmpdir} with /software/hive/tmp (the tmp folder just created under the Hive install directory)
- ${system:user.name} with your user name (e.g. hadoop)
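Both substitutions can be done in one pass with sed instead of editing by hand (a sketch; the paths assume the layout above and the hadoop user):
[hadoop@master conf]$ sed -i 's#\${system:java.io.tmpdir}#/software/hive/tmp#g' hive-site.xml
[hadoop@master conf]$ sed -i 's#\${system:user.name}#hadoop#g' hive-site.xml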
3. hive-env.sh
export JAVA_HOME=/software/hadoop/jdk1.8.0_221
export HADOOP_HOME=/software/hadoop/hadoop-2.7.7
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/software/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
4. Keep Hive's logs under the Hive directory
[hadoop@master conf]$ cp hive-log4j2.properties.template hive-log4j2.properties
[hadoop@master conf]$ vi hive-log4j2.properties
# change the following parameter
property.hive.log.dir = /root/bigdata/hive-2.3.6/logs
5. Start the metastore and the Hive CLI
[hadoop@master ~]$ hive --service metastore &
# press Enter (or Ctrl+C) to get the prompt back; the metastore keeps running in the background
[hadoop@master ~]$ hive
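If the metastore should keep running after the terminal closes, a common variant is to start it with nohup (a sketch; the log path is arbitrary):
[hadoop@master ~]$ nohup hive --service metastore > /software/hive/tmp/metastore.log 2>&1 &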
6. Configure the metastore
二、MySQL
By default, Hive keeps its metadata in the embedded Derby database, but per the requirements a production environment must store Hive metadata in MySQL.
Put mysql-connector-java-x.x.x.jar (the MySQL JDBC driver) into $HIVE_HOME/lib.
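Once MySQL itself is set up (steps below) and the driver is in $HIVE_HOME/lib, the metastore schema is normally initialized once with the schematool that ships with Hive 2.x, using the JDBC settings from hive-site.xml above:
[hadoop@master ~]$ schematool -dbType mysql -initSchema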
1. Download and extract MySQL
2. Configure the related files
- Create a mysql user
[root@master ~]$ groupadd mysql   # create a mysql group
[root@master ~]$ useradd -r -s /sbin/nologin -g mysql -d /software/mysql mysql
- Configure environment variables (/etc/profile: MYSQL_HOME, PATH)
- Configure my.cnf
The socket, basedir, datadir, log, and pid files and directories must be created in advance and given suitable ownership and permissions:
[root@master ~]$ chown -R mysql:mysql <dir-or-file-path>
[root@master ~]$ chmod -R 777 <dir-or-file-path>   # mysql and 777 can be other values; see the earlier post on Linux permissions
[mysql]
default-character-set=utf8
[mysqld]
port=3306
socket=/software/mysql/sock/mysql.sock
basedir=/software/mysql
datadir=/software/mysql/data
log-error=/var/log/mysql/mysql.log
pid-file=/var/run/mysql/mysql.pid
max_connections=1000
character-set-server=utf8
wait_timeout=31536000
interactive_timeout=31536000
default-storage-engine=INNODB
max_allowed_packet=1024M
[mysqld_safe]
socket=/software/mysql/sock/mysql.sock
[client]
socket=/software/mysql/sock/mysql.sock
[mysql.server]
socket=/software/mysql/sock/mysql.sock
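None of the directories referenced above exist yet; one way to create them and hand them to the mysql user (a sketch matching the paths in this my.cnf):
[root@master ~]$ mkdir -p /software/mysql/sock /software/mysql/data /var/log/mysql /var/run/mysql
[root@master ~]$ chown -R mysql:mysql /software/mysql /var/log/mysql /var/run/mysql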
- mysql.server (set these two variables in the script)
datadir=/software/mysql/data
basedir=/software/mysql/
- Service startup configuration
[root@master ~]$ cp /software/mysql/support-files/mysql.server /etc/init.d/mysqld
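Optionally register it with SysV init so it starts on boot (a sketch; chkconfig is the CentOS 6/7 tool):
[root@master ~]$ chkconfig --add mysqld
[root@master ~]$ chkconfig mysqld on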
3. Initialize MySQL
[root@master ~]$ mysqld --initialize --user=mysql
4. Start MySQL
[hadoop@master ~]$ service mysqld start
[hadoop@master ~]$ mysql -uroot -p
# The first login uses the temporary password, which is written to the log file defined in my.cnf
# You can also search for it: grep 'temporary password' <log path>
5. Change the password after logging in
mysql> set password = password('yourpassword');
mysql> alter user 'hive'@'%' password expire never;
mysql> flush privileges;
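Note: set password = password(...) only exists up to MySQL 5.7; on MySQL 8.0 the password() function was removed and the equivalent is (a sketch):
mysql> alter user 'root'@'localhost' identified by 'yourpassword';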
6. Allow remote connections
mysql> use mysql;
mysql> update user set host='%' where user='hive';
mysql> flush privileges;
# If you cannot connect, check that the firewall is off
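On CentOS 7 the firewall state can be checked and, for a lab setup, switched off like this (a sketch; firewalld is assumed):
[root@master ~]$ systemctl status firewalld
[root@master ~]$ systemctl stop firewalld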
三、Redis
1. After downloading and extracting Redis:
[hadoop@master redis-4.0.1]$ make
[hadoop@master redis-4.0.1]$ cd ..
[hadoop@master software]$ mv redis-4.0.1/ redis
[hadoop@master software]$ cd redis/
[hadoop@master redis]$ ./src/redis-server
2. Start Redis
[hadoop@master redis]$ ./src/redis-server ./redis.conf
Open another terminal:
[hadoop@master redis]$ ./src/redis-cli
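A quick sanity check from the client (a sketch of a normal session):
127.0.0.1:6379> ping
PONG
127.0.0.1:6379> set greeting hello
OK
127.0.0.1:6379> get greeting
"hello"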
四、Flink (cluster setup)
1. Download and install Flink
2. Configure the related files
- zoo.cfg
Reference content:
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
server.4=master:2888:3888
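Each ZooKeeper node also needs a myid file in its dataDir whose content matches its server.N number, or the quorum will not form (a sketch; /software/zookeeper/data is an assumed dataDir, and master is server.4 above):
[hadoop@master ~]$ echo 4 > /software/zookeeper/data/myid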
- masters
master
- slaves
hadoop1
hadoop2
hadoop3
- flink-conf.yaml
jobmanager.rpc.address: master
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1728m
high-availability: zookeeper
high-availability.storageDir: hdfs://master/user/flink/
high-availability.zookeeper.quorum: hadoop1:2181,hadoop2:2181,hadoop3:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: flinkCluster
state.backend: filesystem
state.checkpoints.dir: hdfs://master/user/flink/flink-checkpoints
state.savepoints.dir: hdfs://master/user/flink/flink-checkpoints
rest.port: 8081
- /flink/bin/config.sh
DEFAULT_YARN_CONF_DIR="/software/hadoop/etc/hadoop"       # YARN Configuration Directory, if necessary
DEFAULT_HADOOP_CONF_DIR="/software/hadoop/etc/hadoop"     # Hadoop Configuration Directory, if necessary
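With the configuration synced to every node, the cluster is brought up from the master with the script that ships with Flink (assuming it was unpacked to /software/flink):
[hadoop@master flink]$ ./bin/start-cluster.sh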
五、Problems encountered when running Hive with MySQL
1. Permission problem
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeExcepti...
The fix is as follows:
mysql> create user 'hive' identified by 'hive';   # inside the quotes go the user name and password you created in MySQL
mysql> grant all privileges on *.* to 'hive'@'%' identified by 'hivedb' with grant option;
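After the grant it does no harm to flush and confirm the privileges (a sketch):
mysql> flush privileges;
mysql> show grants for 'hive'@'%';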
2. Character-set problem
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:An exception was thrown while adding/validating class(es) : Column length too big for column 'PARAM_VALUE' (max = 21845); use BLOB or TEXT instead
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Column length too big for column 'PARAM_VALUE' (max = 21845); use BLOB or TEXT instead
The fix is as follows:
mysql> alter database hive character set latin1;   # "hive" here is the metastore database name; change it to match yours
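To confirm the character set actually changed (a sketch):
mysql> show create database hive;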