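3.1 Hadoop Cluster Deployment Plan

|      | linux101           | linux102                     | linux103                    |
|------|--------------------|------------------------------|-----------------------------|
| HDFS | NameNode, DataNode | DataNode                     | SecondaryNameNode, DataNode |
| YARN | NodeManager        | ResourceManager, NodeManager | NodeManager                 |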
3.2 Hadoop Installation and Configuration
Upload Hadoop 3.1.3, extract it, and configure the environment variables

    [longlong@linux101 software]$ sudo vim /etc/profile.d/my-env.sh
    ## HADOOP_HOME
    export HADOOP_HOME=/opt/module/hadoop-3.1.3
    export PATH=$HADOOP_HOME/bin:$PATH
    export PATH=$HADOOP_HOME/sbin:$PATH
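Configure the cluster
Configure core-site.xml with the following properties

| Property | Value |
|----------|-------|
| fs.defaultFS | hdfs://linux101:8020 |
| hadoop.tmp.dir | /opt/module/hadoop-3.1.3/data |
| hadoop.http.staticuser.user | longlong |
| hadoop.proxyuser.longlong.hosts | * |
| hadoop.proxyuser.longlong.groups | * |
| hadoop.proxyuser.longlong.users | * |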
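Hadoop reads these settings from XML, where each name/value pair is a `<property>` element inside `<configuration>`. The sketch below prints a core-site.xml for the values listed above. `hadoop_property` and `core_site_xml` are hypothetical helper names, not Hadoop tools, and the helper does not escape XML special characters such as `&`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one <property> element for a name/value pair.
hadoop_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Hypothetical helper: wrap the core-site.xml settings in <configuration>.
core_site_xml() {
  printf '<?xml version="1.0" encoding="UTF-8"?>\n<configuration>\n'
  hadoop_property fs.defaultFS hdfs://linux101:8020
  hadoop_property hadoop.tmp.dir /opt/module/hadoop-3.1.3/data
  hadoop_property hadoop.http.staticuser.user longlong
  hadoop_property hadoop.proxyuser.longlong.hosts '*'
  hadoop_property hadoop.proxyuser.longlong.groups '*'
  hadoop_property hadoop.proxyuser.longlong.users '*'
  printf '</configuration>\n'
}

# Print the document; redirect it to a file to review before using it.
core_site_xml
```

The same pattern applies to hdfs-site.xml, yarn-site.xml, and mapred-site.xml.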
Configure hdfs-site.xml

| Property | Value |
|----------|-------|
| dfs.namenode.http-address | linux101:9870 |
| dfs.namenode.secondary.http-address | linux103:9868 |
| dfs.replication | 3 |

Configure yarn-site.xml

| Property | Value |
|----------|-------|
| yarn.nodemanager.aux-services | mapreduce_shuffle |
| yarn.resourcemanager.hostname | linux102 |
| yarn.nodemanager.env-whitelist | JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME |
| yarn.scheduler.minimum-allocation-mb | 1024 |
| yarn.scheduler.maximum-allocation-mb | 4096 |
| yarn.nodemanager.resource.memory-mb | 4096 |
| yarn.nodemanager.pmem-check-enabled | false |
| yarn.nodemanager.vmem-check-enabled | false |
| yarn.log-aggregation-enable | true |
| yarn.log.server.url | http://linux101:19888/jobhistory/logs |
| yarn.log-aggregation.retain-seconds | 604800 |
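The three memory settings interact, so a quick arithmetic check can be useful. The sketch below uses the values from the yarn-site.xml settings above and considers memory only; the variable names are chosen for illustration.

```shell
#!/usr/bin/env bash
# Values copied from the yarn-site.xml settings above, all in MB.
node_mb=4096        # yarn.nodemanager.resource.memory-mb
min_alloc_mb=1024   # yarn.scheduler.minimum-allocation-mb
max_alloc_mb=4096   # yarn.scheduler.maximum-allocation-mb

# Upper bound on minimum-size containers one NodeManager can host.
max_containers=$(( node_mb / min_alloc_mb ))
echo "containers per NodeManager at minimum size: $max_containers"
# prints: containers per NodeManager at minimum size: 4

# A single request larger than a node's memory can never be placed on it.
if [ "$max_alloc_mb" -gt "$node_mb" ]; then
  echo "warning: maximum allocation exceeds NodeManager memory"
fi
```

With these values a single container may take the whole 4096 MB of a node, so the largest request leaves no room for other containers on that NodeManager.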
Configure mapred-site.xml

| Property | Value |
|----------|-------|
| mapreduce.framework.name | yarn |
| mapreduce.jobhistory.address | linux101:10020 |
| mapreduce.jobhistory.webapp.address | linux101:19888 |

Configure workers

    [longlong@linux101 hadoop]$ vim /opt/module/hadoop-3.1.3/etc/hadoop/workers
    linux101
    linux102
    linux103

Distribute Hadoop to the other machines
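The source does not show the distribution command. A minimal sketch, assuming rsync is installed on every node and the longlong user has passwordless SSH to linux102 and linux103, could look like this. `distribute` and the `RUN` variable are hypothetical names; by default the function only prints the commands it would run.

```shell
#!/usr/bin/env bash
HADOOP_DIR=/opt/module/hadoop-3.1.3
TARGETS="linux102 linux103"

# Hypothetical helper: copy the Hadoop directory to each target host.
# When RUN is unset the commands are only printed (dry run);
# call it as `RUN= distribute` to execute them.
distribute() {
  for host in $TARGETS; do
    ${RUN-echo} rsync -a "$HADOOP_DIR/" "longlong@$host:$HADOOP_DIR/"
  done
}

distribute
```

The same loop can be reused later for the configuration files, Kafka, and HBase by changing the directory.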
3.3 Starting and Testing the Cluster
Starting the cluster
On first startup, format HDFS on the NameNode

    [longlong@linux101 bin]$ hdfs namenode -format

Start HDFS (on the NameNode host)

    [longlong@linux101 bin]$ start-dfs.sh

Start YARN (on the ResourceManager host)

    [longlong@linux102 ~]$ start-yarn.sh
Testing the cluster
Background services

    [longlong@linux101 hadoop-3.1.3]$ myjps.sh
    =================> linux101 JPS <=================
    16834 Jps
    16281 NameNode
    16732 NodeManager
    16431 DataNode
    =================> linux102 JPS <=================
    2602 ResourceManager
    2412 DataNode
    3053 Jps
    2909 NodeManager
    =================> linux103 JPS <=================
    15122 NodeManager
    15222 Jps
    15016 SecondaryNameNode
    14907 DataNode
    [longlong@linux101 hadoop-3.1.3]$
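myjps.sh is a custom script whose contents are not shown. A sketch of what such a wrapper might look like, assuming passwordless SSH to each host and that `jps` is on the PATH of a non-interactive SSH session, is given below. `cluster_jps` and the `REMOTE` variable are hypothetical names.

```shell
#!/usr/bin/env bash
HOSTS="linux101 linux102 linux103"

# Hypothetical helper: print a banner per host, then run jps there over ssh.
# REMOTE can be overridden (for example with `echo`) to preview the commands.
cluster_jps() {
  for host in $HOSTS; do
    echo "=================> $host JPS <================="
    ${REMOTE-ssh} "$host" jps
  done
}
```

Calling `cluster_jps` after sourcing the file should produce output in the same shape as the listing above. If `jps` is not found remotely, using the full path to the JDK's `bin/jps` is one workaround.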
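Web UI (with the configuration above, the NameNode UI is at http://linux101:9870; the ResourceManager UI is on linux102, port 8088 by default)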
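**3.4 Setting Up and Configuring HDFS High Availability**
Upload and extract Hadoop on linux101, configure the environment variables, and distribute them to the other machines

    # Configure the environment variables
    [longlong@linux101 hadoop-3.1.3]$ sudo vim /etc/profile.d/my-env.sh
    # HADOOP
    export HADOOP_HOME=/opt/module/hadoop-3.1.3
    export PATH=$HADOOP_HOME/bin:$PATH
    export PATH=$HADOOP_HOME/sbin:$PATH
    # Distribute the configuration
    [longlong@linux101 hadoop-3.1.3]$ scp /etc/profile.d/my-env.sh root@linux102:/etc/profile.d/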
Configure the nameservice by editing hdfs-site.xml
Add the following properties:

| Property | Value |
|----------|-------|
| dfs.nameservices | mycluster |
| dfs.namenode.name.dir | file://${hadoop.tmp.dir}/name |
| dfs.datanode.data.dir | file://${hadoop.tmp.dir}/data |
| dfs.journalnode.edits.dir | ${hadoop.tmp.dir}/jn |
| dfs.ha.namenodes.mycluster | linux101,linux102,linux103 |
| dfs.namenode.rpc-address.mycluster.linux101 | linux101:8020 |
| dfs.namenode.rpc-address.mycluster.linux102 | linux102:8020 |
| dfs.namenode.rpc-address.mycluster.linux103 | linux103:8020 |
| dfs.namenode.http-address.mycluster.linux101 | linux101:9870 |
| dfs.namenode.http-address.mycluster.linux102 | linux102:9870 |
| dfs.namenode.http-address.mycluster.linux103 | linux103:9870 |
| dfs.namenode.shared.edits.dir | qjournal://linux101:8485;linux102:8485;linux103:8485/mycluster |
| dfs.client.failover.proxy.provider.mycluster | org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider |
| dfs.ha.fencing.methods | sshfence |
| dfs.ha.fencing.ssh.private-key-files | /home/longlong/.ssh/id_rsa |
| dfs.permissions.enabled | false |
| dfs.ha.automatic-failover.enabled | true |
| dfs.namenode.secondary.http-address | linux103:9868 |
| dfs.replication | 3 |

The NameNode IDs here are the host names linux101, linux102, and linux103. In an HA cluster the standby NameNodes perform checkpointing, so no SecondaryNameNode is run; the dfs.namenode.secondary.http-address entry is carried over from the non-HA setup and can be removed.
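The value of dfs.namenode.shared.edits.dir follows a fixed pattern: the JournalNode addresses joined with semicolons, followed by the nameservice ID as the path. The sketch below builds that string; `qjournal_uri` is a hypothetical helper, and port 8485 is the JournalNode RPC port used in the configuration above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a qjournal:// URI.
# Usage: qjournal_uri <nameservice> <journalnode-host>...
qjournal_uri() {
  nameservice=$1
  shift
  uri=""
  for host in "$@"; do
    # Prefix a ';' separator only when uri already holds an entry.
    uri="${uri:+$uri;}$host:8485"
  done
  printf 'qjournal://%s/%s\n' "$uri" "$nameservice"
}

qjournal_uri mycluster linux101 linux102 linux103
# prints: qjournal://linux101:8485;linux102:8485;linux103:8485/mycluster
```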
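Configure core-site.xml

| Property | Value |
|----------|-------|
| fs.defaultFS | hdfs://mycluster |
| dfs.journalnode.edits.dir | /opt/module/hadoop-3.1.3/JN/data |
| ha.zookeeper.quorum | linux101:2181,linux102:2181,linux103:2181 |
| hadoop.tmp.dir | /opt/module/hadoop-3.1.3/data |
| hadoop.http.staticuser.user | longlong |
| hadoop.proxyuser.longlong.hosts | * |
| hadoop.proxyuser.longlong.groups | * |
| hadoop.proxyuser.longlong.users | * |

Note that dfs.journalnode.edits.dir is also set in hdfs-site.xml, to ${hadoop.tmp.dir}/jn. Keep only one of the two definitions so that the JournalNode directory is unambiguous.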
Configure hadoop-env.sh

    [longlong@linux101 hadoop]$ vim hadoop-env.sh
    export JAVA_HOME=/opt/module/java-xxx
    export HDFS_NAMENODE_USER=longlong
    export HDFS_DATANODE_USER=longlong
    export HDFS_ZKFC_USER=longlong
    export HDFS_JOURNALNODE_USER=longlong
    export YARN_RESOURCEMANAGER_USER=longlong
    export YARN_NODEMANAGER_USER=longlong
Distribute the configuration files to linux102 and linux103
Start and test the cluster

    # 0. The ZooKeeper cluster must be started first
    # 1. Start the JournalNode service on each JournalNode host
    [longlong@linux101 ~]$ hdfs --daemon start journalnode
    [longlong@linux102 ~]$ hdfs --daemon start journalnode
    [longlong@linux103 ~]$ hdfs --daemon start journalnode
    # 2. On first startup, format the first NameNode (linux101) and start it
    [longlong@linux101 ~]$ hdfs namenode -format
    [longlong@linux101 ~]$ hdfs --daemon start namenode
    # 3. On the second and third NameNodes, sync the metadata from the first
    [longlong@linux102 ~]$ hdfs namenode -bootstrapStandby
    [longlong@linux103 ~]$ hdfs namenode -bootstrapStandby
    # 4. Start the second and third NameNodes
    [longlong@linux102 ~]$ hdfs --daemon start namenode
    [longlong@linux103 ~]$ hdfs --daemon start namenode
    # 5. Stop the services
    [longlong@linux101 module]$ stop-dfs.sh
    # 6. Initialize the HA state in ZooKeeper
    [longlong@linux101 ~]$ hdfs zkfc -formatZK
    # 7. Start the cluster services
    [longlong@linux101 module]$ start-dfs.sh

4. MySQL Installation
Remove the MariaDB packages that ship with Linux

    [longlong@linux101 mysql-8.0.18]$ sudo yum -y remove mariadb*

Upload the MySQL 8.0.18 RPM bundle and extract it

    [longlong@linux101 mysql-8.0.18]$ tar -xf mysql-8.0.18-1.el7.x86_64.rpm-bundle.tar
Install the packages in the following order

    # 1.
    [longlong@linux101 mysql-8.0.18]$ sudo rpm -ivh mysql-community-common-8.0.18-1.el7.x86_64.rpm
    # 2.
    [longlong@linux101 mysql-8.0.18]$ sudo rpm -ivh mysql-community-libs-8.0.18-1.el7.x86_64.rpm
    # 3.
    [longlong@linux101 mysql-8.0.18]$ sudo rpm -ivh mysql-community-client-8.0.18-1.el7.x86_64.rpm
    # 4.
    [longlong@linux101 mysql-8.0.18]$ sudo yum -y install libaio
    # 5.
    [longlong@linux101 mysql-8.0.18]$ sudo rpm -ivh mysql-community-server-8.0.18-1.el7.x86_64.rpm
Start MySQL, change the password, and enable remote login

    # 1. Start the mysqld service
    [longlong@linux101 mysql-8.0.18]$ sudo systemctl start mysqld
    # 2. Find the temporary root password
    [longlong@linux101 mysql-8.0.18]$ sudo cat /var/log/mysqld.log | grep password
    # 3. Log in
    [longlong@linux101 mysql-8.0.18]$ mysql -u root -p
    # 4. Change the password
    # 4.1 Relax the password policy
    set global validate_password.policy=low;
    set global validate_password.length=11;
    # 4.2 Set the new password, with no expiry
    ALTER USER 'root'@'localhost' IDENTIFIED BY 'SJL@123456#' PASSWORD EXPIRE NEVER;
    # 4.3 Switch the authentication plugin to mysql_native_password
    ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'SJL@123456#';
    # 5. Allow remote login
    use mysql;
    update user set host ='%' where user = 'root';
    commit;
    # 6. Reload the grant tables so that the host change takes effect
    FLUSH PRIVILEGES;
    exit;

5. Hive Installation and Configuration
5.1 Installing Hive
Upload and extract Hive
Configure the environment variables

    ## HIVE_HOME
    export HIVE_HOME=/opt/module/hive-3.1.2
    export PATH=$HIVE_HOME/bin:$PATH

Resolve the JAR conflict

    [longlong@linux101 hive-3.1.2]$ mv $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.jar $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.bak

5.2 Storing Hive Metadata in MySQL
Copy the MySQL JDBC driver into Hive's lib directory
Point the metastore at MySQL

    [longlong@linux101 conf]$ vim hive-site.xml

| Property | Value |
|----------|-------|
| javax.jdo.option.ConnectionURL | jdbc:mysql://linux101:3306/metastore?useUnicode=true&serverTimezone=GMT&characterEncoding=UTF-8&useSSL=false |
| javax.jdo.option.ConnectionDriverName | com.mysql.cj.jdbc.Driver |
| javax.jdo.option.ConnectionUserName | root |
| javax.jdo.option.ConnectionPassword | SJL@123456# |
| hive.metastore.warehouse.dir | /user/hive/warehouse |
| hive.metastore.schema.verification | false |
| hive.metastore.event.db.notification.api.auth | false |
| hive.cli.print.header | true |
| hive.cli.print.current.db | true |
| hive.server2.thrift.bind.host | linux101 |
| hive.server2.thrift.port | 10000 |
| hive.metastore.uris | thrift://linux101:9083 |

Inside hive-site.xml, each & in the connection URL must be written as &amp;amp; because the file is XML.

5.3 Starting Hive
Initialize the metastore database
Log in to MySQL and create the metastore database
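The source does not show the statement for this step. A minimal sketch, assuming the database name `metastore` from the javax.jdo.option.ConnectionURL setting and the MySQL root account configured earlier, is:

```shell
#!/usr/bin/env bash
# Statement that creates the database named in the Hive connection URL.
CREATE_SQL='CREATE DATABASE IF NOT EXISTS metastore;'
echo "$CREATE_SQL"

# To apply it, pass the statement to the mysql client, which prompts for
# the root password:
#   mysql -u root -p -e "$CREATE_SQL"
```

`IF NOT EXISTS` makes the statement safe to repeat if the database was already created.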
Initialize the Hive metastore schema

    [longlong@linux101 conf]$ schematool -initSchema -dbType mysql -verbose
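Start Hive
Start the Hadoop cluster first
Start the metastore, then start Hive. The metastore service runs in the foreground, so start the Hive CLI from a second session.

    [longlong@linux101 conf]$ hive --service metastore
    2022-03-05 15:32:23: Starting Hive metastore Server
    [longlong@linux101 conf]$ hive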
Access Hive over JDBC
Start the metastore

    [longlong@linux101 conf]$ hive --service metastore

Start the hiveserver2 service (the hiveserver2 command is an equivalent shortcut)

    [longlong@linux101 conf]$ hive --service hiveserver2

Start the beeline client

    [longlong@linux101 ~]$ beeline -u jdbc:hive2://linux101:10000 -n longlong

6. Kafka Installation and Configuration
6.1 Installing Kafka
Upload and extract Kafka
Create a logs directory inside the Kafka directory
Edit the configuration file

    [longlong@linux101 config]$ vim server.properties
    # Globally unique broker ID; must not be repeated across brokers
    broker.id=0
    log.dirs=/opt/module/kafka_2.11-2.4.1/logs
    zookeeper.connect=linux101:2181,linux102:2181,linux103:2181

Configure the environment variables

    ## KAFKA_HOME
    export KAFKA_HOME=/opt/module/kafka_2.11-2.4.1
    export PATH=$KAFKA_HOME/bin:$PATH
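Distribute Kafka
Change broker.id on linux102 and linux103

    linux102 => broker.id=1
    linux103 => broker.id=2

6.2 Starting and Testing Kafka

    # 1. Start the ZooKeeper cluster first
    # 2. Then start Kafka on each broker, from the Kafka installation directory
    kafka-server-start.sh -daemon config/server.properties

7. HBase Installation and Configuration
7.1 Installing HBase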
Upload and extract HBase 2.4.10
Configure the environment variables

    ## HBASE_HOME
    export HBASE_HOME=/opt/module/hbase-2.4.10
    export PATH=$HBASE_HOME/bin:$PATH

7.2 Modifying the Configuration
Edit the conf/regionservers file

    [longlong@linux101 hbase-2.4.10]$ vim conf/regionservers
    linux101
    linux102
    linux103

In the conf directory, create a file named backup-masters and add the host name linux102

    [longlong@linux101 hbase-2.4.10]$ vim conf/backup-masters
    linux102
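Edit hbase-site.xml in the conf directory

| Property | Value |
|----------|-------|
| hbase.rootdir | hdfs://linux101/hbase |
| hbase.cluster.distributed | true |
| hbase.zookeeper.quorum | linux101,linux102,linux103 |
| hbase.master.port | 16000 |
| hbase.tmp.dir | ./tmp |
| hbase.zookeeper.property.dataDir | /home/longlong/zookeeper-hbase |
| hbase.unsafe.stream.capability.enforce | false |

If the HDFS HA setup from section 3.4 is in use, hbase.rootdir should point at the nameservice (hdfs://mycluster/hbase) rather than at a single NameNode host.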
Append the following to the end of hbase-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_212
    export HBASE_MANAGES_ZK=false
Copy hdfs-site.xml into the HBase conf directory

    [longlong@linux101 conf]$ cp /opt/module/hadoop-3.1.3/etc/hadoop/hdfs-site.xml /opt/module/hbase-2.4.10/conf/

Distribute the configuration
7.3 Testing HBase
Start ZooKeeper
Start Hadoop
Start HBase

    start-hbase.sh

HBase web UI (the HMaster UI listens on port 16010 by default)



