Contents
1. Deployment Environment Planning
1.1 VM and Hadoop Role Planning
1.2 Software Versions
1.3 Data Directory Layout
2. Environment Preparation and Dependencies
2.1 Base Environment and OS Tuning
2.2 Passwordless SSH Between Servers
2.3 JDK Installation
3. ZooKeeper Installation
3.1 Download and Install
3.2 Environment Configuration
3.3 Create myid
3.4 Start ZooKeeper
4. Hadoop Installation
4.1 Download the Package
4.2 Edit the Configuration Files
4.3 Distribute to the Other Servers
4.4 Start the Hadoop Cluster
4.5 Web UIs
5. Problems Encountered
5.1 ZooKeeper Installation Errors
5.2 Hadoop Installation Errors
1. Deployment Environment Planning
1.1 VM and Hadoop Role Planning
| # | Hostname | OS | IP | CPU | Memory | Disk | namenode | datanode | resourcemanager | nodemanager | zkfc | journalnode | zk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | master | centos7 x64 | 192.168.141.100 | 1*2 | 4G | 50G | √ | | | | √ | √ | |
| 2 | node-01 | centos7 x64 | 192.168.141.101 | 1*2 | 4G | 50G | √ | √ | | √ | √ | √ | √ |
| 3 | node-02 | centos7 x64 | 192.168.141.102 | 1*2 | 4G | 50G | | √ | √ | √ | | √ | √ |
| 4 | node-03 | centos7 x64 | 192.168.141.103 | 1*2 | 4G | 50G | | √ | √ | √ | | √ | √ |
1.2 Software Versions
| Software | Version |
| --- | --- |
| Java | jdk-8u311-linux-x64.tar.gz |
| Hadoop | 3.3.0 |
| ZooKeeper | 3.7.0 |
1.3 Data Directory Layout
| Purpose | Directory |
| --- | --- |
| datanode data | /data/hadoop/dfs/data |
| namenode metadata | /data/hadoop/dfs/name |
| Hadoop temp directory | /data/hadoop/tmp |
| ZooKeeper data directory | /data/zookeeper/data/ |
| ZooKeeper log directory | /data/zookeeper/log/ |
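These directories can be pre-created on every node up front (note in particular that the ZooKeeper data and log directories must exist before ZooKeeper is started). A minimal sketch, run as root; `make_data_dirs` is a helper name of my own, not from any tool:

```shell
#!/bin/bash
# Create the planned data directories under a base path (default /data).
make_data_dirs() {
  local base=${1:-/data}
  mkdir -pv "$base"/hadoop/dfs/{data,name} "$base"/hadoop/tmp
  mkdir -pv "$base"/zookeeper/{data,log}
  # hand ownership to the hadoop user if it already exists
  if id hadoop >/dev/null 2>&1; then
    chown -R hadoop:hadoop "$base"/hadoop "$base"/zookeeper
  fi
}
# usage (as root, on every node): make_data_dirs
```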
2. Environment Preparation and Dependencies
2.1 Base Environment and OS Tuning
I wrapped these steps in a shell script; the commands inside can also be copied out and run directly. On master, create init_env.sh:
```shell
vi init_env.sh
```
Paste the following. If your /etc/hosts already contains entries you need, delete the `rm -f` line; I added it only so the script can be re-run safely. Replace "your hadoop user passwd" with the password you want for the hadoop user.
```shell
#!/bin/bash
# Rebuild /etc/hosts; drop the rm -f line if you need to keep existing entries
rm -f /etc/hosts && touch /etc/hosts
cat >> /etc/hosts <<EOF
192.168.141.100 master
192.168.141.101 node-01
192.168.141.102 node-02
192.168.141.103 node-03
EOF
# Create the hadoop user
useradd hadoop >/dev/null 2>&1
echo "your hadoop user passwd" | passwd --stdin hadoop
# Data directories
mkdir -pv /data/hadoop/dfs/{data,name}
mkdir -pv /data/hadoop/tmp
chown -R hadoop:hadoop /data/hadoop/
# Disable firewalld & SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0 && sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# OS tuning (example values; adjust to your environment)
cat >> /etc/sysctl.conf <<EOF
vm.swappiness = 0
EOF
sysctl -p
cat >> /etc/security/limits.conf <<EOF
* soft nofile 65535
* hard nofile 65535
EOF
# Disable transparent hugepages at boot
cat >> /etc/rc.local <<EOF
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
EOF
chmod +x /etc/rc.local
```
2.2 Passwordless SSH Between Servers
Setting up SSH trust first makes the later file distribution much easier: everything is installed on master and then pushed to the other servers. If you connect with SecureCRT, you can right-click in the command bar and choose "send to all sessions" to run a command in every session at once, which is simpler than typing it on each server by hand.
To set up trust for both root and the hadoop user, run the following once as root and once as the hadoop user:
```shell
# run on every node
ssh-keygen -t rsa        # press Enter through all the prompts
# append the public key to the authorized_keys file
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# restrict permissions on authorized_keys
chmod 600 ~/.ssh/authorized_keys
```
```shell
# run on master only: collect the other nodes' public keys
ssh node-01 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh node-02 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh node-03 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# distribute the merged file to the other nodes
scp ~/.ssh/authorized_keys node-01:~/.ssh/
scp ~/.ssh/authorized_keys node-02:~/.ssh/
scp ~/.ssh/authorized_keys node-03:~/.ssh/
# verify from each node
ssh node-01 date
ssh node-02 date
ssh node-03 date
```
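The per-node `ssh ... date` checks above can be collapsed into a loop. A small sketch; `check_ssh` is a helper name of my own, and `BatchMode=yes` makes ssh fail fast instead of prompting for a password when trust is not yet in place:

```shell
#!/bin/bash
# Verify passwordless SSH to each given host; returns non-zero if any fails.
check_ssh() {
  local rc=0
  for h in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" date >/dev/null 2>&1; then
      echo "$h: OK"
    else
      echo "$h: ssh trust NOT working"
      rc=1
    fi
  done
  return $rc
}
# usage, from master: check_ssh node-01 node-02 node-03
```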
Copy init_env.sh from master to the other servers with scp and run it there:
```shell
scp ~/init_env.sh node-01:~/
```
2.3 JDK Installation
I installed the VMs with the minimal install option. With a desktop or other install type you may first need to remove the bundled OpenJDK; see any guide on uninstalling CentOS's default OpenJDK.
JDK download: https://download.oracle.com/otn/java/jdk/8u311-b11/4d5417147a92418ea8b615e228bb6935/jdk-8u311-linux-x64.tar.gz?AuthParam=1634870696_fcaf73d98aa9bd357f31877eec58e049
Upload the downloaded JDK to master with sftp or rz; I used rz, which can be installed with `yum install -y lrzsz`.
```shell
# run on all nodes
mkdir -p /usr/java/
# upload the jdk tarball with rz
rz
```
Extract the JDK and configure the environment:
```shell
tar -zxvf jdk-8u311-linux-x64.tar.gz
cat >> /etc/profile <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_311
export PATH=$JAVA_HOME/bin:$PATH
EOF
source /etc/profile
```
After the JDK is installed on master, distribute it to the other nodes:
```shell
scp -r jdk1.8.0_311 node-01:/usr/java
scp -r jdk1.8.0_311 node-02:/usr/java
scp -r jdk1.8.0_311 node-03:/usr/java
```
Configure the JDK environment on each node:
```shell
cat >> /etc/profile <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_311
export PATH=$JAVA_HOME/bin:$PATH
EOF
source /etc/profile
```
3. ZooKeeper Installation
3.1 Download and Install
Download: https://downloads.apache.org/zookeeper/zookeeper-3.7.0/apache-zookeeper-3.7.0-bin.tar.gz
Upload it to master and distribute everything from there:
```shell
mkdir -p /usr/local/zookeeper
chown -R hadoop:hadoop /usr/local/zookeeper/
cd /usr/local/zookeeper
# upload the zookeeper tarball with rz
rz
tar -zxvf apache-zookeeper-3.7.0-bin.tar.gz && rm -f apache-zookeeper-3.7.0-bin.tar.gz
scp -r apache-zookeeper-3.7.0-bin node-01:/usr/local/zookeeper/
scp -r apache-zookeeper-3.7.0-bin node-02:/usr/local/zookeeper/
scp -r apache-zookeeper-3.7.0-bin node-03:/usr/local/zookeeper/
```
3.2 Environment Configuration
On all ZooKeeper nodes (node-01, node-02, node-03):
```shell
cat >> /etc/profile <<'EOF'
export ZOOKEEPER_HOME=/usr/local/zookeeper/apache-zookeeper-3.7.0-bin
export PATH=$ZOOKEEPER_HOME/bin:$PATH
EOF
source /etc/profile
```
Edit the ZooKeeper configuration file on node-01:
```shell
su hadoop
cd /usr/local/zookeeper/apache-zookeeper-3.7.0-bin/conf/
cp zoo_sample.cfg zoo.cfg
```
Edit zoo.cfg:
```
dataDir=/data/zookeeper/data/
dataLogDir=/data/zookeeper/log/
server.1=node-01:2888:3888
server.2=node-02:2888:3888
server.3=node-03:2888:3888
```
Distribute it to node-02 and node-03:
```shell
scp zoo.cfg node-02:/usr/local/zookeeper/apache-zookeeper-3.7.0-bin/conf/
scp zoo.cfg node-03:/usr/local/zookeeper/apache-zookeeper-3.7.0-bin/conf/
```
3.3 Create myid
Set myid on each server to match its server.N number in zoo.cfg: 1 on node-01, 2 on node-02, 3 on node-03.
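Since the id can be read off the node-0N hostnames used in this guide, a small helper avoids typing the wrong number on a node. A sketch under that naming assumption; `myid_from_host` is my own helper name:

```shell
#!/bin/bash
# Derive the ZooKeeper myid from a hostname like node-03.
myid_from_host() {
  # strip everything up to "-0": node-01 -> 1, node-02 -> 2, node-03 -> 3
  echo "${1##*-0}"
}
# usage, on each zk node: myid_from_host "$(hostname)" > /data/zookeeper/data/myid
```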
```shell
# on each node; the number matches server.N in zoo.cfg (this is node-01)
echo 1 > /data/zookeeper/data/myid
```
3.4 Start ZooKeeper
Start ZooKeeper on each node:
```shell
zkServer.sh start
zkServer.sh status
```
4. Hadoop Installation
4.1 Download the Package
Download: https://dlcdn.apache.org/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
Upload and extract it (as root):
```shell
tar -zxvf hadoop-3.3.0.tar.gz -C /usr/local/
```
Configure the environment (run on all nodes, as root):
```shell
chown -R hadoop:hadoop /usr/local/hadoop-3.3.0
cat >> /etc/profile <<'EOF'
export HADOOP_HOME=/usr/local/hadoop-3.3.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
EOF
source /etc/profile
```
4.2 Edit the Configuration Files
hadoop-env.sh
Add `export JAVA_HOME=/usr/java/jdk1.8.0_311`:
```shell
cd $HADOOP_HOME/etc/hadoop
vi hadoop-env.sh
# show the effective (non-comment, non-blank) settings
grep -v '^#' hadoop-env.sh | grep -v '^$'
```
core-site.xml
Change it to the following (note that fs.defaultFS must point at the HA nameservice, not a single host, for failover to work):
```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-ha/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
    <description>Local Hadoop temporary directory on the namenode</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
    <description>Size of read/write buffer used in SequenceFiles</description>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node-01:2181,node-02:2181,node-03:2181</value>
    <description>ZooKeeper quorum address</description>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>1000</value>
    <description>Timeout in ms for Hadoop's ZooKeeper sessions</description>
  </property>
</configuration>
```
hdfs-site.xml
```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Number of copies of each block kept in the cluster; a higher factor means better redundancy but more storage used</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/hadoop/dfs/name</value>
    <description>Where the namenode stores the HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hadoop/dfs/data</value>
    <description>Physical location of data blocks on the datanode</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-ha</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hadoop-ha</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-ha.nn1</name>
    <value>master:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-ha.nn1</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-ha.nn2</name>
    <value>node-01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-ha.nn2</name>
    <value>node-01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8485;node-01:8485;node-02:8485;node-03:8485/hadoop-ha</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/hadoop/data/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hadoop-ha</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>60000</value>
  </property>
</configuration>
```
mapred-site.xml
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
    <description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>master:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://master:9001</value>
  </property>
</configuration>
```
yarn-site.xml
```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node-02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node-03</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node-01:2181,node-02:2181,node-03:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>/usr/local/hadoop-3.3.0/etc/hadoop:/usr/local/hadoop-3.3.0/share/hadoop/common/lib/*:/usr/local/hadoop-3.3.0/share/hadoop/common/*:/usr/local/hadoop-3.3.0/share/hadoop/hdfs:/usr/local/hadoop-3.3.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.3.0/share/hadoop/hdfs/*:/usr/local/hadoop-3.3.0/share/hadoop/mapreduce/*:/usr/local/hadoop-3.3.0/share/hadoop/yarn:/usr/local/hadoop-3.3.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.3.0/share/hadoop/yarn/*</value>
  </property>
</configuration>
```
workers
```shell
vi workers
# per the role plan above, workers lists the datanode hosts:
#   node-01
#   node-02
#   node-03
```
4.3 Distribute to the Other Servers
```shell
scp -r /usr/local/hadoop-3.3.0/ node-01:/usr/local/
scp -r /usr/local/hadoop-3.3.0/ node-02:/usr/local/
scp -r /usr/local/hadoop-3.3.0/ node-03:/usr/local/
```
Run on all nodes:
```shell
chown -R hadoop:hadoop /usr/local/hadoop-3.3.0
```
4.4 Start the Hadoop Cluster
Start the journalnodes (on all nodes):
```shell
su hadoop
hadoop-daemon.sh start journalnode
```
Format the namenode (on master):
```shell
hadoop namenode -format
```
Sync the metadata to the standby namenode:
```shell
scp -r /data/hadoop/dfs/name/current/ node-01:/data/hadoop/dfs/name/
```
Format zkfc (on master):
```shell
hdfs zkfc -formatZK
```
Start HDFS (on master):
```shell
start-dfs.sh
```
Start YARN:
```shell
start-yarn.sh
```
Start the MapReduce job history server:
```shell
mr-jobhistory-daemon.sh start historyserver
```
Check the HA state of the HDFS/YARN masters:
```shell
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```
4.5 Web UIs
HDFS: http://192.168.141.100:50070/dfshealth.html#tab-overview
YARN: http://192.168.141.100:8088/cluster
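Besides the web UIs, the active/standby state can be read over the NameNode's HTTP port configured above. A sketch, assuming the stock `Hadoop:service=NameNode,name=NameNodeStatus` JMX bean; `nn_state` is my own helper name:

```shell
#!/bin/bash
# Print a NameNode's HA state ("active" or "standby") via its JMX endpoint.
nn_state() {
  curl -s "http://$1:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus" \
    | grep -o '"State" *: *"[a-z]*"' \
    | grep -o '[a-z]*"$' | tr -d '"'
}
# usage: nn_state master; nn_state node-01
```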
5. Problems Encountered
5.1 ZooKeeper Installation Errors
```
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/apache-zookeeper-3.7.0-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
```
Check the ZooKeeper startup logs. In my case the cause was that myid had been set to the same value on multiple nodes; setting each node's id to match its server.N entry in zoo.cfg fixed it.
5.2 Hadoop Installation Errors
Formatting the namenode
One cause: the HA nameservice name in hdfs-site.xml was configured inconsistently across its properties (dfs.nameservices must match the name used in the per-nameservice keys).
HDFS format error:
```
2021-10-27 01:05:13,690 WARN namenode.NameNode: Encountered exception during format
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 successful responses:
192.168.141.100:8485: false
3 exceptions thrown:
192.168.141.101:8485: Call From master/192.168.141.100 to node-01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
192.168.141.102:8485: Call From master/192.168.141.100 to node-02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
192.168.141.103:8485: Call From master/192.168.141.100 to node-03:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:305)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:282)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:1165)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:211)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1267)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1713)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1821)
2021-10-27 01:05:13,700 ERROR namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 successful responses:
192.168.141.100:8485: false
3 exceptions thrown:
192.168.141.101:8485: Call From master/192.168.141.100 to node-01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
```
Solution: not all journalnodes were running; journalnode must be started on every server before formatting. In my case only the one on master had been started.
Reference:
Hadoop 3 distributed high-availability cluster deployment