- I. Environment Overview
- 1. Environment components
- 2. Component functions and roles
- II. Downloads
- 1. JDK 1.8 download
- 2. ZooKeeper 3.4.8 download
- 3. Hadoop 2.6.5 download
- III. Base environment setup
- 1. hadoop user configuration
- 2. Passwordless SSH login
- 2.1. JDK 1.8 installation
- 2.2. ZooKeeper installation
- 2.3. Hadoop installation
- 3. Verifying the environment
- IV. Hadoop configuration
- 1. Hadoop environment variable files
- 2. Hadoop configuration files
- V. Startup and testing
- 1. Initializing the cluster
- 2. Management script
- 3. Cluster startup tests
- 3.1. Using the management script
- 3.2. NameNode failover test
- 3.3. ResourceManager state test
- 4. Web UI access
I. Environment Overview

1. Environment components
We use three Linux servers running CentOS 7:
| IP address | Host | Namenode | Datanode | Zookeeper | DFSZKFailoverController (ZKFC) | Journalnode | Resourcemanager | Nodemanager | JobHistory |
|---|---|---|---|---|---|---|---|---|---|
| 10.20.123.1 | bai1 | √ | √ | √ | √ | √ | | √ | √ |
| 10.20.123.2 | bai2 | √ | √ | √ | √ | √ | √ | √ | |
| 10.20.123.3 | bai3 | | √ | √ | | √ | √ | √ | |
- Zookeeper: a coordination service for distributed applications; it provides synchronization, configuration maintenance, and naming services on top of its Zab atomic-broadcast protocol (often loosely described as Paxos-based)
- Namenode: manages the HDFS namespace and monitors the DataNodes
- DFSZKFailoverController (ZKFC): monitors the state of its NameNode and keeps that state written to ZooKeeper; when the active NameNode fails, it performs the failover
- Journalnode: stores the NameNode edit log (metadata journal)
- Datanode: storage node; blocks are kept in multiple replicas
- Resourcemanager: schedules resources across the NodeManagers
- Nodemanager: manages the compute resources of its own node
- Jobhistory: keeps the run logs of finished MapReduce jobs
II. Downloads

1. JDK 1.8 download

JDK 1.8 package, extraction code: LWXB

2. ZooKeeper 3.4.8 download

ZooKeeper 3.4.8 package, extraction code: LWXB

3. Hadoop 2.6.5 download

Hadoop 2.6.5 package, extraction code: LWXB
III. Base environment setup

1. hadoop user configuration

Apply the same configuration on all three servers.

[root@bai1 ~] vim /etc/hosts
# Append the name resolution entries to hosts
10.20.123.1 bai1
10.20.123.2 bai2
10.20.123.3 bai3
[root@bai1 ~] useradd hadoop
[root@bai1 ~] passwd hadoop
[root@bai1 ~] chmod 640 /etc/sudoers
[root@bai1 ~] vim /etc/sudoers
# Add the following line
hadoop ALL=(ALL) ALL
[root@bai1 ~] chmod 440 /etc/sudoers
# Switch to the hadoop user and upload the packages to the server
[root@bai1 ~] su hadoop
[hadoop@bai1 ~]$ ll
-rw-r--r-- 1 hadoop hadoop 199635269 Nov 30 2020 hadoop-2.6.5.tar.gz
-rw-r--r-- 1 hadoop hadoop 181442359 Jun 12 2020 jdk-8u111-linux-x64.tar.gz
-rw-r--r-- 1 hadoop hadoop 22261552 May 10 2021 zookeeper-3.4.8.tar.gz
[hadoop@bai1 ~]$ mkdir service

2. Passwordless SSH login
# The password asked for when copying the keys is the hadoop user's password
[hadoop@bai1 ~]$ ssh-keygen -t rsa
[hadoop@bai1 ~]$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
[hadoop@bai1 ~]$ ssh-copy-id -i bai2
[hadoop@bai1 ~]$ ssh-copy-id -i bai3
[hadoop@bai2 ~]$ ssh-keygen -t rsa
[hadoop@bai2 ~]$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
[hadoop@bai2 ~]$ ssh-copy-id -i bai1
[hadoop@bai2 ~]$ ssh-copy-id -i bai3
[hadoop@bai3 ~]$ ssh-keygen -t rsa
[hadoop@bai3 ~]$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
[hadoop@bai3 ~]$ ssh-copy-id -i bai1
[hadoop@bai3 ~]$ ssh-copy-id -i bai2
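To confirm the keys landed correctly, here is a minimal check loop (a sketch; `-o BatchMode=yes` makes ssh fail instead of prompting, so any leftover password prompt shows up as an error):

```bash
# Run from each of the three nodes: should print all three hostnames
# without ever asking for a password
for host in bai1 bai2 bai3; do
  ssh -o BatchMode=yes "hadoop@$host" hostname
done
```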
2.1. JDK 1.8 installation

[hadoop@bai1 ~]$ tar xzvf jdk-8u111-linux-x64.tar.gz
[hadoop@bai1 ~]$ mv jdk1.8.0_111 service/jdk1.8
[hadoop@bai1 ~]$ vim .bash_profile
# Append at the end of .bash_profile
export JAVA_HOME=/home/hadoop/service/jdk1.8
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=$PATH:$HOME/.local/bin:$HOME/bin:${JAVA_HOME}/bin
[hadoop@bai1 ~]$ source .bash_profile
[hadoop@bai1 ~]$ java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
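The same JDK has to be present on bai2 and bai3 as well. A minimal sketch for copying the unpacked directory over, assuming the passwordless SSH configured above (the `.bash_profile` additions still need to be repeated by hand on each host):

```bash
# Push the unpacked JDK from bai1 to the other two nodes
for host in bai2 bai3; do
  ssh "$host" "mkdir -p ~/service"
  scp -r ~/service/jdk1.8 "$host":~/service/
done
```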
2.2. ZooKeeper installation
[hadoop@bai1 ~]$ tar xzvf zookeeper-3.4.8.tar.gz
[hadoop@bai1 ~]$ mv zookeeper-3.4.8 service/zookeeper
[hadoop@bai1 ~]$ vim service/zookeeper/conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
# dataDir must be the directory that holds the myid file created below
dataDir=/home/hadoop/service/zookeeper/data
clientPort=2181
server.1=bai1:2888:3888
server.2=bai2:2888:3888
server.3=bai3:2888:3888
[hadoop@bai1 ~]$ mkdir service/zookeeper/data
[hadoop@bai1 ~]$ echo 1 > service/zookeeper/data/myid
# echo 1, 2, 3 into myid on bai1, bai2, bai3, following the order of the server.N entries in zoo.cfg
[hadoop@bai1 ~]$ vim service/zookeeper/bin/zkEnv.sh
# Add the Java home directory
JAVA_HOME="/home/hadoop/service/jdk1.8"
[hadoop@bai1 ~]$ vim .bash_profile
# Append after the JAVA_HOME entries added earlier
export ZK_HOME=/home/hadoop/service/zookeeper
export PATH=$PATH:$HOME/.local/bin:$HOME/bin:${JAVA_HOME}/bin:$ZK_HOME/bin
[hadoop@bai1 ~]$ vim service/zookeeper/bin/auto-zk.sh    # Write a start/stop script
#!/bin/bash
zkbin=$ZK_HOME/bin
serverlist=`cat $zkbin/../conf/zoo.cfg|grep ^server|awk -F= '{print $2}'|awk -F: '{print $1}'`
user=hadoop    # Log in to each node as our hadoop user
for server in ${serverlist[@]}; do
echo -e "nHost [$server]:"
ssh $user@$server "$zkbin/zkServer.sh $1"
done
[hadoop@bai1 ~]$ chmod +x service/zookeeper/bin/auto-zk.sh    # Make the script executable
[hadoop@bai1 ~]$ source .bash_profile
[hadoop@bai1 ~]$ auto-zk.sh start     # Start the ensemble
[hadoop@bai1 ~]$ auto-zk.sh status    # Check its status
Host [bai1]:
ZooKeeper JMX enabled by default
Using config: /home/hadoop/service/zookeeper/bin/../conf/zoo.cfg
Mode: follower
Host [bai2]:
ZooKeeper JMX enabled by default
Using config: /home/hadoop/service/zookeeper/bin/../conf/zoo.cfg
Mode: leader
Host [bai3]:
ZooKeeper JMX enabled by default
Using config: /home/hadoop/service/zookeeper/bin/../conf/zoo.cfg
Mode: follower
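As an extra liveness check (an optional sketch, assuming `nc` is installed), each ZooKeeper server answers the four-letter command `ruok` on its client port:

```bash
# A healthy server replies "imok"
for host in bai1 bai2 bai3; do
  echo -n "$host: "
  echo ruok | nc "$host" 2181
  echo
done
```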
2.3. Hadoop installation
[hadoop@bai1 ~]$ tar xzvf hadoop-2.6.5.tar.gz
[hadoop@bai1 ~]$ mv hadoop-2.6.5 service/hadoop
[hadoop@bai1 ~]$ vim .bash_profile
# Append after the JAVA_HOME and ZK_HOME entries added earlier
export HADOOP_HOME=/home/hadoop/service/hadoop
export PATH=$PATH:$HOME/.local/bin:$HOME/bin:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$ZK_HOME/bin
[hadoop@bai1 ~]$ source .bash_profile
[hadoop@bai1 ~]$ hadoop version
Hadoop 2.6.5
Subversion https://github.com/apache/hadoop.git -r e2a9fe0r6t252czf2ebf1454405577650f113497
Compiled by sjlee on 2016-10-02T23:43Z
Compiled with protoc 2.5.0
From source with checksum f05v0qa095a395faa9de2j7ba5j954
This command was run using /home/hadoop/service/hadoop/share/hadoop/common/hadoop-common-2.6.5.jar
3. Verifying the environment
[hadoop@bai1 ~]$ mkdir -p service/hadoop/data/ha/{jn,tmp}
[hadoop@bai1 ~]$ mkdir -p service/hadoop/data/hadoop-yarn/staging/history/{done,done_intermediate}
# Check again on all three servers that the environment is correct and passwordless SSH works
[hadoop@bai1 ~]$ java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
[hadoop@bai1 ~]$ auto-zk.sh status    # Should report from every node, all in a running state
[hadoop@bai1 ~]$ hadoop version
Hadoop 2.6.5
Subversion https://github.com/apache/hadoop.git -r e2a9fe0r6t252czf2ebf1454405577650f113497
Compiled by sjlee on 2016-10-02T23:43Z
Compiled with protoc 2.5.0
From source with checksum f05v0qa095a395faa9de2j7ba5j954
This command was run using /home/hadoop/service/hadoop/share/hadoop/common/hadoop-common-2.6.5.jar
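The same checks can be run on every node in one shot over SSH (a sketch; `.bash_profile` is sourced explicitly because non-interactive SSH sessions do not load it):

```bash
# Print the Java and Hadoop versions from each host
for host in bai1 bai2 bai3; do
  echo "=== $host ==="
  ssh "$host" 'source ~/.bash_profile; java -version 2>&1 | head -1; hadoop version | head -1'
done
```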
IV. Hadoop configuration
1. Hadoop environment variable files

Set JAVA_HOME in each of the three environment scripts:
~/service/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/hadoop/service/jdk1.8
~/service/hadoop/etc/hadoop/mapred-env.sh
export JAVA_HOME=/home/hadoop/service/jdk1.8
~/service/hadoop/etc/hadoop/yarn-env.sh
export JAVA_HOME=/home/hadoop/service/jdk1.8
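Since it is the same one-line change in all three files, here is a small sketch to apply it in one pass (appending works because the last definition in a shell script wins):

```bash
# Append the JAVA_HOME export to each of the three env scripts
cd ~/service/hadoop/etc/hadoop
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/home/hadoop/service/jdk1.8' >> "$f"
done
```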
2. Hadoop configuration files

~/service/hadoop/etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/service/hadoop/data/ha/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>bai1:2181,bai2:2181,bai3:2181</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
</configuration>
~/service/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>bai1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>bai2:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>bai1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>bai2:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://bai1:8485;bai2:8485;bai3:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/service/hadoop/data/ha/jn</value>
    </property>
    <!-- the property name is dfs.permissions.enabled; "dfs.permissions.enable" in the original was a typo -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.directoryscan.throttle.limit.ms.per.sec</name>
        <value>1000</value>
    </property>
    <property>
        <name>dfs.datanode.max.transfer.threads</name>
        <value>8192</value>
    </property>
</configuration>
~/service/hadoop/etc/hadoop/mapred-site.xml (this file does not exist by default; copy mapred-site.xml.template to mapred-site.xml)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>bai1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>bai1:19888</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.joblist.cache.size</name>
        <value>20000</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.staging-dir</name>
        <value>/home/hadoop/service/hadoop/data/hadoop-yarn/staging</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
    </property>
</configuration>
~/service/hadoop/etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>rmCluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>bai2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>bai3</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>bai1:2181,bai2:2181,bai3:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>
~/service/hadoop/etc/hadoop/slaves

bai1
bai2
bai3
That basically wraps up the configuration.
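The finished configuration has to exist on all three machines. A minimal sketch for pushing it out from bai1 (assuming `rsync` is available; `scp -r` works just as well):

```bash
# Sync the whole Hadoop config directory to the other nodes
for host in bai2 bai3; do
  rsync -av ~/service/hadoop/etc/hadoop/ "$host":~/service/hadoop/etc/hadoop/
done
```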
V. Startup and testing

1. Initializing the cluster

JournalNode

# Journalnodes must run on all three machines. The two commands below achieve the same end result; pick one.
[hadoop@bai1 ~]$ hadoop-daemon.sh start journalnode     # Start a single journalnode on the local machine
[hadoop@bai1 ~]$ hadoop-daemons.sh start journalnode    # Run once to start the journalnodes across the cluster
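Before formatting anything, it is worth confirming that a journalnode really is up on each host (a quick sketch, reusing the jps path from earlier):

```bash
# Each host should list one JournalNode process
for host in bai1 bai2 bai3; do
  echo -n "$host: "
  ssh "$host" '/home/hadoop/service/jdk1.8/bin/jps | grep JournalNode'
done
```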
NameNode

The NameNode commands are run on different machines:

# On a brand-new cluster, format HDFS on the primary namenode first
[hadoop@bai1 ~]$ hdfs namenode -format
[hadoop@bai1 ~]$ hadoop-daemon.sh start namenode    # Start the primary namenode
# After the primary is up, run this on the standby node
[hadoop@bai2 ~]$ hdfs namenode -bootstrapStandby
DFSZKFailoverController (ZKFC)

# Make sure the zookeeper ensemble is up and running
[hadoop@bai1 ~]$ hdfs zkfc -formatZK
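To confirm the formatZK step took effect (an optional sketch), the nameservice should now show up under `/hadoop-ha` in ZooKeeper:

```bash
# Expect [mycluster] in the listing near the end of the output
[hadoop@bai1 ~]$ zkCli.sh -server bai1:2181 ls /hadoop-ha
```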
If nothing has reported an error up to this point, cluster initialization is complete.
2. Management script

For convenience, you can manage the cluster with a script; Hadoop does, of course, also ship its own one-shot cluster start scripts.
[hadoop@bai1 ~]$ vim service/hadoop/bin/auto-hdp.sh
#!/bin/bash
sbindir=$HADOOP_HOME/sbin
nodelist=(bai1 bai2 bai3)    # Change to your own hostnames
case $1 in
(start)
ssh ${nodelist[0]} $sbindir/$1-dfs.sh
ssh ${nodelist[1]} $sbindir/$1-yarn.sh
ssh ${nodelist[2]} $sbindir/yarn-daemon.sh $1 resourcemanager
ssh ${nodelist[0]} $sbindir/mr-jobhistory-daemon.sh $1 historyserver
;;
(stop)
ssh ${nodelist[0]} $sbindir/mr-jobhistory-daemon.sh $1 historyserver
ssh ${nodelist[1]} $sbindir/$1-yarn.sh
ssh ${nodelist[2]} $sbindir/yarn-daemon.sh $1 resourcemanager
ssh ${nodelist[0]} $sbindir/$1-dfs.sh
;;
(status)
for node in ${nodelist[@]}; do
echo -e "nHost [$node]:"
ssh $node "/home/hadoop/service/jdk1.8/bin/jps|sort|grep -v Jps"
done
;;
esac
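The script needs execute permission before the invocations in the next section will work (a small step the original likely assumes):

```bash
[hadoop@bai1 ~]$ chmod +x service/hadoop/bin/auto-hdp.sh
```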
3. Cluster startup tests
3.1. Using the management script
auto-hdp.sh start
bai1: starting namenode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-namenode-bai1.out
bai2: starting namenode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-namenode-bai2.out
bai2: starting datanode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-datanode-bai2.out
bai1: starting datanode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-datanode-bai1.out
bai3: starting datanode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-datanode-bai3.out
Starting journal nodes [bai1 bai2 bai3]
bai3: starting journalnode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-journalnode-bai3.out
bai1: starting journalnode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-journalnode-bai1.out
bai2: starting journalnode, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-journalnode-bai2.out
Starting ZK Failover Controllers on NN hosts [bai1 bai2]
bai1: starting zkfc, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-zkfc-bai1.out
bai2: starting zkfc, logging to /home/hadoop/service/hadoop/logs/hadoop-hadoop-zkfc-bai2.out
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/service/hadoop/logs/yarn-hadoop-resourcemanager-bai2.out
bai2: starting nodemanager, logging to /home/hadoop/service/hadoop/logs/yarn-hadoop-nodemanager-bai2.out
bai1: starting nodemanager, logging to /home/hadoop/service/hadoop/logs/yarn-hadoop-nodemanager-bai1.out
bai3: starting nodemanager, logging to /home/hadoop/service/hadoop/logs/yarn-hadoop-nodemanager-bai3.out
starting resourcemanager, logging to /home/hadoop/service/hadoop/logs/yarn-hadoop-resourcemanager-bai3.out
starting historyserver, logging to /home/hadoop/service/hadoop/logs/mapred-hadoop-historyserver-bai1.out
auto-hdp.sh status
Host [bai1]:
290172 NameNode
290280 DataNode
290478 JournalNode
290670 DFSZKFailoverController
290777 NodeManager
290943 JobHistoryServer

Host [bai2]:
802459 NameNode
802583 DataNode
802752 JournalNode
802947 DFSZKFailoverController
803103 ResourceManager
803236 NodeManager

Host [bai3]:
353827 DataNode
353902 JournalNode
354025 NodeManager
354141 ResourceManager
auto-hdp.sh stop
stopping historyserver
stopping yarn daemons
stopping resourcemanager
bai2: stopping nodemanager
bai1: stopping nodemanager
bai3: stopping nodemanager
no proxyserver to stop
stopping resourcemanager
Stopping namenodes on [bai1 bai2]
bai2: stopping namenode
bai1: stopping namenode
bai2: stopping datanode
bai1: stopping datanode
bai3: stopping datanode
Stopping journal nodes [bai1 bai2 bai3]
bai1: stopping journalnode
bai3: stopping journalnode
bai2: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [bai1 bai2]

3.2. NameNode failover test
Here we test the cluster's high availability.
# Earlier we defined nn1 as bai1 and nn2 as bai2
[hadoop@bai1 ~]$ hdfs haadmin -getServiceState nn1
active
[hadoop@bai1 ~]$ hdfs haadmin -getServiceState nn2
standby
# bai2 is on standby and bai1 is active; kill bai1's namenode and watch bai2
[hadoop@bai1 ~]$ jps
291299 Jps
290172 NameNode
290280 DataNode
290478 JournalNode
290670 DFSZKFailoverController
290777 NodeManager
290943 JobHistoryServer
[hadoop@bai1 ~]$ kill -9 290172
# With nn1 killed, check again
[hadoop@bai1 ~]$ hdfs haadmin -getServiceState nn1
21/12/14 10:55:54 INFO ipc.Client: Retrying connect to server: miguan3/10.20.140.34:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From miguan3/10.20.140.34 to miguan3:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[hadoop@bai1 ~]$ hdfs haadmin -getServiceState nn2
active
# nn2 has become active while nn1 is down. What happens if we start nn1 again?
[hadoop@bai1 ~]$ hadoop-daemon.sh start namenode
[hadoop@bai1 ~]$ hdfs haadmin -getServiceState nn1
standby    # nn1 is now the standby
[hadoop@bai1 ~]$ hdfs haadmin -getServiceState nn2
active     # nn2 remains the active node

3.3. ResourceManager state test
[hadoop@bai1 ~]$ service/hadoop/bin/yarn rmadmin -getServiceState rm1
active     # rm1 is bai2, currently active
[hadoop@bai1 ~]$ service/hadoop/bin/yarn rmadmin -getServiceState rm2
standby    # rm2 is bai3, on standby
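The same failover behaviour can be exercised for YARN (a sketch mirroring the NameNode test above; the ResourceManager PID on bai2 is taken from jps):

```bash
# On bai2: kill the active ResourceManager
[hadoop@bai2 ~]$ jps | grep ResourceManager | awk '{print $1}' | xargs kill -9
# rm2 (bai3) should now report active
[hadoop@bai1 ~]$ yarn rmadmin -getServiceState rm2
```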
The script works as expected. With the cluster up, let's visit a few web pages to test.
4. Web UI access

bai1:50070
bai2:50070
bai1:19888
bai2:8088
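From a shell, reachability of the same pages can be checked with curl (a sketch; these are the NameNode, JobHistory, and ResourceManager web UIs listed above):

```bash
# Expect an HTTP 200 (the YARN UI may answer with a 302 redirect)
for url in bai1:50070 bai2:50070 bai1:19888 bai2:8088; do
  echo -n "$url -> "
  curl -s -o /dev/null -w '%{http_code}\n' "http://$url"
done
```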



