1. ZooKeeper cluster already set up — if not, see the step-by-step ZooKeeper cluster setup tutorial.
2. JDK installed on every node — if not, see the tutorial on scripting a quick JDK setup on Linux.
3. The Hadoop 2.6.0 package is under /opt/install.
4. An HA cluster layout table (which daemons run on which nodes).
1. First, extract the Hadoop package:
[root@nnode2 install]# tar -zxf hadoop-2.6.0-cdh5.14.2.tar.gz -C /opt/soft/
To make the Hadoop directory easier to work with later, rename it: mv hadoop-2.6.0-cdh5.14.2/ hadoop260
Go into the etc/hadoop/ directory and edit the env scripts: vim hadoop-env.sh
vim yarn-env.sh
vim mapred-env.sh
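All three env scripts need to point at the JDK the daemons should run under; the one line that matters is JAVA_HOME. A minimal sketch — the /opt/soft/jdk180 install path is an assumption, use your own:

```shell
# Set JAVA_HOME in hadoop-env.sh, yarn-env.sh and mapred-env.sh
# (the JDK path below is an assumption; adjust to your install):
export JAVA_HOME=/opt/soft/jdk180
```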
With the JDK path set in all three files, the main event comes next.
vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/soft/hadoop260/hadooptmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>nnode2:2181,nnode3:2181,nnode4:2181</value>
  </property>
  <property>
    <name>hadoop.proxyuser.bigdata.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.bigdata.groups</name>
    <value>*</value>
  </property>
</configuration>
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>nnode2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>nnode2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>nnode3:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>nnode3:50070</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/soft/hadoop260/journaldata</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://nnode2:8485;nnode3:8485;nnode4:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nnode5:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>nnode5:19888</value>
  </property>
</configuration>
vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>nnode2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>nnode3</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>nnode2:2181,nnode3:2181,nnode4:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
vim slaves — in the slaves file, add the hostnames of the worker nodes.
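A minimal slaves file, assuming nnode3/nnode4/nnode5 are the DataNode/NodeManager hosts — the exact worker list is an assumption, so match it to your own layout table:

```shell
# Write the assumed worker hostnames into slaves (adjust to your cluster):
cat > slaves <<'EOF'
nnode3
nnode4
nnode5
EOF
```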
Use a script to distribute the hadoop260 directory to the other machines (rsync).
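The distribution script itself isn't shown; a dry-run sketch that just prints one rsync command per target host (the hostnames are assumptions — remove the echo to actually copy):

```shell
# Print the rsync commands that would mirror hadoop260 to each worker
# (hostnames are assumptions; drop the echo to execute for real):
for host in nnode3 nnode4 nnode5; do
  echo rsync -av /opt/soft/hadoop260/ root@"$host":/opt/soft/hadoop260/
done
```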
2. Start the ZooKeeper cluster (via script): zkop start
zkop status — check the cluster status
3. Start the JournalNodes. On nnode2: hadoop-daemon.sh start journalnode
ssh nnode3 "source /etc/profile; hadoop-daemon.sh start journalnode"
ssh nnode4 "source /etc/profile; hadoop-daemon.sh start journalnode"
4. Format HDFS (on nnode2 only): hadoop namenode -format
Then sync the hadooptmp directory produced by the format on nnode2 over to nnode3:
[root@nnode2 hadoop260]# scp -r hadooptmp/ root@nnode3:/opt/soft/hadoop260/
5. Initialize the HA state in ZooKeeper (on nnode2):
hdfs zkfc -formatZK
6. Start HDFS:
start-dfs.sh
7. Start YARN (note: in Hadoop 2.x, start-yarn.sh only starts the ResourceManager on the local node; on nnode3 the standby ResourceManager may need to be started separately with yarn-daemon.sh start resourcemanager):
start-yarn.sh
Finally, use the script to check the process list (jps) on each node.
Setup complete! Remember to open both NameNode web UIs in a browser (nnode2:50070 and nnode3:50070) — one should show active and the other standby. You can also check the state from the shell with hdfs haadmin -getServiceState nn1.