Prerequisites:
Setting up a Hadoop cluster with Docker: writing the Dockerfile
Setting up a Hadoop cluster with Docker: building and running the image
Setting up a Hadoop cluster with Docker: configuring passwordless SSH login
Setting up a Hadoop cluster with Docker: configuring ZooKeeper
1. Configure the Hadoop environment (perform these steps on all 5 nodes)
Enter the /root/hadoop/etc/hadoop directory:
cd /root/hadoop/etc/hadoop
Add JAVA_HOME to hadoop-env.sh, mapred-env.sh, and yarn-env.sh:
vim hadoop-env.sh
vim mapred-env.sh
vim yarn-env.sh
export JAVA_HOME=/root/jdk11
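Since the same export line goes into all three files, a small loop run from /root/hadoop/etc/hadoop on each node saves some editing; this is just a convenience sketch, assuming the JDK really lives at /root/jdk11 as set up in the earlier Dockerfile article:

for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  # append the JAVA_HOME export to each env script
  echo 'export JAVA_HOME=/root/jdk11' >> "$f"
done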
Edit the configuration files core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml as follows:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/hadoop/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181,hadoop05:2181</value>
  </property>
</configuration>
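ha.zookeeper.quorum must point at a running ZooKeeper ensemble, so it is worth confirming that ZooKeeper (configured in the prerequisite article) is up before continuing. A quick check to run on each node, assuming zkServer.sh is on the PATH:

zkServer.sh status
# each node should report "Mode: leader" or "Mode: follower"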
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>ns</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns.nn1</name>
    <value>hadoop01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns.nn1</name>
    <value>hadoop01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns.nn2</name>
    <value>hadoop02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns.nn2</name>
    <value>hadoop02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485;hadoop04:8485;hadoop05:8485/ns</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/root/hadoop/journal/data</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
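The sshfence method logs into the other NameNode using the key named in dfs.ha.fencing.ssh.private-key-files, so fencing only works if that key already allows passwordless root login between hadoop01 and hadoop02 (the shell(/bin/true) entry is the fallback when SSH fencing fails). A quick sanity check from hadoop01:

ssh -i /root/.ssh/id_rsa -o BatchMode=yes root@hadoop02 hostname
# should print "hadoop02" without asking for a password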
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/root/hadoop/</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/root/hadoop/</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/root/hadoop/</value>
  </property>
</configuration>
yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181,hadoop05:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop02:8088</value>
  </property>
</configuration>
Edit the workers configuration file so that it contains the following (one hostname per line):
vim workers
hadoop01
hadoop02
hadoop03
hadoop04
hadoop05
Start the JournalNode (must be run on all 5 nodes):
hadoop-daemon.sh start journalnode
To stop the JournalNode:
hadoop-daemon.sh stop journalnode
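The JournalNodes must all be running before the NameNode is formatted in the next step. Instead of logging into each container, you can start them in one loop from hadoop01; a sketch assuming passwordless SSH and that Hadoop's sbin directory is on each node's PATH (on Hadoop 3, hdfs --daemon start journalnode is the non-deprecated equivalent of hadoop-daemon.sh):

for host in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
  # start a JournalNode on every node of the quorum
  ssh root@$host "hadoop-daemon.sh start journalnode"
done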
2. Run the following commands on the hadoop01 node
Format the NameNode:
hdfs namenode -format
Enter /root/hadoop and remotely copy the tmp folder to /root/hadoop on hadoop02:
cd /root/hadoop
scp -r tmp/ root@hadoop02:/root/hadoop
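Copying tmp gives the standby NameNode the same metadata, including the clusterID, that the format step just produced. To confirm the copy worked, compare the VERSION files on both nodes; a check assuming the default dfs.namenode.name.dir layout under hadoop.tmp.dir:

# run on both hadoop01 and hadoop02; the clusterID lines must match
cat /root/hadoop/tmp/dfs/name/current/VERSION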
Format the ZKFC (ZooKeeper Failover Controller):
hdfs zkfc -formatZK
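This creates the znode that the failover controllers use to elect the active NameNode. You can verify it from any node with the ZooKeeper CLI, assuming zkCli.sh is on the PATH:

zkCli.sh -server hadoop01:2181 ls /hadoop-ha
# should list [ns], the nameservice registered for leader election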
Edit start-dfs.sh:
vim start-dfs.sh
Add the following lines:
HDFS_NAMENODE_USER=root
HDFS_DATANODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
Edit start-yarn.sh:
vim start-yarn.sh
Add the following lines:
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root
Start HDFS:
start-dfs.sh
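Once HDFS is up, you can check which NameNode became active (normally nn1 on the first start; the other should report standby):

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2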
Start YARN:
start-yarn.sh
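The same kind of check works for the ResourceManagers; one of rm1/rm2 should report active and the other standby:

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2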
Use jps to check the processes on each node. Given the configuration above, each node should show roughly the following processes (plus Jps itself):
hadoop01: NameNode, DataNode, JournalNode, DFSZKFailoverController, ResourceManager, NodeManager, QuorumPeerMain
hadoop02: NameNode, DataNode, JournalNode, DFSZKFailoverController, ResourceManager, NodeManager, QuorumPeerMain
hadoop03: DataNode, JournalNode, NodeManager, QuorumPeerMain
hadoop04: DataNode, JournalNode, NodeManager, QuorumPeerMain
hadoop05: DataNode, JournalNode, NodeManager, QuorumPeerMain
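As a final check of automatic failover, you can stop the active NameNode and watch the standby take over; a sketch assuming nn1 on hadoop01 is currently active:

# on hadoop01: stop the active NameNode (Hadoop 3 daemon syntax)
hdfs --daemon stop namenode
# within a few seconds the ZKFC should promote nn2 to active
hdfs haadmin -getServiceState nn2
# bring nn1 back; it rejoins as standby
hdfs --daemon start namenode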



