Create a software folder under /opt.
Upload the Hadoop installation package to the software folder.
Extract the package into /usr/local/src.
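The three steps above can be sketched as a small shell function. The function name is hypothetical, and the archive name hadoop-2.7.1.tar.gz is an assumption (the version is inferred from the examples jar used later in this guide). Renaming the extracted directory to hadoop is also an assumption, made so that the path matches HADOOP_HOME.

```shell
# Sketch only: install_hadoop and the archive name are assumptions.
# Usage: install_hadoop <archive.tar.gz> <target-dir>
install_hadoop() {
    archive=$1
    target=$2
    mkdir -p "$target" || return 1
    # Extract into the target directory (this guide uses /usr/local/src).
    tar -xzf "$archive" -C "$target" || return 1
    # Rename hadoop-<version> to hadoop so that <target>/hadoop is a valid HADOOP_HOME.
    top=$(tar -tzf "$archive" | head -n 1 | cut -d/ -f1)
    if [ "$top" != "hadoop" ] && [ ! -e "$target/hadoop" ]; then
        mv "$target/$top" "$target/hadoop"
    fi
}

# Example call (as root on master):
# mkdir -p /opt/software
# install_hadoop /opt/software/hadoop-2.7.1.tar.gz /usr/local/src
```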
Edit the environment variables (in /etc/profile, the file that is distributed to the other nodes later).
# hadoop environment
# HADOOP_HOME points to the Hadoop installation directory
export HADOOP_HOME=/usr/local/src/hadoop
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Reload the profile so that the environment variables take effect.
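Assuming the variables were added to /etc/profile, running source /etc/profile applies them to the current shell. A small check such as the one below can confirm the result; the function name is hypothetical. It catches a misspelled PATH entry such as sbina.

```shell
# Hypothetical helper: succeeds only if HADOOP_HOME is set and both its bin and
# sbin directories are on PATH.
check_hadoop_env() {
    [ -n "$HADOOP_HOME" ] || { echo "HADOOP_HOME is not set"; return 1; }
    for d in "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
        case ":$PATH:" in
            *":$d:"*) ;;
            *) echo "$d is missing from PATH"; return 1 ;;
        esac
    done
    echo "Hadoop environment looks consistent"
}

# Typical use after editing /etc/profile:
# source /etc/profile && check_hadoop_env
```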
Change to the hadoop directory that holds the configuration files.
Edit hadoop-env.sh.
vim hadoop-env.sh
export JAVA_HOME=/usr/local/src/java
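For a non-interactive alternative to editing in vim, a sketch like the following could set JAVA_HOME in place. The function name is hypothetical, and the etc/hadoop location in the example assumes the standard Hadoop 2.x layout.

```shell
# Hypothetical helper: set JAVA_HOME in a hadoop-env.sh file without an editor.
# Usage: set_java_home <path/to/hadoop-env.sh> <java-home>
set_java_home() {
    file=$1
    java_home=$2
    if grep -q '^export JAVA_HOME=' "$file"; then
        # Replace the existing export line; a .bak copy of the original is kept.
        sed -i.bak "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$file"
    else
        echo "export JAVA_HOME=$java_home" >> "$file"
    fi
}

# Example:
# set_java_home /usr/local/src/hadoop/etc/hadoop/hadoop-env.sh /usr/local/src/java
```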
Configure core-site.xml with the following properties (name, then value).
fs.defaultFS                      hdfs://mycluster
hadoop.tmp.dir                    file:/usr/local/src/hadoop/tmp
ha.zookeeper.quorum               master:2181,slave1:2181,slave2:2181
ha.zookeeper.session-timeout.ms   30000 (milliseconds)
fs.trash.interval                 1440
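The property list above maps onto Hadoop's XML configuration format, in which each pair becomes one property element. As a minimal sketch, the file could be generated like this; the helper names prop and core_site are hypothetical, and the values are the ones listed in this guide.

```shell
# Print one <property> element (hypothetical helper).
prop() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a complete core-site.xml with the values from this guide.
core_site() {
    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<configuration>'
    prop fs.defaultFS hdfs://mycluster
    prop hadoop.tmp.dir file:/usr/local/src/hadoop/tmp
    prop ha.zookeeper.quorum master:2181,slave1:2181,slave2:2181
    prop ha.zookeeper.session-timeout.ms 30000
    prop fs.trash.interval 1440
    echo '</configuration>'
}

# Example:
# core_site > /usr/local/src/hadoop/etc/hadoop/core-site.xml
```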
Edit hdfs-site.xml with the following properties (name, then value). The three data directories match the ones created under /usr/local/src/hadoop/tmp/hdfs later in this guide.
dfs.qjournal.start-segment.timeout.ms             60000
dfs.nameservices                                  mycluster
dfs.ha.namenodes.mycluster                        master,slave1
dfs.namenode.rpc-address.mycluster.master         master:8020
dfs.namenode.rpc-address.mycluster.slave1         slave1:8020
dfs.namenode.http-address.mycluster.master        master:50070
dfs.namenode.http-address.mycluster.slave1        slave1:50070
dfs.namenode.shared.edits.dir                     qjournal://master:8485;slave1:8485;slave2:8485/mycluster
dfs.client.failover.proxy.provider.mycluster      org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.fencing.methods                            sshfence and shell(/bin/true), one method per line
dfs.permissions.enabled                           false
dfs.support.append                                true
dfs.ha.fencing.ssh.private-key-files              /root/.ssh/id_rsa
dfs.replication                                   2
dfs.namenode.name.dir                             /usr/local/src/hadoop/tmp/hdfs/nn
dfs.datanode.data.dir                             /usr/local/src/hadoop/tmp/hdfs/dn
dfs.journalnode.edits.dir                         /usr/local/src/hadoop/tmp/hdfs/jn
dfs.ha.automatic-failover.enabled                 true
dfs.webhdfs.enabled                               true
dfs.ha.fencing.ssh.connect-timeout                30000
ha.failover-controller.cli-check.rpc-timeout.ms   60000
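A sketch of how the hdfs-site.xml properties could be written out as XML follows. The helper names are hypothetical. The data directory defaults to /usr/local/src/hadoop/tmp/hdfs, which is an assumption taken from the mkdir step later in this guide. Note that the qjournal URI must be quoted in the shell because it contains semicolons.

```shell
# Print one <property> element (hypothetical helper).
prop() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print hdfs-site.xml; the optional argument is the parent of the nn, dn and jn directories.
hdfs_site() {
    data=${1:-/usr/local/src/hadoop/tmp/hdfs}
    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<configuration>'
    prop dfs.qjournal.start-segment.timeout.ms 60000
    prop dfs.nameservices mycluster
    prop dfs.ha.namenodes.mycluster master,slave1
    prop dfs.namenode.rpc-address.mycluster.master master:8020
    prop dfs.namenode.rpc-address.mycluster.slave1 slave1:8020
    prop dfs.namenode.http-address.mycluster.master master:50070
    prop dfs.namenode.http-address.mycluster.slave1 slave1:50070
    prop dfs.namenode.shared.edits.dir 'qjournal://master:8485;slave1:8485;slave2:8485/mycluster'
    prop dfs.client.failover.proxy.provider.mycluster \
        org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
    # Fencing methods are listed one per line inside the value.
    prop dfs.ha.fencing.methods "$(printf 'sshfence\nshell(/bin/true)')"
    prop dfs.permissions.enabled false
    prop dfs.support.append true
    prop dfs.ha.fencing.ssh.private-key-files /root/.ssh/id_rsa
    prop dfs.replication 2
    prop dfs.namenode.name.dir "$data/nn"
    prop dfs.datanode.data.dir "$data/dn"
    prop dfs.journalnode.edits.dir "$data/jn"
    prop dfs.ha.automatic-failover.enabled true
    prop dfs.webhdfs.enabled true
    prop dfs.ha.fencing.ssh.connect-timeout 30000
    prop ha.failover-controller.cli-check.rpc-timeout.ms 60000
    echo '</configuration>'
}

# Example:
# hdfs_site > /usr/local/src/hadoop/etc/hadoop/hdfs-site.xml
```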
Copy mapred-site.xml.template to mapred-site.xml.
Edit mapred-site.xml with the following properties (name, then value).
mapreduce.framework.name              yarn
mapreduce.jobhistory.address          master:10020
mapreduce.jobhistory.webapp.address   master:19888
Create or edit the slaves file.
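The guide does not list the contents of the slaves file. A sketch under the assumption that all three nodes run a DataNode and NodeManager is shown below; the function name is hypothetical and the host list should be adjusted to the actual cluster.

```shell
# Hypothetical helper: write one worker hostname per line to the slaves file.
# Usage: write_slaves <path/to/slaves> <host>...
write_slaves() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# Example (the host list is an assumption):
# cd /usr/local/src/hadoop/etc/hadoop
# cp mapred-site.xml.template mapred-site.xml
# write_slaves slaves master slave1 slave2
```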
Give ownership of the Hadoop directory to the hadoop user.
chown -R hadoop:hadoop /usr/local/src/hadoop
Edit yarn-site.xml with the following properties (name, then value).
vim yarn-site.xml
yarn.resourcemanager.ha.enabled           true
yarn.resourcemanager.cluster-id           yrc
yarn.resourcemanager.ha.rm-ids            rm1,rm2
yarn.resourcemanager.hostname.rm1         master
yarn.resourcemanager.hostname.rm2         slave1
yarn.resourcemanager.zk-address           master:2181,slave1:2181,slave2:2181
yarn.nodemanager.aux-services             mapreduce_shuffle
yarn.log-aggregation-enable               true
yarn.log-aggregation.retain-seconds       86400
yarn.resourcemanager.recovery.enabled     true
yarn.resourcemanager.store.class          org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
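In XML form, the yarn-site.xml settings could be produced by a sketch like this one. The helper names are hypothetical; the property names and values are the ones listed in this guide.

```shell
# Print one <property> element (hypothetical helper).
prop() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a complete yarn-site.xml for ResourceManager HA on master and slave1.
yarn_site() {
    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<configuration>'
    prop yarn.resourcemanager.ha.enabled true
    prop yarn.resourcemanager.cluster-id yrc
    prop yarn.resourcemanager.ha.rm-ids rm1,rm2
    prop yarn.resourcemanager.hostname.rm1 master
    prop yarn.resourcemanager.hostname.rm2 slave1
    prop yarn.resourcemanager.zk-address master:2181,slave1:2181,slave2:2181
    prop yarn.nodemanager.aux-services mapreduce_shuffle
    prop yarn.log-aggregation-enable true
    prop yarn.log-aggregation.retain-seconds 86400
    prop yarn.resourcemanager.recovery.enabled true
    prop yarn.resourcemanager.store.class \
        org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
    echo '</configuration>'
}

# Example:
# yarn_site > /usr/local/src/hadoop/etc/hadoop/yarn-site.xml
```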
The common parent directory for NameNode, DataNode, and JournalNode data is /usr/local/src/hadoop/tmp.
Run the following on master:
mkdir -p /usr/local/src/hadoop/tmp/hdfs/nn
mkdir -p /usr/local/src/hadoop/tmp/hdfs/dn
mkdir -p /usr/local/src/hadoop/tmp/hdfs/jn
mkdir -p /usr/local/src/hadoop/tmp/logs
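The four mkdir commands can also be written as a loop, which keeps the directory list in one place. This is only a convenience sketch with a hypothetical function name.

```shell
# Hypothetical helper: create the nn, dn, jn and logs directories under a base path.
# Usage: make_hadoop_dirs <base-dir>
make_hadoop_dirs() {
    base=$1
    for d in hdfs/nn hdfs/dn hdfs/jn logs; do
        mkdir -p "$base/$d" || return 1
    done
}

# Example:
# make_hadoop_dirs /usr/local/src/hadoop/tmp
```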
Distribute the environment variables.
scp -r /etc/profile root@slave1:/etc/
scp -r /etc/profile root@slave2:/etc/
Distribute Hadoop.
scp -r /usr/local/src/hadoop root@slave1:/usr/local/src/
scp -r /usr/local/src/hadoop root@slave2:/usr/local/src/
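Because the same copy is repeated for each node, a loop with a dry-run default can help catch malformed scp arguments before anything is transferred. The sketch below assumes the root account is used on the targets, as in the /etc/profile step; the function name and the RUN switch are hypothetical.

```shell
# Print (or, with RUN=1, execute) one scp command per target host.
# The file or directory is copied to the same parent directory on each host.
# Usage: distribute <path> <host>...
distribute() {
    path=$1
    shift
    dest=$(dirname "$path")/
    for host in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            scp -r "$path" "root@$host:$dest" || return 1
        else
            echo "scp -r $path root@$host:$dest"
        fi
    done
}

# Example:
# distribute /etc/profile slave1 slave2                  # dry run, prints the commands
# RUN=1 distribute /usr/local/src/hadoop slave1 slave2   # performs the copy
```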
Start ZooKeeper on every node in the quorum (master, slave1, slave2).
zkServer.sh start
Check its status.
Start the JournalNode daemons.
Format the NameNode.
hdfs namenode -format
An exit status of 0 means the format succeeded; 1 means it failed.
Note: the JournalNode daemons must be running before you format.
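The guide does not show the command that starts the JournalNodes. In Hadoop 2.x this is commonly done with hadoop-daemons.sh start journalnode, which is an assumption here rather than something the guide states. The sketch below adds a guard that looks for JournalNode in jps output before formatting; has_process is a hypothetical helper.

```shell
# Succeeds if the jps-style listing on stdin ("<pid> <name>" per line)
# contains the given process name.
has_process() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# Intended use on master (commented out because it needs a live cluster):
# hadoop-daemons.sh start journalnode     # assumption: Hadoop 2.x sbin script
# if jps | has_process JournalNode; then
#     hdfs namenode -format
#     echo "format exit status: $?"       # 0 means success, 1 means failure
# else
#     echo "JournalNode is not running; start it before formatting"
# fi
```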
Register the znode in ZooKeeper.
hdfs zkfc -formatZK
Start HDFS.
start-dfs.sh
Start YARN.
start-yarn.sh
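Synchronize the data from master.
Copy the NameNode metadata to the other nodes (run on master). The target nn directory already exists on the other nodes, so copy its contents rather than the directory itself.
scp -r /usr/local/src/hadoop/tmp/hdfs/nn/* slave1:/usr/local/src/hadoop/tmp/hdfs/nn/
scp -r /usr/local/src/hadoop/tmp/hdfs/nn/* slave2:/usr/local/src/hadoop/tmp/hdfs/nn/
On slave1, start the ResourceManager and NameNode processes.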
yarn-daemon.sh start resourcemanager
hadoop-daemon.sh start namenode
Start the web proxy server and the MapReduce job history server.
yarn-daemon.sh start proxyserver
mr-jobhistory-daemon.sh start historyserver
Check the ports and processes on all three machines. The web interfaces are at:
master:50070
slave1:50070
master:8088
HA test
Create a test file.
vim a.txt
Create a directory in HDFS.
hadoop fs -mkdir /input
Upload a.txt to /input.
hadoop fs -put ~/a.txt /input
Change to the directory that contains the examples jar.
cd /usr/local/src/hadoop/share/hadoop/mapreduce/
Run the MapReduce wordcount test.
hadoop jar hadoop-mapreduce-examples-2.7.1.jar wordcount /input/a.txt /output
Check the result in HDFS.
View the job output.
hadoop fs -cat /output/part-r-00000
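To judge whether the job output is right, the same count can be computed locally and compared. The sketch below is a rough local equivalent of the wordcount example (split on whitespace, count, sort by byte order). It should match for plain ASCII input with a single reducer, though that equivalence is an assumption rather than something this guide states. The function name is hypothetical.

```shell
# Local stand-in for the wordcount example: prints "word<TAB>count" lines
# sorted in byte order.
# Usage: local_wordcount <file>
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Example comparison against the job output:
# local_wordcount ~/a.txt > expected.txt
# hadoop fs -cat /output/part-r-00000 | diff - expected.txt
```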
High-availability verification
Automatic failover
Check the state.
Manual failover
On master, stop and then restart the NameNode. Stop it first:
hadoop-daemon.sh stop namenode
Check the state.
hdfs haadmin -getServiceState master
hdfs haadmin -getServiceState slave1
Start the NameNode on master again.
hadoop-daemon.sh start namenode
Check the state.
hdfs haadmin -getServiceState slave1
hdfs haadmin -getServiceState master
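After a failover, a healthy pair should report exactly one active NameNode. A small check like the following can make that explicit; the function name is hypothetical. It only compares the strings printed by hdfs haadmin -getServiceState.

```shell
# Succeeds if exactly one of the given states is "active".
# Usage: exactly_one_active <state>...
exactly_one_active() {
    count=0
    for s in "$@"; do
        [ "$s" = "active" ] && count=$((count + 1))
    done
    [ "$count" -eq 1 ]
}

# Example:
# if exactly_one_active "$(hdfs haadmin -getServiceState master)" \
#                       "$(hdfs haadmin -getServiceState slave1)"; then
#     echo "HA state is healthy"
# else
#     echo "unexpected HA state"
# fi
```

A stopped NameNode prints nothing on standard output, so it counts as not active here.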



