Contents

- Virtual machine preparation
  - Security settings: firewall commands, disabling SELinux
  - IP settings: checking the machine IP, changing the hostname, mapping the IP to the hostname
  - Passwordless SSH login
- Hadoop pseudo-distributed setup
  - JDK setup: unpacking, configuring environment variables
  - Hadoop setup: unpacking, editing the configuration files, configuring the Hadoop environment variables, verifying the environment variables, formatting the NameNode, start/stop commands, viewing the web UI
- Notes for Hadoop 3.0 and later

Virtual Machine Preparation
This guide is based on a CentOS 7 system.
Related resources can be downloaded here:
Link: https://pan.baidu.com/s/1FW228OfyURxEgnXW0qqpmA  Password: 18uc
Security Settings

# Check the firewall status
[root@localhost ~]# firewall-cmd --state
# Stop the firewall
[root@localhost ~]# systemctl stop firewalld.service
# Keep the firewall from starting at boot
[root@localhost ~]# systemctl disable firewalld.service

Disable SELinux
[root@localhost ~]# vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled (the file change takes effect at the next boot).
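If you prefer a non-interactive route, the same change can be made from the shell. setenforce 0 additionally switches SELinux to permissive for the current session, so no immediate reboot is needed (a sketch, equivalent to the manual edit above):

# Switch off enforcement for the running session
[root@localhost ~]# setenforce 0
# Persist the change across reboots
[root@localhost ~]# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config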
IP Settings

Check the machine IP:
[root@localhost ~]# ifconfig
The IP here is 192.168.78.100.

Change the hostname:
[root@localhost ~]# vi /etc/hostname

Map the IP to the hostname:
[root@localhost ~]# vi /etc/hosts
192.168.78.100 CentOS
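If you would rather not edit the files by hand, hostnamectl makes the hostname change take effect immediately, and ping can confirm the new mapping resolves (an optional check using the names from above):

# Set the hostname without opening an editor
[root@localhost ~]# hostnamectl set-hostname CentOS
# The mapping in /etc/hosts should now resolve locally
[root@localhost ~]# ping -c 1 CentOS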
Passwordless SSH Login

# Generate a key pair (press Enter three times)
[root@localhost ~]# ssh-keygen -t rsa
# Copy the public key to every machine that needs to log in to this host;
# there is only one machine here, so send it to yourself
[root@localhost ~]# ssh-copy-id root@CentOS
# Test ssh
[root@localhost ~]# ssh root@CentOS
Hadoop Pseudo-Distributed Setup

Create the install folder:
[root@localhost ~]# mkdir /opt/install/
JDK Setup

JDK 8 is used here.
Unpack the archive:
[root@localhost ~]# tar -zxvf jdk-8u144-linux-x64.tar.gz -C /opt/install/

Configure the environment variables:
[root@localhost jdk1.8.0_144]# vi /etc/profile
# Append the following at the end of the file
export JAVA_HOME=/opt/install/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin
# Save, then reload the environment variables
[root@localhost jdk1.8.0_144]# source /etc/profile
# Verify that the JDK is installed correctly
[root@localhost jdk1.8.0_144]# java -version
On success, the JDK version information is printed.
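The exact output depends on the build, but for JDK 8u144 it should look roughly like this (illustrative, not captured from a live session):

java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)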
Hadoop Setup

Unpack the archive:
[root@localhost ~]# tar -zxvf hadoop-2.9.2.tar.gz -C /opt/install/

Edit the configuration files:
[root@localhost ~]# cd /opt/install/hadoop-2.9.2/etc/hadoop
hadoop-env.sh
export JAVA_HOME=/opt/install/jdk1.8.0_144
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://CentOS:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/install/hadoop-2.9.2/data/tmp</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>CentOS:50070</value>
  </property>
</configuration>
(Note: with a single DataNode a replication factor of 1 is more typical; blocks cannot actually be replicated three times on one node.)
mapred-site.xml
First create mapred-site.xml from the template:
[root@localhost hadoop]# cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
slaves
This file lists the hostnames of the DataNodes; in a pseudo-distributed setup the NameNode host also acts as a DataNode. (In Hadoop 3.x this file is named workers instead of slaves.)
CentOS
Configure the Hadoop environment variables:

[root@localhost hadoop-2.9.2]# vim /etc/profile
# Add:
export HADOOP_HOME=/opt/install/hadoop-2.9.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Reload the environment variables
[root@localhost hadoop-2.9.2]# source /etc/profile

Verify that the environment variables are set correctly:
[root@localhost hadoop-2.9.2]# hadoop version
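If the PATH is set up correctly, the first line of output names the release (illustrative; the build detail lines that follow vary):

Hadoop 2.9.2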
Format the NameNode

Purpose: formatting initializes the HDFS filesystem and creates the directories that will hold the data blocks.
[root@localhost hadoop-2.9.2]# hadoop namenode -format
On success, the log reports that the storage directory (here under ${hadoop.tmp.dir}/dfs/name) has been successfully formatted. (On 2.x the command also prints a deprecation notice; hdfs namenode -format is the newer form and behaves the same.)
Hadoop Start/Stop Commands

[root@localhost hadoop-2.9.2]# start-all.sh
[root@localhost hadoop-2.9.2]# stop-all.sh
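Note that start-all.sh is a convenience wrapper that 2.x already marks as deprecated; it simply invokes the two scripts below, which can also be run separately:

[root@localhost hadoop-2.9.2]# start-dfs.sh
[root@localhost hadoop-2.9.2]# start-yarn.sh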
After a successful start, check the running Java processes with jps:

[root@localhost hadoop-2.9.2]# jps
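With everything up, the listing should contain roughly these entries (the PIDs below are illustrative, not from a live run):

2113 NameNode
2245 DataNode
2430 SecondaryNameNode
2601 ResourceManager
2712 NodeManager
2850 Jps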
View the Web UI

HDFS: http://CentOS:50070
YARN: http://CentOS:8088
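If no browser is available inside the VM, a quick reachability check can be done from the shell (optional; it only confirms the NameNode web server answers, with a status line of HTTP/1.1 200 OK):

[root@localhost ~]# curl -sI http://CentOS:50070 | head -n 1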
For Hadoop 3.0 and Later

Hadoop 3.0 and later add user-identity checks at start-up. With the configuration above, starting the cluster as root aborts with errors like the following:
Starting namenodes on [namenode]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [datanode1]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
Starting resourcemanager
ERROR: Attempting to operate on yarn resourcemanager as root
ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting operation.
Starting nodemanagers
ERROR: Attempting to operate on yarn nodemanager as root
ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting operation.
The fix is a small change to the start/stop scripts in Hadoop's sbin directory.
Add the following to start-dfs.sh and stop-dfs.sh:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Add the following to start-yarn.sh and stop-yarn.sh:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Save the changes and restart the cluster; the errors should be gone.
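An equivalent fix, if you would rather not edit the sbin scripts (an alternative approach, not part of the steps above): declare the same start-up users once in etc/hadoop/hadoop-env.sh, which every start/stop script sources:

# In etc/hadoop/hadoop-env.sh of the Hadoop 3 installation
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root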



