Hadoop Cluster Installation
1. Preparation

1. The firewall on all three machines must be disabled.
2. Make sure the network configuration of all three machines is correct (NAT mode, static IP, hostname).
3. Make sure /etc/hosts maps each IP to its hostname.
4. Make sure passwordless SSH login is configured between the three machines.
5. Make sure the clocks of all machines are synchronized.
6. Configure the JDK and Hadoop environments (JDK installation steps are omitted).

Note: unless stated otherwise, perform every configuration step on hadoop01, hadoop02, and hadoop03.

1.1 Host Allocation
| Node | Memory (GB) | IP | Applications | OS | JDK |
|---|---|---|---|---|---|
| hadoop01 | 4 | 192.168.159.128 | NameNode、DataNode、ResourceManager、NodeManager | CentOS7 | jdk1.8.0_144 |
| hadoop02 | 4 | 192.168.159.129 | SecondaryNameNode、DataNode、NodeManager | CentOS7 | jdk1.8.0_144 |
| hadoop03 | 4 | 192.168.159.130 | DataNode、NodeManager | CentOS7 | jdk1.8.0_144 |
1.2 Disable the Firewall, NetworkManager, and SELinux

```bash
systemctl stop firewalld
systemctl disable firewalld
systemctl stop NetworkManager
systemctl disable NetworkManager
# Disable SELinux permanently (takes effect after a reboot)
vim /etc/selinux/config
# set: SELINUX=disabled
```
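A quick check (an optional extra, not part of the original steps) to confirm the firewall and SELinux state on each node; note that `getenforce` keeps reporting the old mode until the node is rebooted:

```bash
# Expect "inactive" and "disabled" after the steps above
systemctl is-active firewalld
systemctl is-enabled firewalld
# Reports the current SELinux mode; shows "Disabled" only after a reboot,
# or run "setenforce 0" to switch the current session to Permissive
getenforce
```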
1.3 Set Hostnames, Configure hosts, and Configure Static IPs

1.3.1 Set the hostname

```bash
hostnamectl set-hostname hadoop01   # run on 192.168.159.128
hostnamectl set-hostname hadoop02   # run on 192.168.159.129
hostnamectl set-hostname hadoop03   # run on 192.168.159.130
```

1.3.2 Configure hosts
```bash
vim /etc/hosts
```

```
192.168.159.128 hadoop01
192.168.159.129 hadoop02
192.168.159.130 hadoop03
```
Note: it is best to add the same entries to the Windows hosts file (C:\Windows\System32\drivers\etc\hosts) as well.
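To confirm the mappings work, a simple check (an optional extra, not in the original write-up) is to ping each hostname from every node:

```bash
# Each hostname should resolve to the IP configured in /etc/hosts
for h in hadoop01 hadoop02 hadoop03; do
    ping -c 1 "$h"
done
```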
1.3.3 Configure a static IP

```bash
vim /etc/sysconfig/network-scripts/ifcfg-ens33
```
```
ONBOOT="yes"
BOOTPROTO="static"
IPADDR=192.168.159.128
NETMASK=255.255.255.0
GATEWAY=192.168.159.2   # check with: netstat -rn
DNS1=114.114.114.114
```

```bash
systemctl restart network
```
"ifcfg-ens33"为网卡名称,192.168.159.128为当前服务器ip
1.4 Passwordless SSH Login

```bash
# 1. Generate a key pair
ssh-keygen -t rsa
# 2. Copy the public key to all hosts: hadoop01 -> hadoop01, hadoop02, hadoop03;
#    do the same from hadoop02 and hadoop03
ssh-copy-id hadoop01
ssh-copy-id hadoop02
ssh-copy-id hadoop03
# 3. Verify passwordless login from hadoop01 to hadoop02
ssh hadoop02
```
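To avoid typing each `ssh-copy-id` by hand, a small loop like the following can distribute the key; this is just a convenience sketch on top of the steps above:

```bash
# Run on every node after ssh-keygen; prompts once per host for the password
for h in hadoop01 hadoop02 hadoop03; do
    ssh-copy-id "$h"
done
# Verify: should print each hostname without asking for a password
for h in hadoop01 hadoop02 hadoop03; do
    ssh "$h" hostname
done
```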
1.5 Time Synchronization

1.5.1 Install ntp and ntpdate

```bash
yum install ntp -y
yum install ntpdate -y
```

1.5.2 Configure ntp.conf on hadoop01 (no changes needed on hadoop02 or hadoop03)
```bash
vim /etc/ntp.conf
```
```
# Add the following line
restrict 192.168.159.0 mask 255.255.255.0 nomodify notrap
# Comment out all existing "server" lines and add the local clock as the source
server 127.127.1.0
```

1.5.3 Sync hadoop02 and hadoop03 with hadoop01
```bash
# 1. Manual sync
ntpdate -u hadoop01
# 2. Run "crontab -e" and add the following line to sync with hadoop01 every minute
* * * * * /usr/sbin/ntpdate -u hadoop01 > /dev/null 2>&1
```
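A quick way to confirm the clocks agree (an optional check, assuming the passwordless SSH from section 1.4 is working):

```bash
# Print the current time on every node; the timestamps should match
for h in hadoop01 hadoop02 hadoop03; do
    echo -n "$h: "; ssh "$h" date
done
```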
2. Cluster Installation

2.1 Download Files

Store the downloaded files in the /root/softwares directory.
```
https://downloads.apache.org/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz
https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
http://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
https://archive.apache.org/dist/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz
```
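One way to fetch everything in one go (a convenience sketch; the original just lists the URLs):

```bash
mkdir -p /root/softwares && cd /root/softwares
# Download each archive into /root/softwares
for url in \
    https://downloads.apache.org/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz \
    https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz \
    http://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz \
    https://archive.apache.org/dist/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz; do
    wget "$url"
done
```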
2.2 Install Hadoop

```bash
cd /root/softwares
```
2.2.1 Extract and Rename the Hadoop Directory

```bash
# 1. Extract the archive
tar -zxvf hadoop-2.7.6.tar.gz -C /usr/local/
# 2. Rename the directory
cd /usr/local/ && mv hadoop-2.7.6 hadoop
```

2.2.2 Configure and Reload Environment Variables
```bash
vim /etc/profile
```

Append:

```bash
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

```bash
source /etc/profile
```
Note: run "hadoop version" to verify that the configuration took effect.
2.2.3 Copy the Files and Configuration to hadoop02 and hadoop03

Copy the hadoop directory to hadoop02 and hadoop03:
```bash
cd /usr/local
scp -r hadoop/ hadoop02:$PWD
scp -r hadoop/ hadoop03:$PWD
```
Copy the environment configuration to hadoop02 and hadoop03:
```bash
scp /etc/profile hadoop02:/etc/
scp /etc/profile hadoop03:/etc/
# then run "source /etc/profile" on hadoop02 and hadoop03
```
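The same copies can be scripted from hadoop01 (a convenience sketch; note that `source` must still be run in each interactive shell on the remote hosts to affect existing sessions):

```bash
for h in hadoop02 hadoop03; do
    scp -r /usr/local/hadoop "$h":/usr/local/
    scp /etc/profile "$h":/etc/
    # quick sanity check: the copied profile should define HADOOP_HOME
    ssh "$h" "source /etc/profile && echo \$HADOOP_HOME"
done
```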
2.3 Cluster Configuration Files

All of the following files live in $HADOOP_HOME/etc/hadoop.
Files to configure (the corresponding *-default.xml files can be found under $HADOOP_HOME/share):

- hadoop-env.sh
- yarn-env.sh
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml

Note: configuration precedence is settings in code > *-site.xml > *-default.xml.

2.3.1 core-site.xml
```xml
<configuration>
    <!-- Default file system: the NameNode RPC address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop01:8020</value>
    </property>
    <!-- Base directory for Hadoop's local data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
    </property>
</configuration>
```

2.3.2 hdfs-site.xml

```xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/data</value>
    </property>
    <!-- Number of block replicas -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Block size: 128 MB -->
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop02:50090</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop01:50070</value>
    </property>
</configuration>
```

2.3.3 mapred-site.xml

```bash
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
```

```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop01:19888</value>
    </property>
</configuration>
```

2.3.4 yarn-site.xml

```xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop01:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop01:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop01:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop01:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop01:8088</value>
    </property>
</configuration>
```
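Once the XML files are in place, `hdfs getconf` (a standard Hadoop CLI) can confirm the values are actually being picked up; shown here for two of the keys above:

```bash
# Should print hdfs://hadoop01:8020
hdfs getconf -confKey fs.defaultFS
# Should print 3
hdfs getconf -confKey dfs.replication
```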
2.3.5 hadoop-env.sh

```bash
vim hadoop-env.sh
```
Set the JDK path (line 25):
```bash
export JAVA_HOME=/usr/local/java/jdk1.8.0_144
```

2.3.6 yarn-env.sh
```bash
vim yarn-env.sh
```
Set the JDK path (line 26):
```bash
JAVA_HOME=/usr/local/java/jdk1.8.0_144
```

2.3.7 slaves
```bash
vim slaves
```
This file lists the hostnames of the nodes that run the DataNode daemon.
```
hadoop01
hadoop02
hadoop03
```

2.3.8 Copy the Configuration Files to hadoop02 and hadoop03
Copy $HADOOP_HOME/etc/hadoop to hadoop02 and hadoop03:
```bash
cd $HADOOP_HOME/etc
scp -r hadoop/ hadoop02:$PWD
scp -r hadoop/ hadoop03:$PWD
```
2.4 Start the Cluster

2.4.1 Format the NameNode on hadoop01

```bash
hdfs namenode -format
```
If formatting fails, the NameNode may have been formatted before; try emptying the directory that the hadoop.tmp.dir property in core-site.xml points to.
2.4.2 Start and Stop the Cluster

```bash
# start / stop everything
start-all.sh
stop-all.sh
# or start / stop HDFS and YARN separately
start-dfs.sh && start-yarn.sh
stop-dfs.sh && stop-yarn.sh
```
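After starting, `jps` on each node should show the daemons from the host-allocation table in section 1.1 (NameNode, DataNode, ResourceManager, and NodeManager on hadoop01; SecondaryNameNode, DataNode, and NodeManager on hadoop02; DataNode and NodeManager on hadoop03):

```bash
# List the running Java daemons on every node
for h in hadoop01 hadoop02 hadoop03; do
    echo "=== $h ==="; ssh "$h" jps
done
```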
2.4.3 HDFS Web UI

http://hadoop01:50070 (set via dfs.namenode.http-address in hdfs-site.xml)
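As a final smoke test (an extra step, not in the original text), writing a file into HDFS confirms the NameNode and DataNodes are talking to each other:

```bash
# Create a directory, upload a file, then list it
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/hosts /test/
hdfs dfs -ls /test
```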



