
Setting Up a Fully Distributed Hadoop Cluster

  • 1. Preparation
    • 1.1. Software versions
    • 1.2. Cluster plan
  • 2. Environment setup
    • 1. Change the hostnames
    • 2. Disable the firewall
    • 3. Edit the hosts file
    • 4. Configure passwordless SSH login
    • 5. Install the JDK
    • 6. Install Hadoop
      • 1. Unpack
      • 2. Add Hadoop to the environment variables (vi /etc/profile)
      • 3. Copy profile to the other nodes, then source it to take effect
      • 4. Create the HDFS storage directories
      • 5. Edit /hadoop-2.9.2/etc/hadoop/hadoop-env.sh, setting JAVA_HOME to the actual path
      • 6. Edit /hadoop-2.9.2/etc/hadoop/yarn-env.sh, setting JAVA_HOME to the actual path
      • 7. Configure /hadoop-2.9.2/etc/hadoop/core-site.xml
      • 8. Configure /hadoop-2.9.2/etc/hadoop/hdfs-site.xml
      • 9. Configure /hadoop-2.9.2/etc/hadoop/mapred-site.xml
      • 10. Configure /hadoop-2.9.2/etc/hadoop/yarn-site.xml
      • 11. Configure /hadoop-2.9.2/etc/hadoop/slaves
      • 12. Send Hadoop to the other nodes
      • 13. Format the NameNode
      • 14. Start Hadoop
      • 15. Access the web UIs
      • 16. Run an example

1. Preparation

1.1. Software versions

jdk: 1.8

hadoop: 2.9.2

OS: CentOS 7

All installation packages are kept under /usr/local/src

1.2. Cluster plan

No.  Hostname  IP address     Roles
1    master    192.168.1.101  NameNode, SecondaryNameNode, ResourceManager
2    slave1    192.168.1.102  NodeManager, DataNode
3    slave2    192.168.1.103  NodeManager, DataNode
2. Environment setup

1. Change the hostnames

Run the corresponding command on each of the three nodes:

hostnamectl set-hostname master
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
2. Disable the firewall

The firewall must be disabled on every node in the cluster:

systemctl stop firewalld 
systemctl disable firewalld
3. Edit the hosts file

vi /etc/hosts

Make sure hosts contains the following entries (the two localhost lines are usually there already; add the three node mappings):

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.101 master
192.168.1.102 slave1
192.168.1.103 slave2                              

Copy the hosts file to the other nodes (you will need to type yes and then enter the target node's root password):

 scp /etc/hosts root@slave1:/etc/
 scp /etc/hosts root@slave2:/etc/
4. Configure passwordless SSH login

Generate the public/private key pair:

ssh-keygen -t rsa

Press Enter at each prompt; output like the following appears:

Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:0f4Tz5jw1zbR3t9j8RH1bOcwhg1BZwC7jkf1sUDfQTM root@master
The key's randomart image is:
+---[RSA 2048]----+
|          .o=o+E |
|         . o.+ .=|
|        . o o+o.+|
|         o o.o=+*|
|        S = ..o*+|
|         + + * =+|
|        . o * +.O|
|         .   o +=|
|              . +|
+----[SHA256]-----+

Copy the public key to every machine you want to log in to without a password:

ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2

Test it:

[root@master src]# ssh slave1
Last login: Wed Nov 10 15:34:09 2021 from 192.168.1.17
[root@slave1 ~]# 
5. Install the JDK

Unpack the archive and rename the directory:

tar -xvf jdk-8u261-linux-x64.tar.gz
mv jdk1.8.0_261 jdk1.8

Append the environment variables:

vi /etc/profile

Add at the end of the file:

# java environment
export JAVA_HOME=/usr/local/src/jdk1.8  # path where the JDK was unpacked
export PATH=$PATH:$JAVA_HOME/bin

Apply the changes:

source /etc/profile

Verify the installation:

[root@master src]# java -version
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)

Copy the JDK and profile to the other nodes:

scp -r /usr/local/src/jdk1.8 root@slave1:/usr/local/src/
scp -r /usr/local/src/jdk1.8 root@slave2:/usr/local/src/
scp /etc/profile root@slave1:/etc/
scp /etc/profile root@slave2:/etc/

On each of the other nodes, run source /etc/profile to apply the environment.

6. Install Hadoop

1. Unpack
tar -zxvf hadoop-2.9.2.tar.gz 
2. Add Hadoop to the environment variables (vi /etc/profile)
# hadoop environment
export HADOOP_HOME=/usr/local/src/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
3. Copy profile to the other nodes, then run source /etc/profile on each to take effect
scp /etc/profile root@slave1:/etc/
scp /etc/profile root@slave2:/etc/
4. Create the HDFS storage directories

(Note: hadoop-2.9.2 sits under /usr/local/src/.)

/hadoop-2.9.2/hdfs/name -- NameNode metadata
/hadoop-2.9.2/hdfs/data -- data blocks
/hadoop-2.9.2/hdfs/tmp -- temporary files

cd /usr/local/src/hadoop-2.9.2
mkdir hdfs
cd hdfs
mkdir name data tmp
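The mkdir sequence above can be collapsed into a single `mkdir -p` call. A sketch: HADOOP_DIR falls back to a temp directory here so it runs anywhere, but on the cluster it should be /usr/local/src/hadoop-2.9.2.

```shell
# -p creates the hdfs parent as needed and ignores directories that already exist.
HADOOP_DIR=${HADOOP_DIR:-$(mktemp -d)}   # on the cluster: /usr/local/src/hadoop-2.9.2
mkdir -p "$HADOOP_DIR"/hdfs/name "$HADOOP_DIR"/hdfs/data "$HADOOP_DIR"/hdfs/tmp
ls "$HADOOP_DIR"/hdfs   # lists: data  name  tmp
```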
5. Edit /hadoop-2.9.2/etc/hadoop/hadoop-env.sh, setting JAVA_HOME to the actual path
cd /usr/local/src/hadoop-2.9.2/etc/hadoop/

vi hadoop-env.sh 

Comment out the original line and set the real path:

# The java implementation to use.
# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/local/src/jdk1.8
6. Edit /hadoop-2.9.2/etc/hadoop/yarn-env.sh, setting JAVA_HOME to the actual path
vi yarn-env.sh 

Add the real path below the original commented line:

# some Java parameters
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/local/src/jdk1.8
7. Configure /hadoop-2.9.2/etc/hadoop/core-site.xml
vi core-site.xml

Add inside the <configuration> block:


        <!-- temporary storage directory -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/usr/local/src/hadoop-2.9.2/hdfs/tmp</value>
        </property>
        <!-- HDFS filesystem address and port -->
        <property>
                <name>fs.default.name</name>
                <value>hdfs://master:9000</value>
        </property>

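A quick grep can confirm the values actually landed in the file. This self-contained sketch writes a sample core-site.xml into a temp directory; on the cluster, set CONF to /usr/local/src/hadoop-2.9.2/etc/hadoop and skip the heredoc:

```shell
CONF=$(mktemp -d)   # stand-in for /usr/local/src/hadoop-2.9.2/etc/hadoop
cat > "$CONF/core-site.xml" <<'EOF'
<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://master:9000</value>
        </property>
</configuration>
EOF
# Complain if the NameNode address is missing or wrong.
if grep -q '<value>hdfs://master:9000</value>' "$CONF/core-site.xml"; then
        echo "core-site.xml OK"
else
        echo "core-site.xml misconfigured" >&2
fi
```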
8. Configure /hadoop-2.9.2/etc/hadoop/hdfs-site.xml
vi hdfs-site.xml 

Add inside the <configuration> block:


        <!-- number of data replicas -->
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <!-- NameNode storage directory -->
        <property>
                <name>dfs.name.dir</name>
                <value>/usr/local/src/hadoop-2.9.2/hdfs/name</value>
        </property>
        <!-- data storage directory -->
        <property>
                <name>dfs.data.dir</name>
                <value>/usr/local/src/hadoop-2.9.2/hdfs/data</value>
        </property>
        <!-- disable HDFS file permission checks on upload -->
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>

9. Configure /hadoop-2.9.2/etc/hadoop/mapred-site.xml

Create the file by copying the bundled template:

cp mapred-site.xml.template mapred-site.xml

vi mapred-site.xml

Add inside the <configuration> block:


        <!-- run MapReduce on the YARN platform -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

10. Configure /hadoop-2.9.2/etc/hadoop/yarn-site.xml
 vi yarn-site.xml 

Add inside the <configuration> block:




        <!-- ResourceManager address -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master</value>
        </property>
        <!-- how reducers fetch data -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <!-- skip the virtual-memory check -->
        <property>
                <name>yarn.nodemanager.vmem-check-enabled</name>
                <value>false</value>
        </property>



11. Configure /hadoop-2.9.2/etc/hadoop/slaves
vi slaves

Delete the existing contents and add the following:

slave1
slave2
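Equivalently, the slaves file can be written in one command rather than edited with vi. A sketch run in a temp directory so it is self-contained; on the cluster, run it inside /usr/local/src/hadoop-2.9.2/etc/hadoop:

```shell
cd "$(mktemp -d)"                      # stand-in for the etc/hadoop directory
printf '%s\n' slave1 slave2 > slaves   # overwrites any previous contents
cat slaves
```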
12. Send Hadoop to the other nodes
cd /usr/local/src/

scp -r hadoop-2.9.2 root@slave1:$PWD    # $PWD expands to the absolute path of the current directory
scp -r hadoop-2.9.2 root@slave2:$PWD
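The two scp lines generalize to a loop over the worker list. A dry-run sketch (DRY_RUN=1, the default here, only prints the commands so they can be reviewed before anything is copied):

```shell
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to actually copy
for node in slave1 slave2; do
        cmd="scp -r hadoop-2.9.2 root@$node:$PWD"
        if [ "$DRY_RUN" = "1" ]; then
                echo "$cmd"   # preview only
        else
                $cmd
        fi
done
```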
13. Format the NameNode
hadoop namenode -format

Formatting succeeded if the output contains a line ending in "has been successfully formatted." for the name directory.

14. Start Hadoop
start-all.sh

Check the processes on each node:

jps # a JDK command

master

[root@master src]# jps
15636 NameNode
17014 Jps
16493 ResourceManager
16255 SecondaryNameNode

slave1, slave2

[root@slave1 src]# jps
14134 NodeManager
15739 Jps
13565 DataNode
15. Access the web UIs

HDFS web UI: http://192.168.1.101:50070

YARN web UI: http://192.168.1.101:8088

16. Run an example

cd hadoop-2.9.2/share/hadoop/mapreduce/
[root@master mapreduce]# hadoop jar hadoop-mapreduce-examples-2.9.2.jar pi 5 10
Number of Maps  = 5
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Starting Job
21/11/12 10:57:01 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.101:8032
21/11/12 10:57:01 INFO input.FileInputFormat: Total input files to process : 5
21/11/12 10:57:01 INFO mapreduce.JobSubmitter: number of splits:5
21/11/12 10:57:01 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
21/11/12 10:57:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1636685278166_0001
21/11/12 10:57:02 INFO impl.YarnClientImpl: Submitted application application_1636685278166_0001
21/11/12 10:57:02 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1636685278166_0001/
21/11/12 10:57:02 INFO mapreduce.Job: Running job: job_1636685278166_0001
21/11/12 10:57:08 INFO mapreduce.Job: Job job_1636685278166_0001 running in uber mode : false
21/11/12 10:57:08 INFO mapreduce.Job:  map 0% reduce 0%
21/11/12 10:57:19 INFO mapreduce.Job:  map 100% reduce 0%
21/11/12 10:57:24 INFO mapreduce.Job:  map 100% reduce 100%
21/11/12 10:57:24 INFO mapreduce.Job: Job job_1636685278166_0001 completed successfully
21/11/12 10:57:24 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=116
                FILE: Number of bytes written=1192839
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1300
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=23
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters 
                Launched map tasks=5
                Launched reduce tasks=1
                Data-local map tasks=5
                Total time spent by all maps in occupied slots (ms)=42055
                Total time spent by all reduces in occupied slots (ms)=2317
                Total time spent by all map tasks (ms)=42055
                Total time spent by all reduce tasks (ms)=2317
                Total vcore-milliseconds taken by all map tasks=42055
                Total vcore-milliseconds taken by all reduce tasks=2317
                Total megabyte-milliseconds taken by all map tasks=43064320
                Total megabyte-milliseconds taken by all reduce tasks=2372608
        Map-Reduce framework
                Map input records=5
                Map output records=10
                Map output bytes=90
                Map output materialized bytes=140
                Input split bytes=710
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=140
                Reduce input records=10
                Reduce output records=0
                Spilled Records=20
                Shuffled Maps =5
                Failed Shuffles=0
                Merged Map outputs=5
                GC time elapsed (ms)=4892
                CPU time spent (ms)=2690
                Physical memory (bytes) snapshot=1675964416
                Virtual memory (bytes) snapshot=12723679232
                Total committed heap usage (bytes)=1073741824
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=590
        File Output Format Counters 
                Bytes Written=97
Job Finished in 23.816 seconds
Estimated value of Pi is 3.28000000000000000000
[root@master mapreduce]# 
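The estimate above (3.28 from just 5 maps × 10 samples) shows how coarse the sampling is. The dartboard idea behind the example can be sketched in plain awk; this is our toy stand-in, not Hadoop's actual quasi-Monte Carlo implementation:

```shell
# Throw n random points at the unit square; the fraction landing inside the
# quarter circle x^2 + y^2 <= 1 tends to pi/4, so 4*inside/n estimates pi.
awk 'BEGIN {
        srand(1); n = 200000; inside = 0
        for (i = 0; i < n; i++) {
                x = rand(); y = rand()
                if (x*x + y*y <= 1) inside++
        }
        printf "%.3f\n", 4 * inside / n   # prints an estimate near 3.14
}'
```

With 200,000 samples the estimate is typically within a few thousandths of pi, which is why the 50-sample Hadoop run lands as far away as 3.28.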
