The NameNode process has died and its stored data has also been lost. How do we recover the NameNode?
6.1.2. Fault simulation
(1) kill -9 the NameNode process
[atguigu@hadoop102 current]$ kill -9 19886
(2) Delete the data stored by the NameNode (/opt/module/hadoop-3.1.3/data/dfs/name)
[atguigu@hadoop102 hadoop-3.1.3]$ rm -rf /opt/module/hadoop-3.1.3/data/dfs/name/*
6.1.3. Solution
(1) Copy the SecondaryNameNode's data into the original NameNode data directory
[atguigu@hadoop102 dfs]$ scp -r atguigu@hadoop104:/opt/module/hadoop-3.1.3/data/dfs/namesecondary/* ./name/
(2) Restart the NameNode
[atguigu@hadoop102 hadoop-3.1.3]$ hdfs --daemon start namenode
(3) Upload a file to the cluster; the NameNode is back to normal. A small amount of data may still be lost, so in production NameNode High Availability (HA) is normally used rather than relying on the SecondaryNameNode.
6.2. Cluster Safe Mode & Disk Repair
6.2.1. Safe mode: the file system accepts only read requests; it rejects change requests such as deletes and modifications
6.2.2. Scenarios that trigger safe mode: the NameNode is in safe mode while loading the fsimage and edit log, and while receiving DataNode registrations
6.2.3. Conditions for leaving safe mode
dfs.namenode.safemode.min.datanodes: minimum number of available DataNodes, default 0
dfs.namenode.safemode.threshold-pct: fraction of blocks whose replica count meets the minimum requirement, out of all blocks in the system, default 0.999f (i.e., only about one block in a thousand may be missing)
dfs.namenode.safemode.extension: stabilization time, default 30000 ms, i.e., 30 s
6.2.4. Basic syntax
While the cluster is in safe mode, important (write) operations cannot be performed. After cluster startup completes, safe mode is exited automatically.
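A minimal sketch of the exit condition these parameters imply (hypothetical helper function for illustration, not HDFS source code):

```python
# Sketch of the safe-mode exit check implied by the dfs.namenode.safemode.*
# parameters. Hypothetical helper for illustration only, not HDFS source code.
def can_leave_safemode(blocks_ok, blocks_total, live_datanodes,
                       threshold_pct=0.999, min_datanodes=0):
    """Return True when the NameNode may leave safe mode
    (ignoring the 30 s dfs.namenode.safemode.extension window)."""
    if live_datanodes < min_datanodes:
        return False
    if blocks_total == 0:          # empty namespace: nothing to wait for
        return True
    return blocks_ok / blocks_total >= threshold_pct

# 1000 blocks, 1 missing: 999/1000 = 0.999 >= 0.999 -> may leave
print(can_leave_safemode(999, 1000, 3))   # True
# 2 of 1000 blocks missing: 0.998 < 0.999 -> stays in safe mode
print(can_leave_safemode(998, 1000, 3))   # False
```

This is why, in the disk-repair case below, deleting just two blocks is enough to keep the cluster in safe mode.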
(1) bin/hdfs dfsadmin -safemode get    (check the safe mode state)
(2) bin/hdfs dfsadmin -safemode enter  (enter safe mode)
(3) bin/hdfs dfsadmin -safemode leave  (leave safe mode)
(4) bin/hdfs dfsadmin -safemode wait   (wait until safe mode is off)
6.2.5. Case 1: starting the cluster enters safe mode
(1) Restart the cluster
```shell
[atguigu@hadoop102 subdir0]$ myhadoop.sh stop
[atguigu@hadoop102 subdir0]$ myhadoop.sh start
```
(2) Immediately after the cluster starts, try to delete data on it; HDFS reports that the cluster is in safe mode
6.2.6. Case 2: disk repair
Requirement: data blocks are corrupted and the cluster has entered safe mode. How do we handle it?
(1) On hadoop102, hadoop103, and hadoop104, go to the /opt/module/hadoop-3.1.3/data/dfs/data/current/BP-1015489500-192.168.10.102-1611909480872/current/finalized/subdir0/subdir0 directory and delete the same 2 blocks on every node
[atguigu@hadoop102 subdir0]$ pwd
/opt/module/hadoop-3.1.3/data/dfs/data/current/BP-1015489500-192.168.10.102-1611909480872/current/finalized/subdir0/subdir0
[atguigu@hadoop102 subdir0]$ rm -rf blk_1073741847 blk_1073741847_1023.meta
[atguigu@hadoop102 subdir0]$ rm -rf blk_1073741865 blk_1073741865_1042.meta
Note: repeat the commands above on hadoop103 and hadoop104.
(2) Restart the cluster
[atguigu@hadoop102 subdir0]$ myhadoop.sh stop
[atguigu@hadoop102 subdir0]$ myhadoop.sh start
(3) Check http://hadoop102:9870/dfshealth.html#tab-overview
Note: safe mode is on because the number of available blocks has not reached the required threshold.
(4) Leave safe mode
[atguigu@hadoop102 subdir0]$ hdfs dfsadmin -safemode get
Safe mode is ON
[atguigu@hadoop102 subdir0]$ hdfs dfsadmin -safemode leave
Safe mode is OFF
(5) Check http://hadoop102:9870/dfshealth.html#tab-overview again
(6) Delete the metadata of the lost files
(7) Check http://hadoop102:9870/dfshealth.html#tab-overview; the cluster is back to normal
Requirement: simulate waiting for safe mode to turn off
(1) Check the current mode
[atguigu@hadoop102 hadoop-3.1.3]$ hdfs dfsadmin -safemode get
Safe mode is OFF
(2) First enter safe mode
[atguigu@hadoop102 hadoop-3.1.3]$ bin/hdfs dfsadmin -safemode enter
(3) Create and run the following script
Under /opt/module/hadoop-3.1.3, create a script called safemode.sh
[atguigu@hadoop102 hadoop-3.1.3]$ vim safemode.sh
#!/bin/bash
hdfs dfsadmin -safemode wait
hdfs dfs -put /opt/module/hadoop-3.1.3/README.txt /
[atguigu@hadoop102 hadoop-3.1.3]$ chmod 777 safemode.sh
[atguigu@hadoop102 hadoop-3.1.3]$ ./safemode.sh
(4) Open another terminal window and run
[atguigu@hadoop102 hadoop-3.1.3]$ bin/hdfs dfsadmin -safemode leave
(5) Go back to the previous window; it now prints
Safe mode is OFF
(6) The uploaded file is now on the HDFS cluster
6.3. Slow Disk Monitoring
A "slow disk" is a disk that writes data very slowly. Slow disks are actually not rare: as a machine runs for a long time and hosts more and more tasks, disk read/write performance naturally degrades, and in severe cases writes become noticeably delayed.
6.3.1. How to detect a slow disk?
Creating a directory on HDFS normally takes well under 1 s. If creating a directory occasionally takes a minute or more (not every time, just intermittently), a slow disk is very likely the cause. The following methods can locate which disk is slow:
6.3.1.1. Via the heartbeat last-contact time. A slow disk usually affects the heartbeat between the DataNode and the NameNode. The normal heartbeat interval is 3 s; anything over 3 s indicates a problem.
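The heartbeat check can be sketched like this (hypothetical monitoring snippet; in a real cluster the last-contact values come from the NameNode web UI's DataNode tab or JMX metrics, and the function name is my own):

```python
# Flag DataNodes whose last heartbeat is older than the normal 3 s interval.
# Hypothetical monitoring sketch; in practice the last-contact ages come from
# the NameNode web UI (http://hadoop102:9870) or its JMX metrics.
HEARTBEAT_INTERVAL_S = 3

def find_suspect_datanodes(last_contact_by_node, threshold_s=HEARTBEAT_INTERVAL_S):
    """Return nodes whose last-contact age exceeds the heartbeat interval."""
    return sorted(node for node, age in last_contact_by_node.items()
                  if age > threshold_s)

last_contact = {"hadoop102": 1, "hadoop103": 2, "hadoop104": 11}
print(find_suspect_datanodes(last_contact))   # ['hadoop104']
```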
6.3.1.2. Benchmark disk read/write performance with fio
(1) Sequential read test
[atguigu@hadoop102 ~]# sudo yum install -y fio
[atguigu@hadoop102 ~]# sudo fio -filename=/home/atguigu/test.log -direct=1 -iodepth 1 -thread -rw=read -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=test_r
Run status group 0 (all jobs):
   READ: bw=360MiB/s (378MB/s), 360MiB/s-360MiB/s (378MB/s-378MB/s), io=20.0GiB (21.5GB), run=56885-56885msec
The result shows an overall sequential read speed of 360 MiB/s.
(2) Sequential write test
[atguigu@hadoop102 ~]# sudo fio -filename=/home/atguigu/test.log -direct=1 -iodepth 1 -thread -rw=write -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=test_w
Run status group 0 (all jobs):
   WRITE: bw=341MiB/s (357MB/s), 341MiB/s-341MiB/s (357MB/s-357MB/s), io=19.0GiB (21.4GB), run=60001-60001msec
The result shows an overall sequential write speed of 341 MiB/s.
(3) Random write test
[atguigu@hadoop102 ~]# sudo fio -filename=/home/atguigu/test.log -direct=1 -iodepth 1 -thread -rw=randwrite -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=test_randw
Run status group 0 (all jobs):
   WRITE: bw=309MiB/s (324MB/s), 309MiB/s-309MiB/s (324MB/s-324MB/s), io=18.1GiB (19.4GB), run=60001-60001msec
The result shows an overall random write speed of 309 MiB/s.
(4) Mixed random read/write test:
[atguigu@hadoop102 ~]# sudo fio -filename=/home/atguigu/test.log -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=test_r_w -ioscheduler=noop
Run status group 0 (all jobs):
   READ: bw=220MiB/s (231MB/s), 220MiB/s-220MiB/s (231MB/s-231MB/s), io=12.9GiB (13.9GB), run=60001-60001msec
   WRITE: bw=94.6MiB/s (99.2MB/s), 94.6MiB/s-94.6MiB/s (99.2MB/s-99.2MB/s), io=5674MiB (5950MB), run=60001-60001msec
The result shows mixed random read/write speeds of 220 MiB/s for reads and 94.6 MiB/s for writes.
6.4. Small File Archiving
6.4.1. Drawbacks of storing small files on HDFS
Every file is stored in blocks, and each block's metadata lives in the NameNode's memory, so storing many small files on HDFS is very inefficient: a large number of small files will consume most of the NameNode's memory.
Note, however, that the disk space needed to store a small file is unrelated to the block size. For example, a 1 MB file stored with a 128 MB block size uses 1 MB of disk space, not 128 MB.
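The point above can be sketched with simple arithmetic (illustrative helper, names are my own; assumes the default 128 MB block size):

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024   # default HDFS block size, 128 MB

def hdfs_usage(file_size_bytes, replication=1):
    """Disk space used and number of blocks for one file.
    Disk usage tracks the actual file size, not the block size."""
    blocks = max(1, math.ceil(file_size_bytes / BLOCK_SIZE))
    return file_size_bytes * replication, blocks

# A 1 MB file occupies 1 MB of disk per replica, not 128 MB,
# but it still costs a full block's worth of NameNode metadata.
disk, blocks = hdfs_usage(1 * 1024 * 1024)
print(disk, blocks)   # 1048576 1
```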
An HDFS archive file, or HAR file, is a more efficient file-archiving tool: it packs files into HDFS blocks, reducing NameNode memory usage while still allowing transparent access to the files. Concretely, a HAR file still appears as individual files to clients, but to the NameNode it is a single unit, which reduces the NameNode's memory footprint.
(1) Start the YARN processes (hadoop archive runs as a MapReduce job)
[atguigu@hadoop102 hadoop-3.1.3]$ start-yarn.sh
(2) Archive the files
Archive all files in the /input directory into a single archive called input.har, and store the archive under the /output path.
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop archive -archiveName input.har -p /input /output
(3) Inspect the archive
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop fs -ls /output/input.har
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop fs -ls har:///output/input.har
(4) Extract files from the archive
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop fs -cp har:///output/input.har/* /
7. HDFS—Cluster Migration
7.1. Copying Data Between Apache Hadoop Clusters
1) Use scp to copy files between two remote hosts
scp -r hello.txt root@hadoop103:/user/atguigu/hello.txt    # push
scp -r root@hadoop103:/user/atguigu/hello.txt hello.txt    # pull
scp -r root@hadoop103:/user/atguigu/hello.txt root@hadoop104:/user/atguigu
# The third form copies between two remote hosts by relaying through the local
# host; it can be used when SSH is not configured between the two remote hosts.
2) Use the distcp command for recursive data copying between two Hadoop clusters (data is copied between the two NameNodes' clusters)
[atguigu@hadoop102 hadoop-3.1.3]$ bin/hadoop distcp hdfs://hadoop102:8020/user/atguigu/hello.txt hdfs://hadoop105:8020/user/atguigu/hello.txt
8. MapReduce Production Experience
8.1. Why MapReduce Runs Slowly
MapReduce performance bottlenecks come down to two things:
1) Machine performance
CPU, memory, disk, network
2) I/O operation issues
(1) Data skew
(2) Map tasks running too long, making Reduce tasks wait too long
(3) Too many small files
8.2. The Data Skew Problem
Data frequency skew: the amount of data in one partition is far larger than in the others.
Data size skew: some records are far larger than the average.
(1) First check whether the skew is caused by too many null keys
In production you can simply filter out the null keys; if you need to keep them, use a custom partitioner to scatter them by appending a random number to the key, then aggregate a second time.
(2) Handle as much as possible early, in the Map phase, e.g., with a Combiner or a Map Join
(3) Increase the number of Reduce tasks
For details, see the earlier documentation and the YARN tuning notes.
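The null-key salting approach from (1) can be sketched outside MapReduce as a two-stage aggregation (illustrative Python, not an actual MR job; the function name and salt format are my own):

```python
import random
from collections import Counter

def salted_two_stage_count(keys, hot_key="", salts=4, seed=42):
    """Two-stage aggregation: append a random salt to the hot key so its
    records spread across partitions, count per salted key (stage 1),
    then strip the salt and merge by original key (stage 2)."""
    rng = random.Random(seed)
    stage1 = Counter()
    for k in keys:
        salted = f"{k}#{rng.randrange(salts)}" if k == hot_key else k
        stage1[salted] += 1                  # first (distributed) aggregation
    stage2 = Counter()
    for salted, cnt in stage1.items():
        stage2[salted.split("#")[0]] += cnt  # strip salt, second aggregation
    return dict(stage2)

data = [""] * 6 + ["a", "b"]                 # "" stands in for the null key
print(salted_two_stage_count(data))          # {'': 6, 'a': 1, 'b': 1}
```

In a real job, stage 1 runs in the mappers/first reducers and stage 2 in a second reduce, so no single reducer receives all the hot-key records.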
8.3. Too Many Small Files
Every file on HDFS has corresponding metadata on the NameNode, roughly 150 bytes per metadata object. When small files are numerous, this metadata both occupies a large share of the NameNode's memory and slows down metadata lookups.
With too many small files, a MapReduce job generates too many splits and therefore launches too many MapTasks. Each MapTask processes very little data, so its processing time is shorter than its startup time, wasting resources.
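The NameNode memory cost described above can be roughly estimated (illustrative sketch; the ~150-byte-per-object figure is the approximation quoted above, and the helper name is my own):

```python
METADATA_BYTES = 150   # approximate NameNode memory per file/block object

def namenode_memory_mb(num_files, blocks_per_file=1):
    """Approximate NameNode heap consumed by file + block metadata."""
    objects = num_files * (1 + blocks_per_file)   # one inode + its blocks
    return objects * METADATA_BYTES / 1024 / 1024

# 10 million small files (1 block each) vs. the same data in 10,000 big files
print(round(namenode_memory_mb(10_000_000)))  # 2861 (MB)
print(round(namenode_memory_mb(10_000)))      # 3 (MB)
```

Roughly three orders of magnitude of NameNode memory are at stake, which is why the remedies below all aim to reduce the file count.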
1) At data-ingestion time, merge small files or small batches of data into large files before uploading to HDFS (data source side)
2) Hadoop Archive (storage side)
An efficient archiving tool that packs small files into HDFS blocks: it bundles multiple small files into one HAR file, reducing the NameNode's memory usage.
3) CombineTextInputFormat (compute side)
CombineTextInputFormat merges many small files into a single split, or a small number of splits, during the splitting phase.
4) Enable uber mode for JVM reuse (compute side)
By default, every Task starts its own JVM. If the Tasks process only small amounts of data, we can let multiple Tasks of the same Job run in one JVM instead of starting a JVM per Task.
(1) With uber mode off, upload several small files to /input and run the wordcount program
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output2
(2) Check the console output
2021-02-14 16:13:50,607 INFO mapreduce.Job: Job job_1613281510851_0002 running in uber mode : false
(3) Check http://hadoop103:8088/cluster
(4) Enable uber mode by adding the following configuration to mapred-site.xml
```xml
<!-- Enable uber mode -->
<property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
</property>
<!-- Maximum number of MapTasks allowed for an uber job -->
<property>
    <name>mapreduce.job.ubertask.maxmaps</name>
    <value>9</value>
</property>
<!-- Maximum number of ReduceTasks allowed for an uber job -->
<property>
    <name>mapreduce.job.ubertask.maxreduces</name>
    <value>1</value>
</property>
<!-- Maximum input size for an uber job; no value was given in the source
     (an empty value means the default, the block size) -->
<property>
    <name>mapreduce.job.ubertask.maxbytes</name>
    <value></value>
</property>
```
(5) Distribute the configuration
[atguigu@hadoop102 hadoop]$ xsync mapred-site.xml
(6) Run the wordcount program again
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output3
(7) Check the console output
2021-02-14 16:28:36,198 INFO mapreduce.Job: Job job_1613281510851_0003 running in uber mode : true
(8) Check http://hadoop103:8088/cluster
10.2. Benchmarking MapReduce Compute Performance
Use the Sort program to benchmark MapReduce.
Note: do not run this benchmark on a virtual machine with less than 150 GB of disk.
(1) Use RandomWriter to generate random data: each node runs 10 Map tasks, and each Map produces about 1 GB of random binary data
[atguigu@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar randomwriter random-data
(2) Run the Sort program
[atguigu@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar sort random-data sorted-data
(3) Verify that the data is actually sorted
[atguigu@hadoop102 mapreduce]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.3-tests.jar testmapredsort -sortInput random-data -sortOutput sorted-data
10.3. Enterprise Development Scenario Case Study
10.3.1. Requirements
(1) Requirement: count the occurrences of each word in 1 GB of data. Three servers, each with 4 GB of memory, 4 CPU cores, and 4 threads.
(2) Analysis:
1 GB / 128 MB = 8 MapTasks; 1 ReduceTask; 1 MrAppMaster
On average, 10 containers / 3 nodes ≈ 3 tasks per node (distributed as 4, 3, 3)
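The analysis above reduces to simple arithmetic, sketched here (illustrative helper, not part of Hadoop; it counts containers only and ignores memory headroom):

```python
import math

def plan_tasks(input_bytes, block_bytes=128 * 1024 * 1024, nodes=3):
    """Estimate container counts for a WordCount-style job:
    one MapTask per input split, one ReduceTask, one MrAppMaster."""
    map_tasks = math.ceil(input_bytes / block_bytes)
    containers = map_tasks + 1 + 1        # maps + reduce + AppMaster
    per_node = math.ceil(containers / nodes)
    return map_tasks, containers, per_node

GB = 1024 ** 3
maps, total, per_node = plan_tasks(1 * GB)
print(maps, total, per_node)   # 8 10 4  -> roughly 3-4 containers per node (4, 3, 3)
```

These 10 containers on 3 nodes motivate the memory and vcore limits chosen in the configuration below.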
10.3.2. HDFS Parameter Tuning
(1) Modify hadoop-env.sh
export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx1024m"
(2) Modify hdfs-site.xml
```xml
<!-- NameNode worker threads that handle client and DataNode RPCs -->
<property>
    <name>dfs.namenode.handler.count</name>
    <value>21</value>
</property>
```
(3) Modify core-site.xml
```xml
<!-- How long (in minutes) deleted files are kept in the trash -->
<property>
    <name>fs.trash.interval</name>
    <value>60</value>
</property>
```
(4) Distribute the configuration
[atguigu@hadoop102 hadoop]$ xsync hadoop-env.sh hdfs-site.xml core-site.xml
10.3.3. MapReduce Parameter Tuning
(1) Modify mapred-site.xml
```xml
<!-- Size in MB of the in-memory sort buffer used for map output -->
<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>100</value>
</property>
<!-- Fill fraction of the sort buffer at which a spill to disk starts -->
<property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.80</value>
</property>
<!-- Number of spill files merged at once -->
<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>10</value>
</property>
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each map task. If this is not specified or is non-positive, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.</description>
</property>
<!-- vcores per MapTask -->
<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
</property>
<!-- Maximum retries for a failed MapTask -->
<property>
    <name>mapreduce.map.maxattempts</name>
    <value>4</value>
</property>
<!-- Number of parallel copy threads a reduce uses to fetch map output -->
<property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>5</value>
</property>
<!-- Fraction of reduce heap used to buffer shuffled map output -->
<property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.70</value>
</property>
<!-- Buffer fill fraction at which shuffled data is merged and spilled -->
<property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each reduce task. If this is not specified or is non-positive, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.</description>
</property>
<!-- vcores per ReduceTask -->
<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>2</value>
</property>
<!-- Maximum retries for a failed ReduceTask -->
<property>
    <name>mapreduce.reduce.maxattempts</name>
    <value>4</value>
</property>
<!-- Fraction of maps that must finish before reduces may start -->
<property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
</property>
<!-- Task timeout in milliseconds -->
<property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
</property>
```
(2) Distribute the configuration
[atguigu@hadoop102 hadoop]$ xsync mapred-site.xml
10.3.4. YARN Parameter Tuning
(1) Modify the yarn-site.xml configuration parameters as follows:
```xml
<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
    <description>Number of threads to handle scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>8</value>
</property>
<property>
    <description>Enable auto-detection of node capabilities such as memory and CPU.</description>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>false</value>
</property>
<property>
    <description>Flag to determine if logical processors (such as hyperthreads) should be counted as cores. Only applicable on Linux when yarn.nodemanager.resource.cpu-vcores is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true.</description>
    <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
    <value>false</value>
</property>
<property>
    <description>Multiplier to determine how to convert physical cores to vcores. This value is used if yarn.nodemanager.resource.cpu-vcores is set to -1 (which implies auto-calculate vcores) and yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.</description>
    <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
    <value>1.0</value>
</property>
<property>
    <description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192MB.</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<property>
    <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of CPUs used by YARN containers. If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux. In other cases, number of vcores is 8 by default.</description>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
</property>
<property>
    <description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>
<property>
    <description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>
<property>
    <description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>
<property>
    <description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
</property>
<property>
    <description>Whether virtual memory limits will be enforced for containers.</description>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.</description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
```
(2) Distribute the configuration
[atguigu@hadoop102 hadoop]$ xsync yarn-site.xml
10.3.5. Run the Program
(1) Restart the cluster
[atguigu@hadoop102 hadoop-3.1.3]$ sbin/stop-yarn.sh
[atguigu@hadoop103 hadoop-3.1.3]$ sbin/start-yarn.sh
(2) Run the WordCount program
[atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output
(3) Check the YARN application page
http://hadoop103:8088/cluster/apps



