
Hadoop HA + HBase + Spark High-Availability Cluster Setup Manual

1. Preface

This manual records the setup of a Hadoop HA + HBase + Spark high-availability cluster, covering the configuration of each component and the startup sequence.

The cluster has been verified to start and run normally on Ubuntu 18.04.

2. Ling-Ha Cluster Architecture
Node     Roles (Nn, Rm, DFSZK, Dn, Nm, Jn, Zoo, Spark, Hm, Hr)
node1    Nn, Rm, DFSZK, Spark, Hm
node2    Nn, Rm, DFSZK, Spark, Hm (backup)
node3    Dn, Nm, Jn, Zoo, Spark, Hr
node4    Dn, Nm, Jn, Zoo, Spark, Hr
node5    Dn, Nm, Jn, Zoo, Spark, Hr
node6    Dn, Nm, Spark, Hr
Component                       Version   Port
Hadoop                          3.2.2     9870
YARN                            3.2.2     8088
MapReduce JobHistory Server     3.2.2     19888
Spark master                    3.1.2     8080
Spark history server            3.1.2     4000
HBase                           2.2.7     16010
ZooKeeper                       3.4.6     2181
Abbreviation   Full name                  Role
Nn             NameNode                   HDFS metadata node
Rm             ResourceManager            YARN resource-management node
DFSZK          DFSZKFailoverController    ZooKeeper-based failover monitor (HA setup)
Dn             DataNode                   HDFS data node
Nm             NodeManager                Per-node YARN manager, communicates with the Rm
Jn             JournalNode                Synchronizes edit logs between NameNodes (HA setup)
Zoo            ZooKeeper                  ZooKeeper ensemble member
Hm             HMaster                    HBase master node
Hr             HRegionServer              HBase worker node

3. Hadoop HA Configuration Files
  • core-site.xml

        
        
<configuration>
        <!-- Default filesystem: the HDFS nameservice defined in hdfs-site.xml -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://Ling-Ha</value>
        </property>
        <!-- Base directory for Hadoop temporary files -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/local/hadoop/tmp</value>
                <description>Abase for other temporary directories.</description>
        </property>
        <!-- ZooKeeper quorum used for automatic NameNode failover -->
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>node3:2181,node4:2181,node5:2181</value>
        </property>
        <!-- ZooKeeper session timeout in milliseconds -->
        <property>
                <name>ha.zookeeper.session-timeout.ms</name>
                <value>10000</value>
        </property>
</configuration>

  • hdfs-site.xml

        
        
<configuration>
        <!-- Logical name of the HA nameservice -->
        <property>
                <name>dfs.nameservices</name>
                <value>Ling-Ha</value>
        </property>
        <!-- The two NameNodes of the Ling-Ha nameservice -->
        <property>
                <name>dfs.ha.namenodes.Ling-Ha</name>
                <value>nn1,nn2</value>
        </property>
        <!-- RPC and HTTP addresses of nn1 (node1) -->
        <property>
                <name>dfs.namenode.rpc-address.Ling-Ha.nn1</name>
                <value>node1:9000</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.Ling-Ha.nn1</name>
                <value>node1:9870</value>
        </property>
        <!-- RPC and HTTP addresses of nn2 (node2) -->
        <property>
                <name>dfs.namenode.rpc-address.Ling-Ha.nn2</name>
                <value>node2:9000</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.Ling-Ha.nn2</name>
                <value>node2:9870</value>
        </property>
        <!-- JournalNodes that store the shared edit log -->
        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://node3:8485;node4:8485;node5:8485/Ling-Ha</value>
        </property>
        <!-- Local directory where each JournalNode stores its edits -->
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/usr/local/hadoop/journal</value>
        </property>
        <!-- Enable automatic failover via the ZKFC -->
        <property>
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>
        <!-- Client-side proxy provider used to locate the active NameNode -->
        <property>
                <name>dfs.client.failover.proxy.provider.Ling-Ha</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <!-- Fencing: try sshfence first, fall back to shell(true) -->
        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>
                        sshfence
                        shell(true)
                </value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/node1/.ssh/id_rsa</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.connect-timeout</name>
                <value>60000</value>
        </property>
        <!-- HDFS replication factor and storage directories -->
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
        <!-- Disable HDFS permission checking -->
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>
</configuration>

  • yarn-site.xml

 
	   
<configuration>
        <!-- Cluster id of the YARN HA cluster -->
        <property>
                <name>yarn.resourcemanager.cluster-id</name>
                <value>Ling-yarn</value>
        </property>
        <!-- Enable ResourceManager HA -->
        <property>
                <name>yarn.resourcemanager.ha.enabled</name>
                <value>true</value>
        </property>
        <!-- The two ResourceManagers -->
        <property>
                <name>yarn.resourcemanager.ha.rm-ids</name>
                <value>rm1,rm2</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm1</name>
                <value>node1</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm2</name>
                <value>node2</value>
        </property>
        <!-- Web UI addresses of the two ResourceManagers -->
        <property>
                <name>yarn.resourcemanager.webapp.address.rm1</name>
                <value>node1:8088</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address.rm2</name>
                <value>node2:8088</value>
        </property>
        <!-- ZooKeeper quorum used for RM leader election and state storage -->
        <property>
                <name>yarn.resourcemanager.zk-address</name>
                <value>node3:2181,node4:2181,node5:2181</value>
        </property>
        <property>
                <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
                <value>/yarn-leader-election</value>
        </property>
        <!-- Persist and recover ResourceManager state from ZooKeeper -->
        <property>
                <name>yarn.resourcemanager.recovery.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.store.class</name>
                <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
        </property>
        <!-- Auxiliary shuffle service for MapReduce -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

  • mapred-site.xml

        
        
<configuration>
        <!-- Run MapReduce on YARN -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <!-- JobHistory Server RPC and web addresses -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>node1:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>node1:19888</value>
        </property>
        <!-- Tell MapReduce tasks where Hadoop is installed -->
        <property>
                <name>yarn.app.mapreduce.am.env</name>
                <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
        </property>
        <property>
                <name>mapreduce.map.env</name>
                <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
        </property>
        <property>
                <name>mapreduce.reduce.env</name>
                <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
        </property>
</configuration>
         

  • workers
node3
node4
node5
node6
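
All six nodes need the same configuration files. A minimal sketch for pushing them out from node1, assuming passwordless SSH between the nodes and the /usr/local/hadoop install path used above (scp is standard OpenSSH; this loop is not part of the original manual):

# distribute the Hadoop configuration from node1 to the other nodes
for host in node2 node3 node4 node5 node6; do
    scp /usr/local/hadoop/etc/hadoop/*.xml \
        /usr/local/hadoop/etc/hadoop/workers \
        "${host}":/usr/local/hadoop/etc/hadoop/
done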

4. Hadoop HA Initialization and Startup Order
  • Start the ZooKeeper cluster
  • Start all JournalNodes one by one
 # run from Hadoop's sbin directory
 ./hadoop-daemon.sh start journalnode
  • Format the active NameNode, then start it
hdfs namenode -format
./hadoop-daemon.sh start namenode
  • On the standby node, copy the metadata from the active NameNode to the standby NameNode
hdfs namenode -bootstrapStandby
  • Format the failover state in ZooKeeper on the active NameNode
hdfs zkfc -formatZK
  • Stop the active NameNode and the JournalNodes on all nodes
 # run from Hadoop's sbin directory
./hadoop-daemon.sh stop namenode
./hadoop-daemon.sh stop journalnode
  • Run the startup script to bring up every node (a verification sketch follows this list)
start-dfs.sh
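
The steps above only bring up HDFS. A hedged verification-and-continuation sketch, assuming the stock Hadoop 3.x and ZooKeeper scripts are on the PATH (start-yarn.sh, mapred --daemon, hdfs haadmin and yarn rmadmin are standard Hadoop commands; they are not spelled out in the original steps):

# confirm the ZooKeeper ensemble on node3/node4/node5 (one leader, two followers)
zkServer.sh status
# check which NameNode is active and which is standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# start YARN; if rm2 does not come up on node2, start it there with: yarn --daemon start resourcemanager
start-yarn.sh
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
# start the MapReduce JobHistory Server configured in mapred-site.xml (web UI on node1:19888)
mapred --daemon start historyserver
# each node should now show the daemons listed in the role table in section 2
jps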

5. Configuring Spark and the Spark History Server under Hadoop HA
  • Configure spark-env.sh
#!/usr/bin/env bash
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SPARK_MASTER_IP="<fill in the master node IP>"
# history server configuration
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=4000 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://Ling-Ha/sparkLog"
  • Configure spark-defaults.conf
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://Ling-Ha/sparkLog
spark.eventLog.compress          true
spark.files                      file:///usr/local/spark/conf/hdfs-site.xml,file:///usr/local/spark/conf/core-site.xml
  • Configure workers
node1
node2
node3
node4
node5
node6
  • The spark.eventLog.dir path must use the value of dfs.nameservices from hdfs-site.xml as the HDFS authority, and no port needs to be specified.
  • Copy Hadoop's core-site.xml and hdfs-site.xml into Spark's conf directory so that Spark can find the Hadoop configuration (see the startup sketch after this list).
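
A short startup sketch for Spark after the configuration above, assuming Spark is installed at /usr/local/spark (start-all.sh and start-history-server.sh are the stock Spark 3.x sbin scripts; the /sparkLog path is the one set in spark-defaults.conf above):

# copy the Hadoop client configs into Spark's conf directory (see the note above)
cp /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/spark/conf/
# the event-log directory must exist on HDFS before anything writes to it
hdfs dfs -mkdir -p hdfs://Ling-Ha/sparkLog
# start the Spark master and the workers listed in conf/workers, then the history server (port 4000)
/usr/local/spark/sbin/start-all.sh
/usr/local/spark/sbin/start-history-server.sh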
6. Configuring HBase HA under Hadoop HA
  • Edit hbase-env.sh
# Java
export JAVA_HOME=/usr/local/jvm/hbase
# HBase
export HBASE_CLASSPATH=/usr/local/hbase/conf
# do not use the ZooKeeper bundled with HBase
export HBASE_MANAGES_ZK=false
# do not put Hadoop's jars on the HBase classpath
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
  • Edit hbase-site.xml
	
<configuration>
        <!-- Run HBase in fully distributed mode -->
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>
        <!-- Store HBase data under the HA nameservice; no port is needed -->
        <property>
                <name>hbase.rootdir</name>
                <value>hdfs://Ling-Ha/hbase</value>
        </property>
        <property>
                <name>hbase.zookeeper.property.dataDir</name>
                <value>/usr/local/hbase/zookeeper</value>
        </property>
        <!-- External ZooKeeper quorum -->
        <property>
                <name>hbase.zookeeper.quorum</name>
                <value>node3:2181,node4:2181,node5:2181</value>
        </property>
</configuration>
  • Edit regionservers
# add the RegionServer node hostnames
node3
node4
node5
node6
  • Copy Hadoop's core-site.xml and hdfs-site.xml into HBase's conf directory so that HBase can find the Hadoop configuration.
  • To enable HBase HA, create a backup-masters file in the conf directory and put the hostname of the backup master node in it (a startup sketch follows).
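
A hedged sketch of bringing HBase up once the files above are in place, assuming HBase is installed at /usr/local/hbase and that HDFS and ZooKeeper are already running (start-hbase.sh and hbase shell are the stock HBase scripts; node2 as the backup master is only an example):

# copy the Hadoop client configs into HBase's conf directory (see the note above)
cp /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/hbase/conf/
# backup-masters: one hostname per line, e.g. node2 as the standby HMaster
echo "node2" > /usr/local/hbase/conf/backup-masters
# start the active HMaster on the current node, the backup master(s) and all RegionServers
/usr/local/hbase/bin/start-hbase.sh
# check cluster status: web UI at http://node1:16010, or from the shell
echo "status" | /usr/local/hbase/bin/hbase shell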