Hadoop High-Availability Setup: Configuration
Hadoop configuration
1、core-site.xml
```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hacluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/Bigdata/hadoop-3.2.2/data</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-168-0-110:2181,hadoop-168-0-112:2181,hadoop-168-0-114:2181,hadoop-168-0-116:2181,hadoop-168-0-118:2181</value>
  </property>
  <property>
    <name>ipc.client.connect.max.retries</name>
    <value>100</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>xiaoman</value>
  </property>
</configuration>
```
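As a sanity check, every Hadoop `*-site.xml` is just a flat list of `<property>` name/value pairs. A small sketch (hostnames taken from this cluster's config) that parses such a file and pulls out the values used above:

```python
# Sketch: parse a Hadoop *-site.xml fragment into a name -> value dict.
# The XML below repeats two of the core-site.xml properties from this article.
import xml.etree.ElementTree as ET

CORE_SITE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://hacluster</value></property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-168-0-110:2181,hadoop-168-0-112:2181,hadoop-168-0-114:2181,hadoop-168-0-116:2181,hadoop-168-0-118:2181</value>
  </property>
</configuration>"""

def load_props(xml_text):
    """Return a dict of <name> -> <value> from a Hadoop *-site.xml string."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value").strip()
            for p in root.iter("property")}

props = load_props(CORE_SITE)
print(props["fs.defaultFS"])                         # hdfs://hacluster
print(len(props["ha.zookeeper.quorum"].split(",")))  # 5 quorum members
```

Note that `fs.defaultFS` points at the nameservice id `hacluster`, not at a single NameNode host; clients resolve it through the failover proxy provider configured in hdfs-site.xml.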
2、hdfs-site.xml
```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>hacluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hacluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hacluster.nn1</name>
    <value>hadoop-168-0-110:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hacluster.nn2</name>
    <value>hadoop-168-0-114:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hacluster.nn1</name>
    <value>hadoop-168-0-110:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hacluster.nn2</name>
    <value>hadoop-168-0-114:9870</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-168-0-110:8485;hadoop-168-0-112:8485;hadoop-168-0-114:8485;hadoop-168-0-116:8485;hadoop-168-0-118:8485/hacluster</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/xiaoman/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/Bigdata/hadoop-3.2.2/data/journalnode</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hacluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
```
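With five JournalNodes in `dfs.namenode.shared.edits.dir`, an edit is committed once a majority of them acknowledge it. A small sketch (the URI is copied from the config above) that splits the `qjournal://` URI and computes how many JournalNode failures the quorum can tolerate:

```python
# Sketch: decompose the qjournal:// shared-edits URI from hdfs-site.xml
# and compute the write quorum (majority) it requires.
SHARED_EDITS = ("qjournal://hadoop-168-0-110:8485;hadoop-168-0-112:8485;"
                "hadoop-168-0-114:8485;hadoop-168-0-116:8485;"
                "hadoop-168-0-118:8485/hacluster")

def parse_qjournal(uri):
    """Return ([host:port, ...], journal_id) from a qjournal:// URI."""
    prefix = "qjournal://"
    if not uri.startswith(prefix):
        raise ValueError("not a qjournal URI: " + uri)
    hosts, _, journal_id = uri[len(prefix):].partition("/")
    return hosts.split(";"), journal_id

nodes, journal = parse_qjournal(SHARED_EDITS)
majority = len(nodes) // 2 + 1      # writes must reach this many JournalNodes
tolerated = len(nodes) - majority   # JournalNode failures the cluster survives
print(journal, len(nodes), majority, tolerated)  # hacluster 5 3 2
```

So this five-node journal keeps accepting edits with up to two JournalNodes down; an even count would not raise the failure tolerance, which is why odd quorum sizes are the usual choice.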
3、mapred-site.xml
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop-168-0-112:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop-168-0-112:19888</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.joblist.cache.size</name>
    <value>2000</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
</configuration>
```
4、yarn-site.xml
```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarncluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop-168-0-112</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop-168-0-116</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop-168-0-110:2181,hadoop-168-0-112:2181,hadoop-168-0-114:2181,hadoop-168-0-116:2181,hadoop-168-0-118:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ</value>
    <description>Environment variables that should be forwarded from the NodeManager's environment to the container's, specified as a comma separated list of VARNAME=value pairs. To define environment variables individually, you can specify multiple properties of the form yarn.nodemanager.admin-env.VARNAME, where VARNAME is the name of the environment variable. This is the only way to add a variable when its value contains commas.</description>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <!-- My history server lives on node 112, so aggregated logs are served
       from 112 as well, which makes them easy to view from the history server. -->
  <property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop-168-0-112:19888/jobhistory/logs</value>
  </property>
  <!-- 604800 seconds = 7 days -->
  <property>
    <name>yarn.log-aggregation-retain-seconds</name>
    <value>604800</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop-168-0-112:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop-168-0-116:8088</value>
  </property>
  <!-- 8030 is the default ResourceManager scheduler RPC port -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop-168-0-112:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop-168-0-116:8030</value>
  </property>
</configuration>
```
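YARN HA properties follow a naming convention: every per-ResourceManager setting is the base property name suffixed with an rm-id from `yarn.resourcemanager.ha.rm-ids`. A small sketch (rm-ids and hostnames from the config above) that expands those concrete property names:

```python
# Sketch: expand per-ResourceManager property names from the rm-ids list,
# mirroring how yarn-site.xml suffixes each HA property with .rm1 / .rm2.
RM_IDS = "rm1,rm2"  # value of yarn.resourcemanager.ha.rm-ids
HOSTS = {"rm1": "hadoop-168-0-112", "rm2": "hadoop-168-0-116"}

def expand(base, rm_ids=RM_IDS):
    """Return the concrete per-rm-id property names for a base property."""
    return [f"{base}.{rm_id}" for rm_id in rm_ids.split(",")]

for prop in expand("yarn.resourcemanager.webapp.address"):
    rm_id = prop.rsplit(".", 1)[1]
    print(f"{prop} = {HOSTS[rm_id]}:8088")
# yarn.resourcemanager.webapp.address.rm1 = hadoop-168-0-112:8088
# yarn.resourcemanager.webapp.address.rm2 = hadoop-168-0-116:8088
```

The same suffix rule applies to `yarn.resourcemanager.hostname` and `yarn.resourcemanager.scheduler.address`, which is why each appears twice above, once per rm-id.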
5、workers
List your data nodes here, i.e. the hosts that should run a DataNode and a NodeManager, one hostname per line.
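For illustration only (which hosts actually serve as workers is your choice): if all five hosts in this cluster were to run a DataNode and NodeManager, the workers file would read:

```
hadoop-168-0-110
hadoop-168-0-112
hadoop-168-0-114
hadoop-168-0-116
hadoop-168-0-118
```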
ZooKeeper configuration
zoo.cfg is created by copying zoo_sample.cfg and renaming the copy.
1、zoo.cfg
```
# ZooKeeper reads zoo.cfg as its default configuration file.
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/Bigdata/zookeeper-3.5.7/data
dataLogDir=/opt/Bigdata/zookeeper-3.5.7/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# In server.N=host:port1:port2, N is the server id (any valid number marking
# which node this is). The same id must be written to a myid file under
# dataDir (you create that file yourself). The two ports after the hostname
# are the quorum-communication port and the leader-election port.
server.1=hadoop-168-0-110:2888:3888
server.2=hadoop-168-0-112:2888:3888
server.3=hadoop-168-0-114:2888:3888
server.4=hadoop-168-0-116:2888:3888
server.5=hadoop-168-0-118:2888:3888
```
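Each host has to write its own id into `<dataDir>/myid`, matching the `server.N` line that names it. A small sketch (server lines copied from the zoo.cfg above) that derives the right id for a given host:

```python
# Sketch: derive the myid value each host must write under dataDir from
# the server.N lines in zoo.cfg.
ZOO_SERVERS = """\
server.1=hadoop-168-0-110:2888:3888
server.2=hadoop-168-0-112:2888:3888
server.3=hadoop-168-0-114:2888:3888
server.4=hadoop-168-0-116:2888:3888
server.5=hadoop-168-0-118:2888:3888
"""

def myid_for(hostname, cfg=ZOO_SERVERS):
    """Return the id this host should write to <dataDir>/myid."""
    for line in cfg.strip().splitlines():
        key, _, value = line.partition("=")     # key is e.g. "server.3"
        if value.split(":")[0] == hostname:
            return int(key.split(".")[1])
    raise KeyError(hostname)

print(myid_for("hadoop-168-0-114"))  # 3
```

So on hadoop-168-0-114 you would put the single line `3` into /opt/Bigdata/zookeeper-3.5.7/data/myid, and likewise for the other four hosts.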



