hadoop-env.sh: set the Hadoop environment variables
```shell
# Set Hadoop-specific environment variables here.
# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/apps/jdk
```
core-site.xml: configure the HDFS entry point (nameservice and port), the Hadoop temporary directory, and the ZooKeeper quorum addresses
```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/apps/hadoop/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master1:2181,master2:2181,slave:2181</value>
  </property>
</configuration>
```
hdfs-site.xml: configure the RPC and HTTP addresses of the two NameNodes, specify where the NameNode metadata and JournalNode edits are stored, enable automatic failover on NameNode failure, and configure sshfence (SSH into the previously active NameNode and kill its process)
```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/apps/hadoop/data</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>master1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>master1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>master2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>master2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master1:8485;master2:8485;slave:8485/ns1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/apps/hadoop/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>3000</value>
  </property>
</configuration>
```
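Once hdfs-site.xml is in place, the effective HA keys can be read back with `hdfs getconf` to confirm they were picked up. A dry-run sketch (the commands are only echoed here so the list is visible without a live cluster; drop the leading `echo` on a configured node):

```shell
# Print the verification commands for the HA keys set above.
# Remove the leading `echo` to actually query a configured Hadoop node.
for key in dfs.nameservices dfs.ha.namenodes.ns1 dfs.ha.automatic-failover.enabled; do
  echo hdfs getconf -confKey "$key"
done
```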
mapred-site.xml: run the MapReduce framework on YARN
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```
yarn-site.xml: enable ResourceManager high availability, specify the ResourceManager IDs and hostnames, and configure the ZooKeeper cluster addresses
```xml
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>master2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master1:2181,master2:2181,slave:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```
slaves: the list of worker nodes in the cluster
```
master1
master2
slave
```
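All of the files above must be identical on every node. A minimal dry-run sketch for pushing them out (the path `/usr/apps/hadoop/etc/hadoop` is an assumption based on the install prefix used above; drop the leading `echo` to actually copy):

```shell
# Echo the scp commands that would sync the edited config directory
# to the other nodes (node names taken from the slaves file above).
CONF=/usr/apps/hadoop/etc/hadoop   # assumed config path under the install prefix
for node in master2 slave; do
  echo scp -r "$CONF" "$node:$(dirname "$CONF")"
done
```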
===
- Start ZooKeeper on every node: zkServer.sh start
Check that it came up: zkServer.sh status
(or run the scripts from ZooKeeper's bin directory):
```shell
# start
./zkServer.sh start
# check the status
./zkServer.sh status
```
- On every cluster node, start the JournalNode, which replicates the NameNode edit log:
hadoop-daemon.sh start journalnode
- On the master1 node, format the NameNode, then copy the formatted metadata directory to master2:
```shell
hadoop namenode -format
scp -r /usr/apps/hadoop/data master2:/usr/apps/hadoop
```
- On the master1 node, format the ZKFC (creates the failover state znode in ZooKeeper):
hdfs zkfc -formatZK
- On master1, start HDFS:
start-dfs.sh
- On master1, start YARN:
start-yarn.sh
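The whole startup sequence can be collected into one script. A dry-run sketch, assuming passwordless SSH from master1 and the ZooKeeper/Hadoop scripts on each node's PATH (node names from the configs above); with RUN=echo it only prints the commands in order, set RUN= to execute on a real cluster:

```shell
#!/bin/sh
# Dry-run of the HA startup order; set RUN= (empty) to execute for real.
RUN=echo
for node in master1 master2 slave; do
  $RUN ssh "$node" zkServer.sh start                   # 1. ZooKeeper quorum
  $RUN ssh "$node" hadoop-daemon.sh start journalnode  # 2. JournalNodes
done
$RUN hadoop namenode -format                           # 3. format NameNode (master1 only)
$RUN scp -r /usr/apps/hadoop/data master2:/usr/apps/hadoop  # copy metadata to the standby
$RUN hdfs zkfc -formatZK                               # 4. initialize failover state in ZooKeeper
$RUN start-dfs.sh                                      # 5. NameNodes, DataNodes, ZKFCs
$RUN start-yarn.sh                                     # 6. ResourceManagers, NodeManagers
```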



