Environment preparation:
Virtual machines:
192.168.2.201 hadoop001
192.168.2.202 hadoop002
192.168.2.203 hadoop003
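The hostnames above are only usable if every node can resolve them. The notes do not say how this was done; the sketch below assumes resolution through /etc/hosts. HOSTS_FILE and add_host are illustrative names, and the default target is a scratch file so the snippet can be tried safely.

```shell
# Target file: a scratch file by default; set HOSTS_FILE=/etc/hosts (as root) on a real node.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Append "ip name" unless the name is already present in the file.
add_host() {
  if ! grep -qw -- "$2" "$HOSTS_FILE" 2>/dev/null; then
    printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
  fi
}

add_host 192.168.2.201 hadoop001
add_host 192.168.2.202 hadoop002
add_host 192.168.2.203 hadoop003

cat "$HOSTS_FILE"
```

Running it a second time adds nothing, so it can be repeated on every node.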
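1. Install the JDK (see the tips below; the install commands here were never updated to match them). Run on all nodes. Download the package from: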
https://www.oracle.com/java/technologies/downloads/
Install commands:
mkdir -p /usr/java
tar -zxvf jdk-17_linux-x64_bin.tar.gz -C /usr/java/
Add the following to /etc/profile, then run source /etc/profile to apply it:
export JAVA_HOME=/usr/java/jdk-17
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
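After sourcing the profile it is worth confirming which JDK is actually picked up, since the Hadoop tips below depend on the major version. This is a small sketch; java_major is a helper name introduced here, and the parsing assumes the usual `java -version` output with the version in double quotes.

```shell
# Print the major Java version for a version string such as 1.8.0_301 or 17.0.1.
java_major() {
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_301 -> 8
    *)   echo "${1%%[.+-]*}" ;;            # current scheme: 17.0.1 -> 17
  esac
}

# Only runs the check when a java binary is on PATH.
if command -v java >/dev/null 2>&1; then
  ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$ver")
  echo "java on PATH: $ver (major $major)"
  if [ "$major" != "8" ]; then
    echo "warning: the tips below recommend JDK 8 for Hadoop 3.3.1"
  fi
fi
```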
2. Passwordless SSH between nodes, and disabling SELinux
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop001
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop002
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop003
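The same key distribution can be written as a loop, which is easier to repeat on each node. This is only a sketch: NODES, KEY, DRY_RUN and the run helper are names chosen here, and by default the commands are printed rather than executed.

```shell
NODES="hadoop001 hadoop002 hadoop003"
KEY="$HOME/.ssh/id_rsa"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually run the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Generate a key pair only if none exists yet (empty passphrase).
[ -f "$KEY" ] || run ssh-keygen -t rsa -N "" -f "$KEY"

# Copy the public key to every node, including the local one.
for node in $NODES; do
  run ssh-copy-id -i "$KEY.pub" "$node"
done
```

With DRY_RUN=0, ssh-copy-id still prompts for each node's password once; afterwards `ssh hadoop002 hostname` should return without a prompt.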
Disable SELinux:
sed -i "s#SELINUX=enforcing#SELINUX=disabled#g" /etc/selinux/config
reboot
3. Install Hadoop
1) Download Hadoop
Download URL:
https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
Extract:
tar -zxvf hadoop-3.3.1.tar.gz -C /usr/local/
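Set the environment variables and run source /etc/profile to apply them immediately:
export HADOOP_HOME=/usr/local/hadoop-3.3.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Run all of the above on every node.
2) Configure Hadoop on the master node. The configuration files are in the etc folder under the extraction path: /usr/local/hadoop-3.3.1/etc/hadoop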
hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_301
export HDFS_SECONDARYNAMENODE_USER=root
export HADOOP_SHELL_EXECNAME=root
export HDFS_DATANODE_USER=root
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_SECURE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
core-site.xml
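<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop001:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
</configuration>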
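To confirm that a value in core-site.xml is what was intended, Hadoop's own `hdfs getconf -confKey fs.defaultFS` can be used once the environment is set up. Where Hadoop is not yet on PATH, a rough stand-in is to read the file directly. The get_prop helper below is a sketch introduced here; it assumes each <name> and <value> element sits on a single line and is not a general XML parser.

```shell
# Print the <value> that follows <name>$2</name> in the Hadoop XML file $1.
get_prop() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n)
      sub(/<\/name>.*/, "", n)
      hit = (n == want)          # remember whether this is the property we want
    }
    hit && /<value>/ {
      v = $0
      sub(/.*<value>/, "", v)
      sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# CONF defaults to the path used in these notes; nothing is printed if it is absent.
CONF="${CONF:-/usr/local/hadoop-3.3.1/etc/hadoop/core-site.xml}"
if [ -f "$CONF" ]; then
  echo "fs.defaultFS   = $(get_prop "$CONF" fs.defaultFS)"
  echo "hadoop.tmp.dir = $(get_prop "$CONF" hadoop.tmp.dir)"
fi
```

The same helper works for hdfs-site.xml, mapred-site.xml and yarn-site.xml, since they share the property layout.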
hdfs-site.xml
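<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop001:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/namenode/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/datanode/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>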
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop001:19888</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/usr/local/hadoop-3.3.1/etc/hadoop,/usr/local/hadoop-3.3.1/share/hadoop/common/*,/usr/local/hadoop-3.3.1/share/hadoop/common/lib/*,/usr/local/hadoop-3.3.1/share/hadoop/hdfs/*,/usr/local/hadoop-3.3.1/share/hadoop/hdfs/lib/*,/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/*,/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/lib/*,/usr/local/hadoop-3.3.1/share/hadoop/yarn/*,/usr/local/hadoop-3.3.1/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
yarn-site.xml
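<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop001:8030</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop001:8088</value>
  </property>
</configuration>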
workers
hadoop001
hadoop002
hadoop003
3) Format the NameNode
hdfs namenode -format
4) Start Hadoop
Run in the sbin directory:
./start-all.sh
Tips:
Pitfall 1: use JDK 8. Hadoop 3.3.1 does not run on JDK 17 (its documentation lists Java 8, plus Java 11 at runtime only), and Oracle's JDK 8 is now distributed under a commercial license.
With a newer JDK, YARN fails with the following error:
2021-09-28 09:06:08,285 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.ExceptionInInitializerError
    at com.google.inject.internal.cglib.reflect.$FastClassEmitter.<clinit>(FastClassEmitter.java:67)
    at com.google.inject.internal.cglib.reflect.$FastClass$Generator.generateClass(FastClass.java:72)
    at com.google.inject.internal.cglib.core.$DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:25)
    at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:216)
    at com.google.inject.internal.cglib.reflect.$FastClass$Generator.create(FastClass.java:64)
    at com.google.inject.internal.BytecodeGen.newFastClass(BytecodeGen.java:204)
    at com.google.inject.internal.ProviderMethod$FastClassProviderMethod.<init>(ProviderMethod.java:256)
    at com.google.inject.internal.ProviderMethod.create(ProviderMethod.java:71)
    at com.google.inject.internal.ProviderMethodsModule.createProviderMethod(ProviderMethodsModule.java:275)
    at com.google.inject.internal.ProviderMethodsModule.getProviderMethods(ProviderMethodsModule.java:144)
    at com.google.inject.internal.ProviderMethodsModule.configure(ProviderMethodsModule.java:123)
    at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
    at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:349)
    at com.google.inject.AbstractModule.install(AbstractModule.java:122)
    at com.google.inject.servlet.ServletModule.configure(ServletModule.java:52)
    at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
    at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
    at com.google.inject.spi.Elements.getElements(Elements.java:110)
    at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
    at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
    at com.google.inject.Guice.createInjector(Guice.java:96)
    at com.google.inject.Guice.createInjector(Guice.java:73)
    at com.google.inject.Guice.createInjector(Guice.java:62)
    at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:417)
    at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:465)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1389)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1498)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1699)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module @46d21ee0
Pitfall 2: after running hdfs namenode -format again, the cluster does not come up, typically because the DataNodes still hold the clusterID from the previous format. The files under the DataNode directories have to be cleaned out first. Whether this affects the stored data still needs testing.
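One way to check for the situation in Pitfall 2 before deleting anything is to compare the clusterID recorded on the NameNode with the one on each DataNode. The sketch below assumes the storage directories from hdfs-site.xml in these notes and the current/VERSION file that HDFS keeps inside them; cluster_id, NN_VERSION and DN_VERSION are names introduced for the example.

```shell
NN_VERSION="${NN_VERSION:-/home/hadoop/namenode/data/current/VERSION}"
DN_VERSION="${DN_VERSION:-/home/hadoop/datanode/data/current/VERSION}"

# Print the clusterID recorded in a VERSION file.
cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Only compares when both files exist on this node (hadoop001 runs both roles here).
if [ -f "$NN_VERSION" ] && [ -f "$DN_VERSION" ]; then
  if [ "$(cluster_id "$NN_VERSION")" = "$(cluster_id "$DN_VERSION")" ]; then
    echo "clusterID matches"
  else
    echo "clusterID mismatch: the DataNode directory belongs to an earlier format"
  fi
fi
```

On hadoop002 and hadoop003 only the DataNode file exists, so its clusterID would have to be compared by hand with the value printed on hadoop001.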
Pitfall 3: from 3.x on, the NameNode web UI listens on port 9870.
The YARN management page is at http://namenodeip:8088/cluster
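For this cluster both pages live on hadoop001, since that host runs the NameNode and the ResourceManager. A short sketch that prints the two addresses and, optionally, probes them; ui_urls and PROBE are names made up for the example, and the probe is skipped unless PROBE=1 and curl is installed.

```shell
# Print the web UI addresses for the host running the NameNode and ResourceManager.
ui_urls() {
  echo "http://$1:9870"          # NameNode web UI (Hadoop 3.x)
  echo "http://$1:8088/cluster"  # YARN ResourceManager UI
}

ui_urls hadoop001

# Optional reachability check with a 5 second timeout per URL.
if [ "${PROBE:-0}" = 1 ] && command -v curl >/dev/null 2>&1; then
  for url in $(ui_urls hadoop001); do
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")
    echo "$url -> HTTP $code"
  done
fi
```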
Pitfall 4: the paths in hdfs-site.xml need the file:/ prefix; otherwise uploading files fails with:
hadoop fs -ls /
ls: Call From 127.0.1.1 to 0.0.0.0:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
The log shows the following error:
2021-09-28 09:59:31,621 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/hadoop/namenode/data is in an inconsistent state: storage directory does not exist or is not accessible.
4. Install Spark
1) Download the matching Spark release
https://spark.apache.org/downloads.html
2) Upload the archive and extract it to the chosen path
tar -zxvf spark-3.1.2-bin-hadoop3.2.tgz
mv spark-3.1.2-bin-hadoop3.2 /usr/local/spark-3.1.2
3) Edit /etc/profile and apply it
export SPARK_HOME=/usr/local/spark-3.1.2
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
4) Edit the configuration files
/usr/local/spark-3.1.2/conf
spark-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_301
export HADOOP_CONF_DIR=/usr/local/hadoop-3.3.1/etc/hadoop
export SPARK_MASTER_HOST=hadoop001
export SPARK_LOCAL_DIRS=/usr/local/spark-3.1.2
workers
hadoop001
hadoop002
hadoop003
5) Distribute Spark to the other nodes
6) Start Spark
Run in the sbin directory:
./start-all.sh
Remember to delete the original localhost entry from workers; otherwise Spark fails to start.
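Step 5 gives no command for distributing Spark. One possible approach, sketched below, is to read the host list from conf/workers (ignoring comments and any leftover localhost line) and copy the installation with scp. list_workers, run, WORKERS_FILE and DRY_RUN are names invented for this sketch, and the commands are only printed unless DRY_RUN=0.

```shell
SPARK_HOME="${SPARK_HOME:-/usr/local/spark-3.1.2}"
WORKERS_FILE="${WORKERS_FILE:-$SPARK_HOME/conf/workers}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually copy

# List worker hosts: strip comments and whitespace, drop blank lines and localhost.
list_workers() {
  sed -e 's/#.*//' -e 's/[[:space:]]//g' "$1" | grep -v -e '^$' -e '^localhost$'
}

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

if [ -f "$WORKERS_FILE" ]; then
  for host in $(list_workers "$WORKERS_FILE"); do
    # Skip the node this script runs on; it already has the files.
    [ "$host" = "$(hostname)" ] && continue
    run scp -r "$SPARK_HOME" "$host:$(dirname "$SPARK_HOME")/"
  done
fi
```

The copy relies on the passwordless SSH set up earlier, and /etc/profile on the other nodes still needs the SPARK_HOME lines from step 3.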



