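mv /usr/local/spark/conf/slaves.template /usr/local/spark/conf/slaves
b) Edit the slaves file with the following command:
vim /usr/local/spark/conf/slaves
c) Replace the existing localhost entry with the following:
master
slave1
slave2
2. Configure the Spark cluster's runtime parameters as follows:
a) Rename the spark-env.sh.template configuration file to spark-env.sh:
mv /usr/local/spark/conf/spark-env.sh.template /usr/local/spark/conf/spark-env.sh
b) Edit spark-env.sh and append the following at the end of the file: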
vim /usr/local/spark/conf/spark-env.sh
# Set the JDK directory
export JAVA_HOME=/usr/local/lib/jdk1.8.0_212
# Set the master service port (the web UI port is changed separately in step 3)
export SPARK_MASTER_PORT=7077
# Set the ZooKeeper ensemble address to enable high availability
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=master:2181,slave1:2181,slave2:2181 -Dspark.deploy.zookeeper.dir=/usr/local/spark"
# Set the YARN configuration directory
export YARN_CONF_DIR=/usr/local/hadoop/etc/hadoop
# Set the HDFS configuration directory
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
3. Change the web UI port to 8085:
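vim /usr/local/spark/sbin/start-master.sh
In this script, change the default value of SPARK_MASTER_WEBUI_PORT from 8080 to 8085.
4. Deploy Spark to slave1 and slave2 as follows:
a) Create the spark directory. On slave1 and slave2, run: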
sudo mkdir /usr/local/spark
b) Change the owner of the spark directory to the hadoop user. On slave1 and slave2, run:
sudo chown hadoop /usr/local/spark/
c) Send Spark to slave1 and slave2. On master, run:
scp -r /usr/local/spark/* hadoop@slave1:/usr/local/spark/
scp -r /usr/local/spark/* hadoop@slave2:/usr/local/spark/
d) On each slave, go into /usr/local/spark and check that the files arrived.
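e) Send the environment variable file to slave1 and slave2. On master, run:
scp /home/hadoop/.bashrc hadoop@slave1:/home/hadoop/
scp /home/hadoop/.bashrc hadoop@slave2:/home/hadoop/
f) Reload the environment variables. On slave1 and slave2, run:
source /home/hadoop/.bashrc
Testing
1. Start ZooKeeper (on all three virtual machines):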
zkServer.sh start
2. Start Spark on master.
Be sure to change into the Spark directory first:
cd /usr/local/spark/
sbin/start-all.sh
3. Start the standby master. On slave1, run:
start-master.sh
4. Check the running processes:
jps
5. Open the web UI on port 8085.
The page should list three worker IDs.
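The process check in step 4 can be scripted rather than read by eye. The following is only a minimal sketch: it assumes jps prints one "<pid> <name>" pair per line, and that the daemons of interest are QuorumPeerMain (ZooKeeper), Master and Worker. The function name missing_daemons is hypothetical and not part of Spark.

```shell
# missing_daemons: read `jps` output on stdin and print every expected
# daemon name (passed as arguments) that does not appear in it.
missing_daemons() {
  local jps_output name
  jps_output="$(cat)"
  for name in "$@"; do
    # Compare against the second field only, so that "Master" does not
    # match some other process whose name merely contains "Master".
    if ! printf '%s\n' "$jps_output" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage on master (assumed to run ZooKeeper, the Spark master and a worker):
#   jps | missing_daemons QuorumPeerMain Master Worker
```

Empty output means every listed daemon was found. On slave2, which runs no master in this setup, the Master argument would be left out.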
# Stop the Spark cluster (on master)
sbin/stop-all.sh
# Stop the standby master (on slave1)
stop-master.sh
# Stop ZooKeeper (run on all three machines)
zkServer.sh stop
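Because the shutdown commands above run on different machines and in a fixed order, it can help to write that order down once. The sketch below only prints a plan and executes nothing. It assumes the host names and the /usr/local/spark install path used in this guide; shutdown_plan is a hypothetical helper name.

```shell
# shutdown_plan: print "<host> <command>" lines in the shutdown order used
# above: Spark cluster first, then the standby master, then ZooKeeper.
shutdown_plan() {
  local spark_home="${1:-/usr/local/spark}"   # install path from this guide
  local host
  printf '%s %s\n' master "${spark_home}/sbin/stop-all.sh"
  printf '%s %s\n' slave1 "${spark_home}/sbin/stop-master.sh"
  for host in master slave1 slave2; do
    printf '%s %s\n' "$host" "zkServer.sh stop"
  done
}

# Dry run: show what would be executed and where.
#   shutdown_plan
# Possible execution over SSH (assumptions: key-based login for user hadoop,
# and zkServer.sh on the PATH of non-interactive shells, which may not hold):
#   shutdown_plan | while read -r host cmd; do ssh -n "hadoop@${host}" "$cmd"; done
```

Keeping the plan separate from the execution makes it easy to review the order before anything is stopped.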



