Table of Contents
1. Extract the Spark archive
2. Configure the Spark system environment
3. Configure the cluster nodes
4. Configure spark-env.sh
5. Distribute Spark
6. Start the Spark cluster
Prerequisites:
A fully distributed Hadoop cluster environment
Scala archive: https://www.scala-lang.org/download/all.html
Spark archive: http://archive.apache.org/dist/spark/
For installing Scala, see my earlier post: [CentOS] Installing Scala
For the basic Spark modes, see my earlier post: [CentOS] Spark Runtime Environments (Local, Standalone)
1. Extract the Spark archive
Upload the local Spark archive to the virtual machine:
Extract it, then rename the directory:
[root@server download]# tar -zxvf spark-2.4.2-bin-hadoop2.6.tgz -C /usr/local/src/
[root@server download]# cd /usr/local/src/
[root@server src]# ll
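The rename itself is not shown in the listing above; assuming the target directory name /usr/local/src/spark used throughout the rest of this guide, it would be:

```shell
[root@server src]# mv spark-2.4.2-bin-hadoop2.6 spark
```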
2. Configure the Spark system environment
Open /etc/profile and append the following lines to configure the Spark environment:
# set spark environment
export SPARK_HOME=/usr/local/src/spark
export PATH=$PATH:$SPARK_HOME/bin
Save and exit, then apply the changes with the source command.
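To sketch what sourcing does, the snippet below appends the same two lines to a scratch file and sources it (a temporary file stands in for /etc/profile so the snippet can be run safely anywhere):

```shell
# Simulate the /etc/profile edit against a temporary file, then source it.
PROFILE=$(mktemp)
cat >> "$PROFILE" <<'EOF'
# set spark environment
export SPARK_HOME=/usr/local/src/spark
export PATH=$PATH:$SPARK_HOME/bin
EOF

# Sourcing runs the exports in the current shell, so the new
# variables are visible immediately, without logging out.
source "$PROFILE"
echo "$SPARK_HOME"    # prints /usr/local/src/spark
```

On the real host the target is /etc/profile itself, and the change applies to new login shells automatically.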
3. Configure the cluster nodes
Go to the conf directory under the extracted Spark path, rename slaves.template to slaves, delete the localhost entry, and add the hostname of each VM (one per line):
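The rename can be done in the conf directory like this (paths assumed from the earlier steps):

```shell
[root@server src]# cd /usr/local/src/spark/conf
[root@server conf]# mv slaves.template slaves
[root@server conf]# vi slaves
```

The edited file then looks as follows: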
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# A Spark Worker will be started on each of the machines listed below.
server   # the server (master) entry can be omitted here
agent1
agent2
4. Configure spark-env.sh
Copy spark-env.sh.template to spark-env.sh, then add the JAVA_HOME environment variable, the cluster's master node, and the relevant Hadoop settings:
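The copy step, run in the same conf directory, would be:

```shell
[root@server conf]# cp spark-env.sh.template spark-env.sh
[root@server conf]# vi spark-env.sh
```

with the file edited as shown below: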
# environment
export JAVA_HOME=/usr/local/src/java
export SCALA_HOME=/usr/local/src/scala
export HADOOP_HOME=/usr/local/src/hadoop
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
# master node
export SPARK_MASTER_HOST=server
export SPARK_MASTER_IP=192.168.64.183
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=1G
export SPARK_EXECUTOR_CORES=2
5. Distribute Spark
Copy the Spark directory from the master node to each worker VM:
[root@server sbin]# scp -r /usr/local/src/spark root@agent1:/usr/local/src/
[root@server sbin]# scp -r /usr/local/src/spark root@agent2:/usr/local/src/
After the distribution finishes, configure the environment variables on each VM and apply them.
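If /etc/profile was used for the environment variables as in step 2, it can be distributed the same way (the agent1/agent2 hostnames are assumed from the earlier steps):

```shell
[root@server ~]# scp /etc/profile root@agent1:/etc/profile
[root@server ~]# scp /etc/profile root@agent2:/etc/profile
```

On each worker, run source /etc/profile (or simply log in again) so the variables take effect.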
6. Start the Spark cluster
Start the Hadoop cluster first, then run the sbin/start-all.sh script on the master node; the worker nodes will be started along with it:
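A typical start sequence on the master, with jps used to verify the daemons (the exact process list will vary with your Hadoop setup):

```shell
[root@server ~]# cd /usr/local/src/spark
[root@server spark]# sbin/start-all.sh
[root@server spark]# jps            # should show a Master (and a Worker, since server is listed in slaves)
[root@server spark]# ssh agent1 jps # should show a Worker on each agent
```

Note that Hadoop ships a script with the same name, so invoking it via the explicit sbin/ path from the Spark directory avoids starting the wrong one.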
Open http://server:8080 in a browser to view the Master's resource-monitoring web UI:
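To confirm the cluster actually accepts jobs, the bundled SparkPi example can be submitted to the master; the examples jar name below assumes the spark-2.4.2-bin-hadoop2.6 build (adjust it to match your download):

```shell
[root@server spark]# bin/spark-submit \
    --master spark://server:7077 \
    --class org.apache.spark.examples.SparkPi \
    examples/jars/spark-examples_2.11-2.4.2.jar 100
```

The running and completed application should then also appear on the Master web UI.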



