Table of Contents
1. Extract the Spark archive
2. Configure the Spark system environment
3. Configure the cluster nodes
4. Configure spark-env.sh
5. Distribute Spark
6. Start the Spark cluster
Prerequisites:
A fully distributed Hadoop cluster environment
Scala archive: https://www.scala-lang.org/download/all.html
Spark archive: http://archive.apache.org/dist/spark/
For installing Scala, see my earlier post: [CentOS] Installing Scala
For the basic Spark modes, see my earlier post: [CentOS] Spark Runtime Environments (Local, Standalone)
1. Extract the Spark archive
Upload the local Spark archive to the virtual machine:
Extract it, then rename the directory:
[root@server download]# tar -zxvf spark-2.4.2-bin-hadoop2.6.tgz -C /usr/local/src/
[root@server download]# cd /usr/local/src/
[root@server src]# ll
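The rename itself is not shown in the listing above; assuming the target directory name /usr/local/src/spark used throughout the rest of this guide, it would be:

```shell
[root@server src]# mv spark-2.4.2-bin-hadoop2.6 spark
```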
2. Configure the Spark system environment
Open /etc/profile and append the following lines to configure the Spark environment:
# set spark environment
export SPARK_HOME=/usr/local/src/spark
export PATH=$PATH:$SPARK_HOME/bin
Save and exit, then apply the changes with the source command.
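To sketch what sourcing does, the snippet below appends the same two lines to a scratch file and sources it (a temporary file stands in for /etc/profile so the snippet can be run safely anywhere):

```shell
# Simulate the /etc/profile edit against a temporary file, then source it.
PROFILE=$(mktemp)
cat >> "$PROFILE" <<'EOF'
# set spark environment
export SPARK_HOME=/usr/local/src/spark
export PATH=$PATH:$SPARK_HOME/bin
EOF

# Sourcing runs the exports in the current shell, so the new
# variables are visible immediately, without logging out.
source "$PROFILE"
echo "$SPARK_HOME"    # prints /usr/local/src/spark
```

On the real host the target is /etc/profile itself, and the change applies to new login shells automatically.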
3. Configure the cluster nodes
Go to the conf directory under the extracted Spark path, rename slaves.template to slaves, delete the localhost entry, and add the hostname of each VM (one per line):
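The rename can be done in the conf directory like this (paths assumed from the earlier steps):

```shell
[root@server src]# cd /usr/local/src/spark/conf
[root@server conf]# mv slaves.template slaves
[root@server conf]# vi slaves
```

The edited file then looks as follows: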
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# A Spark Worker will be started on each of the machines listed below.
server   # the server (master) entry can be omitted here
agent1
agent2
4. Configure spark-env.sh
Copy spark-env.sh.template to spark-env.sh, then add the JAVA_HOME environment variable, the cluster's master node, and the relevant Hadoop settings:
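The copy step, run in the same conf directory, would be:

```shell
[root@server conf]# cp spark-env.sh.template spark-env.sh
[root@server conf]# vi spark-env.sh
```

with the file edited as shown below: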
# environment
export JAVA_HOME=/usr/local/src/java
export SCALA_HOME=/usr/local/src/scala
export HADOOP_HOME=/usr/local/src/hadoop
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
# master node
export SPARK_MASTER_HOST=server
export SPARK_MASTER_IP=192.168.64.183
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=1G
export SPARK_EXECUTOR_CORES=2
5. Distribute Spark
Copy the Spark directory from the master node to each worker VM:
[root@server sbin]# scp -r /usr/local/src/spark root@agent1:/usr/local/src/
[root@server sbin]# scp -r /usr/local/src/spark root@agent2:/usr/local/src/
After the distribution finishes, configure the environment variables on each VM and apply them.
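If /etc/profile was used for the environment variables as in step 2, it can be distributed the same way (the agent1/agent2 hostnames are assumed from the earlier steps):

```shell
[root@server ~]# scp /etc/profile root@agent1:/etc/profile
[root@server ~]# scp /etc/profile root@agent2:/etc/profile
```

On each worker, run source /etc/profile (or simply log in again) so the variables take effect.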
6. Start the Spark cluster
Start the Hadoop cluster first, then run the sbin/start-all.sh script on the master node; the worker nodes will be started along with it:
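A typical start sequence on the master, with jps used to verify the daemons (the exact process list will vary with your Hadoop setup):

```shell
[root@server ~]# cd /usr/local/src/spark
[root@server spark]# sbin/start-all.sh
[root@server spark]# jps            # should show a Master (and a Worker, since server is listed in slaves)
[root@server spark]# ssh agent1 jps # should show a Worker on each agent
```

Note that Hadoop ships a script with the same name, so invoking it via the explicit sbin/ path from the Spark directory avoids starting the wrong one.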
Open http://server:8080 in a browser to view the Master's resource-monitoring web UI:
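To confirm the cluster actually accepts jobs, the bundled SparkPi example can be submitted to the master; the examples jar name below assumes the spark-2.4.2-bin-hadoop2.6 build (adjust it to match your download):

```shell
[root@server spark]# bin/spark-submit \
    --master spark://server:7077 \
    --class org.apache.spark.examples.SparkPi \
    examples/jars/spark-examples_2.11-2.4.2.jar 100
```

The running and completed application should then also appear on the Master web UI.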



