1. Install Java
Go to the Java Downloads page and download a suitable Java version (see the macOS Java installation guide). Then run the following in the terminal to verify the installation:
java -version
2. Install Spark
Go to the Spark website and download the version you need. The download page notes:
Downloads are pre-packaged for a handful of popular Hadoop versions. Users can also download a “Hadoop free” binary and run Spark with any Hadoop version by augmenting Spark’s classpath.
Here we download a version pre-packaged with Hadoop. Move the archive to the installation directory and extract it:
sudo tar -zvxf ***.tgz
Open a terminal in the installation directory and check that the installation succeeded.
Configure the environment variables so Spark can be launched from any terminal. Add the following to ~/.zshrc:
export SPARK_HOME=/opt/spark-3.2.1-bin-hadoop3.2
export PATH=$PATH:$SPARK_HOME/bin
Reload the configuration file so the changes take effect:
source ~/.zshrc
Once this succeeds, test as follows.
Spark-shell local test
Create a file words.txt with the following content (the commands below assume it is at /tmp/words.txt):
hello world
hadoop spark
spark hello world
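For reference, the expected word frequencies in words.txt can be cross-checked with plain Python; the Spark tests below should agree with these numbers (a quick sketch using only the standard library):

```python
# Cross-check of the word frequencies in words.txt using the standard library.
from collections import Counter

text = "hello world hadoop spark spark hello world"  # contents of words.txt
counts = Counter(text.split())

# Expected: hadoop 1, hello 2, spark 2, world 2
for word, n in sorted(counts.items()):
    print(word, n)
```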
Start spark-shell, then type at the prompt:
val lines = sc.textFile("/tmp/words.txt")
lines.count()
lines.first()
To exit Spark:
:quit
PySpark local test
Interactive shell test
Type pyspark on the command line to invoke the Python interface.
Then type at the prompt:
lines = sc.textFile("/tmp/words.txt")
lines.count()
lines.first()
Running a Python program
import findspark
findspark.init()  # locate the Spark installation on this machine

from pyspark import SparkContext

sc = SparkContext("local", "count app")
words = sc.parallelize(
["scala",
"java",
"hadoop",
"spark",
"akka",
"spark vs hadoop",
"pyspark",
"pyspark and spark"
])
counts = words.count()
print("Number of elements in RDD -> %i" % counts)
print(words.first())



