Rename src/main/java to src/main/scala
Edit pom.xml
Note that the scala-library version must match the Scala binary version of the Spark artifact: spark-core_2.12 requires Scala 2.12.x, so 2.11.8 would not be compatible; 2.12.10 (the version Spark 3.1.2 is built with) is used below.
pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>sparkTest</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <!-- must match the _2.12 suffix of spark-core below -->
            <version>2.12.10</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.1.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.3.0</version>
        </dependency>
    </dependencies>
</project>
Add the Scala SDK
File->Project Structure
Once the Scala SDK is added here, Scala files can be created under the scala directory.
wordCount
package com.example

import org.apache.spark.{SparkConf, SparkContext}

object wordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("MyFirstSparkApplication")
    conf.setMaster("yarn")
    val sc = new SparkContext(conf)
    // textFile already yields one record per line, so splitting on spaces is enough
    val data = sc.textFile(args(0))
    val words = data.flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .collect()
    words.foreach(println)
    sc.stop()
  }
}
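The split/map/reduce-by-key pipeline above can be checked without a cluster. A minimal plain-Scala sketch of the same logic on an ordinary collection (the object name WordCountLogic and the sample lines are hypothetical; groupBy plus a per-group sum stands in for reduceByKey):

```scala
object WordCountLogic {
  def main(args: Array[String]): Unit = {
    // hypothetical stand-in for the lines textFile() would return
    val lines = Seq("hello spark", "hello hadoop")
    val counts = lines
      .flatMap(_.split(" "))   // same tokenization as the Spark job
      .map((_, 1))             // pair each word with a count of 1
      .groupBy(_._1)           // local equivalent of reduceByKey's grouping
      .map { case (w, ps) => (w, ps.map(_._2).sum) }
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

Running this prints each (word, count) pair, e.g. (hello,2), which is the same output shape the cluster job produces via collect().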
Package the jar with IDEA
File->Project Structure
Run on the cluster
spark-submit --master yarn --name wordCount --class com.example.wordCount hdfs:///sparkTest.jar /user/hadoop/input
Here /user/hadoop/input is an HDFS path; it is passed to the program as args(0).
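The spark-submit command above expects the jar to already exist at hdfs:///sparkTest.jar, so it must be uploaded first. A sketch of the full sequence, assuming the jar is built with Maven instead of IDEA and the default artifact name target/sparkTest-1.0-SNAPSHOT.jar:

```shell
# build the jar with Maven (an alternative to packaging through IDEA)
mvn clean package

# upload it to HDFS under the name the spark-submit command expects
# (-f overwrites any previous copy)
hdfs dfs -put -f target/sparkTest-1.0-SNAPSHOT.jar /sparkTest.jar

# run on YARN; /user/hadoop/input is passed through as args(0)
spark-submit --master yarn --name wordCount \
  --class com.example.wordCount hdfs:///sparkTest.jar /user/hadoop/input
```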



