This error occurs because a transformation or action is nested inside another RDD transformation, which makes the computation fail.
Starting from the line reported in the stack trace, locate the nested transformation or action and pull it out so that it runs on the driver first.
My error was the first of the two cases below; my code is attached further down. The exception:
Caused by: org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be hit if a reference to an RDD not defined by the streaming job is used in DStream operations. For more information, See SPARK-13758.
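Case (1) can be illustrated with a minimal sketch (my own example mirroring the pattern quoted in the exception message; the RDD contents and object name are made up, not from the original program):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Spark5063Demo {
  // Valid version of the pattern from the error message: the values/count
  // action is executed on the driver first, and only the plain Long result
  // is captured by the map closure.
  def scaled(sc: SparkContext): Array[Long] = {
    val rdd1 = sc.parallelize(Seq(1L, 2L, 3L))
    val rdd2 = sc.parallelize(Seq(("a", 10), ("b", 20)))
    // INVALID -- throws SPARK-5063, because count() is an action invoked
    // inside the map closure on the executors, where rdd2 has no SparkContext:
    // rdd1.map(x => rdd2.values.count() * x).collect()
    val cnt = rdd2.values.count() // driver-side action; cnt is a plain Long
    rdd1.map(x => cnt * x).collect()
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Spark5063Demo").setMaster("local[*]"))
    sc.setLogLevel("ERROR")
    println(scaled(sc).mkString(",")) // 2,4,6
    sc.stop()
  }
}
```

The key point is that `cnt` is an ordinary `Long` on the driver, so serializing the `map` closure no longer drags a second RDD onto the executors.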
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object Practice2 {
  def main(args: Array[String]): Unit = {
    val sparkContext = new SparkContext(new SparkConf().setAppName("Test").setMaster("local"))
    val value: RDD[String] = sparkContext.textFile("file//day1012Practice2")
    sparkContext.setLogLevel("ERROR")
    // Note the escaped regex: "\s+|\." is not a valid Scala string literal
    val value1: RDD[(String, Int)] = value.flatMap(_.split("\\s+|\\.")).map((_, 1)).reduceByKey(_ + _)
    value1.cache()
    // Print the number of occurrences of "Spark"
    value1.filter(_._1 == "Spark").map(_._2).foreach(println)
    // Get the maximum count with a driver-side action, then reuse the result
    val max: Int = value1.map(_._2).max()
    value1.filter(_._2 == max).foreach(println)
  }
}
Run result
Source file format: words separated by spaces and periods
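For comparison, the nested form of the word count above that raises SPARK-5063 can be sketched as follows (a self-contained reconstruction using `parallelize` instead of the original input file; the original failing code is not shown in this post):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object NestedMaxDemo {
  // Word count over an in-memory line, returning the word(s) with the
  // highest count, computed the valid way.
  def topWords(sc: SparkContext): Array[(String, Int)] = {
    val counts = sc.parallelize(Seq("Spark Hadoop Spark"))
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .reduceByKey(_ + _)
    // INVALID -- max() is an action executed inside the filter closure on the
    // executors, where counts has no SparkContext, so it throws SPARK-5063:
    // counts.filter(_._2 == counts.map(_._2).max())
    // VALID -- pull the action out and run it on the driver first:
    val max = counts.map(_._2).max()
    counts.filter(_._2 == max).collect()
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("NestedMaxDemo").setMaster("local[*]"))
    sc.setLogLevel("ERROR")
    topWords(sc).foreach(println) // (Spark,2)
    sc.stop()
  }
}
```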