@Dikei's answer is actually correct, but I believe what you are looking for is
sc.getPersistentRDDs:

    scala> val rdd1 = sc.makeRDD(1 to 100)
    # rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at makeRDD at <console>:27

    scala> val rdd2 = sc.makeRDD(10 to 1000)
    # rdd2: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[1] at makeRDD at <console>:27

    scala> rdd2.cache.setName("rdd_2")
    # res0: rdd2.type = rdd_2 ParallelCollectionRDD[1] at makeRDD at <console>:27

    scala> sc.getPersistentRDDs
    # res1: scala.collection.Map[Int,org.apache.spark.rdd.RDD[_]] = Map(1 -> rdd_2 ParallelCollectionRDD[1] at makeRDD at <console>:27)

    scala> rdd1.cache.setName("foo")
    # res2: rdd1.type = foo ParallelCollectionRDD[0] at makeRDD at <console>:27

    scala> sc.getPersistentRDDs
    # res3: scala.collection.Map[Int,org.apache.spark.rdd.RDD[_]] = Map(1 -> rdd_2 ParallelCollectionRDD[1] at makeRDD at <console>:27, 0 -> foo ParallelCollectionRDD[0] at makeRDD at <console>:27)

Now let's add another RDD, rdd3 (created with sc.makeRDD like the others, as its ID 2 in the output shows), and name it:

    scala> rdd3.setName("bar")
    # res4: rdd3.type = bar ParallelCollectionRDD[2] at makeRDD at <console>:27

    scala> sc.getPersistentRDDs
    # res5: scala.collection.Map[Int,org.apache.spark.rdd.RDD[_]] = Map(1 -> rdd_2 ParallelCollectionRDD[1] at makeRDD at <console>:27, 0 -> foo ParallelCollectionRDD[0] at makeRDD at <console>:27)

We can see that rdd3 is not actually persisted: setName only labels the RDD, and since cache was never called on it, it does not appear in sc.getPersistentRDDs.
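
If you want the same check outside the shell, something like the following might work. It is only a minimal sketch: the object name PersistentRddsSketch, the local[*] master, the app name, and the RDD names "cached" and "named_only" are my own assumptions for illustration, and it needs Spark on the classpath. It prints each persisted RDD with its ID, name, and storage level, then asserts that a named-but-uncached RDD is absent from the map.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistentRddsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local context; in spark-shell, `sc` already exists.
    val conf = new SparkConf().setMaster("local[*]").setAppName("persistent-rdds-sketch")
    val sc = new SparkContext(conf)
    try {
      // One RDD that is named and cached, one that is only named.
      val cached = sc.makeRDD(1 to 100).setName("cached").cache()
      val namedOnly = sc.makeRDD(1 to 100).setName("named_only")

      // The map is keyed by RDD id and holds only RDDs marked via cache/persist.
      val persistent = sc.getPersistentRDDs
      persistent.foreach { case (id, rdd) =>
        println(s"id=$id name=${rdd.name} storage=${rdd.getStorageLevel.description}")
      }

      assert(persistent.contains(cached.id))
      assert(!persistent.contains(namedOnly.id))
      assert(namedOnly.getStorageLevel == StorageLevel.NONE)

      // Unpersisting removes the RDD from the map again.
      cached.unpersist()
      assert(!sc.getPersistentRDDs.contains(cached.id))
    } finally {
      sc.stop()
    }
  }
}
```

Note that cache only marks the RDD for persistence; it shows up in getPersistentRDDs right away, even before any action has materialized its blocks.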



