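Value type:
One-to-one input/output partitions: map, flatMap, mapPartitions
Many-to-one input/output partitions: union, cartesian
Many-to-many input/output partitions: groupBy
Output is a subset of the input: filter, distinct, subtract, sample, takeSample (takeSample returns an array to the driver rather than an RDD)
Cache type: cache, persist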
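
To make the value-type categories concrete, here is a minimal sketch that models a few of these operators on ordinary Python lists. It is not the Spark API: the function names (flat_map, map_partitions, cartesian, subtract) are hypothetical stand-ins, and a "partition" is assumed to be a plain inner list.

```python
from itertools import chain, product


def flat_map(items, func):
    """Apply func to each element and flatten the resulting sequences."""
    return list(chain.from_iterable(func(x) for x in items))


def map_partitions(partitions, func):
    """Apply func once per partition; func receives an iterator."""
    return [list(func(iter(part))) for part in partitions]


def cartesian(left, right):
    """Pair every element of left with every element of right."""
    return list(product(left, right))


def subtract(left, right):
    """Keep the elements of left that do not occur in right."""
    excluded = set(right)
    return [x for x in left if x not in excluded]


words = flat_map(["a b", "c"], str.split)
sums = map_partitions([[1, 2], [3]], lambda it: [sum(it)])
pairs = cartesian([1, 2], ["x"])
rest = subtract([1, 2, 3, 2], [2])
```

In this model flat_map and map_partitions keep each output partition tied to one input partition, while cartesian draws on several inputs for each output, which is the distinction the categories above are drawing.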
Key-value type:
One-to-one: mapValues
Aggregation within a single RDD: combineByKey, reduceByKey, partitionBy
Across two RDDs: cogroup
Joins: join, leftOuterJoin, rightOuterJoin
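
The key-value operators are easier to tell apart by what they return. The sketch below models reduceByKey, cogroup, join and leftOuterJoin on plain Python lists of (key, value) pairs. It only illustrates the semantics and is not Spark code: the snake_case function names and the sample data are assumptions, and a missing right-hand value is represented as None.

```python
from collections import defaultdict
from functools import reduce


def reduce_by_key(pairs, func):
    """Merge all values that share a key using func."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


def cogroup(left, right):
    """For every key in either input, collect (left values, right values)."""
    result = defaultdict(lambda: ([], []))
    for key, value in left:
        result[key][0].append(value)
    for key, value in right:
        result[key][1].append(value)
    return dict(result)


def join(left, right):
    """Inner join: one output pair per matching left/right value."""
    return [(key, (lv, rv))
            for key, (lvs, rvs) in cogroup(left, right).items()
            for lv in lvs for rv in rvs]


def left_outer_join(left, right):
    """Keep every left pair; the right side is None when no key matches."""
    return [(key, (lv, rv))
            for key, (lvs, rvs) in cogroup(left, right).items()
            for lv in lvs for rv in (rvs or [None])]


sales = [("a", 1), ("b", 2), ("a", 3)]       # hypothetical sample data
names = [("a", "apple"), ("c", "cherry")]

totals = reduce_by_key(sales, lambda x, y: x + y)
joined = sorted(join(sales, names))
left_joined = sorted(left_outer_join(sales, names))
```

Writing join and left_outer_join on top of cogroup is a choice made for this sketch: it shows that a join is a per-key pairing of the two value lists that cogroup already produces.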
No output (nothing written to HDFS or to the local side): foreach
HDFS: saveAsTextFile, saveAsObjectFile
Scala collections and data types: collect, collectAsMap, reduceByKeyLocally, lookup, count, top, reduce, fold, aggregate
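
Of the operators that return a plain value, fold and aggregate are the least obvious, because the zero value is applied once per partition and once more when the partition results are merged. The sketch below models that evaluation order on a list of lists. It is a plain-Python illustration and not the Spark API: the function names and the two-partition layout are assumptions.

```python
from functools import reduce


def fold(partitions, zero, op):
    """Fold each partition from zero, then fold the partial results from zero."""
    per_partition = [reduce(op, part, zero) for part in partitions]
    return reduce(op, per_partition, zero)


def aggregate(partitions, zero, seq_op, comb_op):
    """seq_op folds elements into an accumulator; comb_op merges accumulators."""
    per_partition = [reduce(seq_op, part, zero) for part in partitions]
    return reduce(comb_op, per_partition, zero)


partitions = [[1, 2], [3, 4, 5]]  # assumed layout: two partitions

total = fold(partitions, 0, lambda a, b: a + b)

# Compute (sum, count) in one pass; the accumulator type differs from the element type.
sum_count = aggregate(
    partitions,
    (0, 0),
    lambda acc, x: (acc[0] + x, acc[1] + 1),
    lambda a, b: (a[0] + b[0], a[1] + b[1]),
)

# A zero value that is not neutral is counted once per partition plus once at the merge.
skewed = fold(partitions, 1, lambda a, b: a + b)
```

The last line shows why the zero value should be the identity of the operation: with two partitions, a zero of 1 is added three times.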



