You need to cast the low column to a date before you can use datediff() together with lit(). With Spark 2.2:
from pyspark.sql.functions import datediff, to_date, lit

df.withColumn("test", datediff(to_date(lit("2017-05-02")), to_date("low", "yyyy/MM/dd"))).show()

+----------+----+------+-----+
|       low|high|normal| test|
+----------+----+------+-----+
|1986/10/15|   z|  null|11157|
|1986/10/15|   z|  null|11157|
|1986/10/15|   c|  null|11157|
|1986/10/15|null|  null|11157|
|1986/10/16|null|   4.0|11156|
+----------+----+------+-----+

With Spark < 2.2, we first need to convert the low column to a timestamp:
from pyspark.sql.functions import datediff, to_date, lit, unix_timestamp

df.withColumn("test", datediff(to_date(lit("2017-05-02")), to_date(unix_timestamp("low", "yyyy/MM/dd").cast("timestamp")))).show()
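
As a sanity check on the values in the test column, the same arithmetic can be sketched in plain Python with no Spark involved. This is only an illustration: days_between is a hypothetical helper, not part of PySpark, and the format string %Y/%m/%d is assumed to be the Python counterpart of the yyyy/MM/dd pattern used above. The argument order follows datediff(end, start).

```python
from datetime import date, datetime


def days_between(end, start, fmt="%Y/%m/%d"):
    """Return the number of days from `start` (a string) to `end` (a date).

    Hypothetical helper that mirrors the argument order of datediff(end, start).
    """
    # Parse the string column value into a date, as to_date does in Spark.
    start_date = datetime.strptime(start, fmt).date()
    # Subtracting two dates gives a timedelta; .days is the whole-day difference.
    return (end - start_date).days


reference = date(2017, 5, 2)  # same literal date as lit("2017-05-02")
for low in ("1986/10/15", "1986/10/16"):
    print(low, days_between(reference, low))
```

The two values printed should agree with the 11157 and 11156 shown in the table for the same low values.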


