Running a PySpark program on a Spark standalone cluster fails with this error:
Exception: Python in worker has different version 2.6 than that in driver 2.7, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
Cause: the master node runs Python 2.7 from an Anaconda installation, while the two worker nodes run the default Python 2.6 that ships with the Linux distribution.
Check the Python version on the master node:
[atguigu@hadoop101 ~]$ python --version
Python 2.7.12 :: Anaconda 4.2.0 (64-bit)
Check the Python version on a worker node:
[atguigu@hadoop103 software]$ python --version
Python 2.6.6
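To confirm the mismatch across the whole cluster in one pass, a small ssh loop can be run from the master (a minimal sketch; the worker hostnames hadoop102 and hadoop103 are assumed here, adjust them to match your cluster):

# Print the default Python version on each node
for host in hadoop101 hadoop102 hadoop103; do
    echo "== $host =="
    ssh $host 'python --version 2>&1'   # Python 2 prints its version to stderr
done

Every node should report the same version before the job is resubmitted.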
Solution: install the same version of Anaconda on every worker node, and set the environment variables so that Anaconda's Python is used by default. A sketch of both options is shown below.
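A sketch of the environment setup, assuming Anaconda is installed under /opt/module/anaconda2 on every node (the path is hypothetical, replace it with your actual install location):

# Option 1: put Anaconda's python first in PATH, in ~/.bashrc on every node
export ANACONDA_HOME=/opt/module/anaconda2
export PATH=$ANACONDA_HOME/bin:$PATH

# Option 2: point Spark at the interpreters explicitly,
# in $SPARK_HOME/conf/spark-env.sh on every node
export PYSPARK_PYTHON=/opt/module/anaconda2/bin/python
export PYSPARK_DRIVER_PYTHON=/opt/module/anaconda2/bin/python

After sourcing ~/.bashrc (or restarting the Spark workers so they pick up spark-env.sh), python --version should report 2.7 on every node, and the PySpark job can be resubmitted.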