pyhive and pyspark configuration

pyhive
Check HiveServer2
  • Start HiveServer2
$HIVE_HOME/bin/hiveserver2
  • Test the client connection
$HIVE_HOME/bin/beeline
!connect jdbc:hive2://localhost:10000
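
Before opening beeline, it can help to confirm that something is listening on the HiveServer2 port at all. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is made up for illustration, and the host and port are taken from the JDBC URL above. A successful TCP connect only shows that the port is open, not that HiveServer2 will accept the session.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # localhost:10000 is the address used in the beeline JDBC URL.
    print("HiveServer2 port reachable:", is_port_open("localhost", 10000))
```

If this prints False right after starting HiveServer2, waiting a little and retrying is reasonable, since the server can take some time to bind the port.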

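If beeline reports the error "User: xxx is not allowed to impersonate anonymous", configure Hadoop as follows. The user in each property name must be the user named in the error message (root and spark in this example).

  • Hadoop configuration (core-site.xml)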
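<configuration>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.groups</name>
    <value>*</value>
  </property>
</configuration>
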
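The proxy-user properties follow a fixed naming pattern, `hadoop.proxyuser.<user>.hosts` and `hadoop.proxyuser.<user>.groups`, so they can be generated for any user. The sketch below is one way to do that with the standard library. The function names `proxyuser_properties` and `render_configuration` are hypothetical, and the output still has to be merged into the existing core-site.xml by hand.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(user, hosts="*", groups="*"):
    """Build the two <property> elements that let `user` impersonate others."""
    props = []
    for suffix, value in (("hosts", hosts), ("groups", groups)):
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = f"hadoop.proxyuser.{user}.{suffix}"
        ET.SubElement(prop, "value").text = value
        props.append(prop)
    return props


def render_configuration(users):
    """Return a <configuration> XML string covering every user in `users`."""
    root = ET.Element("configuration")
    for user in users:
        root.extend(proxyuser_properties(user))
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # root and spark are the two users configured in this article.
    print(render_configuration(["root", "spark"]))
```

The wildcard `*` allows impersonation from any host and for any group. On a shared cluster, narrower values passed through the `hosts` and `groups` arguments are usually the safer choice.
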
Install pyhive
  • Install dependencies
pip install sasl
pip install thrift
pip install thrift-sasl
pip install pyhive
  • Installing sasl may fail; on Debian/Ubuntu, install the SASL development headers first
sudo apt-get install libsasl2-dev
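
After the pip commands finish, a quick check that every dependency is importable can save a confusing error later. This is a small sketch, not part of pyhive itself. `missing_modules` is a hypothetical helper, and the module names are the import names of the four packages above (the thrift-sasl package is imported as `thrift_sasl`).

```python
import importlib.util

# Import names for the four pip packages installed above.
REQUIRED_MODULES = ["sasl", "thrift", "thrift_sasl", "pyhive"]


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All pyhive dependencies are importable.")
```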
Connect with pyhive
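from pyhive import hive

# auth must match hive.server2.authentication on the server;
# pyhive only accepts a password with LDAP or CUSTOM authentication.
conn = hive.Connection(host='127.0.0.1',
                       port=10000,
                       auth="CUSTOM",
                       username='root',
                       password='hive')
cursor = conn.cursor()
# Query the first 10 rows of table t and print each row as a tuple.
cursor.execute('select * from t limit 10')
for result in cursor.fetchall():
    print(result)
# Release the cursor and the connection when done.
cursor.close()
conn.close()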
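
If the query raises an exception, the explicit `close()` calls above are skipped. One possible variation, sketched below, wraps the cursor and connection in `contextlib.closing` so they are always released. `fetch_rows`, `connect_hive`, and `print_first_rows` are hypothetical helper names, and the connection settings are simply copied from the example above. pyhive declares the "pyformat" parameter style, so values can be passed separately from the SQL text.

```python
from contextlib import closing


def fetch_rows(conn, sql, params=None):
    """Run `sql` on a DB-API connection and return all rows.

    The cursor is closed even if the query fails.
    """
    with closing(conn.cursor()) as cursor:
        if params is None:
            cursor.execute(sql)
        else:
            cursor.execute(sql, params)
        return cursor.fetchall()


def connect_hive():
    """Open a connection with the same settings as the example above."""
    from pyhive import hive  # imported lazily so fetch_rows works without pyhive

    return hive.Connection(host="127.0.0.1", port=10000, auth="CUSTOM",
                           username="root", password="hive")


def print_first_rows(limit=10):
    """Print up to `limit` rows of table t, closing the connection afterwards."""
    with closing(connect_hive()) as conn:
        # The limit is passed as a parameter instead of being pasted into the SQL.
        for row in fetch_rows(conn, "select * from t limit %s", (limit,)):
            print(row)
```

Calling `print_first_rows()` against a running HiveServer2 should behave like the original example, with cleanup handled automatically.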
pyspark
  • Configuring the Python environment through environment variables is enough
#Java Environment
export JAVA_HOME=/opt/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
#Hadoop Environment
export HADOOP_HOME=/opt/hadoop
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
#Hive Environment
export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin
#scala environment
export SCALA_HOME=/opt/scala
export PATH=${SCALA_HOME}/bin:$PATH
#Spark environment
export SPARK_HOME=/opt/spark
export PATH=${SPARK_HOME}/bin:$PATH
#Configuration for running pyspark directly in Jupyter
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
export PYSPARK_PYTHON=/home/spark/envs/py3/bin/python3
export PYSPARK_DRIVER_PYTHON=/home/spark/envs/py3/bin/python3
#sbt environment
export PATH=/opt/sbt/:$PATH
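
The PYTHONPATH line above hard-codes py4j-0.10.9, but the py4j version bundled with Spark differs between releases. As an alternative for notebooks, the sketch below locates the zip with a glob and adds both entries to `sys.path` at runtime. The function names are hypothetical, and it assumes the usual `$SPARK_HOME/python/lib` layout shown in the export.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the directory and py4j zip that make `import pyspark` work."""
    python_dir = os.path.join(spark_home, "python")
    # The py4j version differs between Spark releases, so match it with a glob.
    pattern = os.path.join(python_dir, "lib", "py4j-*-src.zip")
    py4j_zips = sorted(glob.glob(pattern))
    if not py4j_zips:
        raise FileNotFoundError(f"no file matching {pattern}")
    return [python_dir, py4j_zips[-1]]


def add_spark_to_sys_path(spark_home=None):
    """Prepend the Spark Python paths to sys.path; defaults to $SPARK_HOME."""
    spark_home = spark_home or os.environ["SPARK_HOME"]
    paths = spark_python_paths(spark_home)
    # Insert in reverse so the final order in sys.path matches `paths`.
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)
    return paths
```

In a notebook, calling `add_spark_to_sys_path()` before `import pyspark` has the same effect as the PYTHONPATH export, without tying the setup to one py4j version.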