
Reading and Writing HBase with Spark

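Startup

Start the services in this order: ZooKeeper, then Hadoop, then HBase.

In the HBase shell, create the table student with a single column family, info

create 'student', 'info'

Add data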

put 'student' ,'1' ,'info:name','James'

put 'student' ,'1' ,'info:age','23'

put 'student' ,'1' ,'info:gender','F'

put 'student' ,'2' ,'info:name','Smith'

put 'student' ,'2' ,'info:age','24'

put 'student' ,'2' ,'info:gender','M'
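
When there are more than a handful of rows, typing each put by hand gets tedious. The following is a minimal sketch that generates the same commands from a Python dictionary; the helper name build_put_commands and the students dictionary are hypothetical, introduced only for illustration, and the output is meant to be pasted into the HBase shell.

```python
# Hypothetical helper: build HBase shell `put` commands from Python records.
# It does not escape quotes, so it assumes values contain no single quotes.
def build_put_commands(table, family, rows):
    """Return one `put` command per (row key, column qualifier) pair."""
    commands = []
    for rowkey, columns in rows.items():
        for qualifier, value in columns.items():
            commands.append(
                "put '{}','{}','{}:{}','{}'".format(
                    table, rowkey, family, qualifier, value
                )
            )
    return commands


# The same two records that are inserted by hand above
students = {
    "1": {"name": "James", "age": "23", "gender": "F"},
    "2": {"name": "Smith", "age": "24", "gender": "M"},
}

for command in build_put_commands("student", "info", students):
    print(command)
```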

Look up a single record by its row key

get 'student','1'

To read the HBase data from Spark, create the file SparkOperateHbase.py in the mycode directory and add the following code

#!/usr/bin/env python3

from pyspark import SparkConf, SparkContext

# Run Spark locally under the application name "ReadHbase"
spark_conf = SparkConf().setMaster('local').setAppName("ReadHbase")
sc = SparkContext(conf=spark_conf)

# HBase connection settings: the ZooKeeper quorum host and the table to scan
host = 'localhost'
table = 'student'
hbase_conf = {"hbase.zookeeper.quorum": host, "hbase.mapreduce.inputtable": table}

# Converters from the spark-examples jar that turn HBase types into strings
keyConv = "org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter"
valueConv = "org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter"

hbase_rdd = sc.newAPIHadoopRDD(
    "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
    "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
    "org.apache.hadoop.hbase.client.Result",
    keyConverter=keyConv,
    valueConverter=valueConv,
    conf=hbase_conf)

# Cache before the first action so that collect() reuses the rows read by count()
hbase_rdd.cache()
count = hbase_rdd.count()
output = hbase_rdd.collect()
for (k, v) in output:
    print(k, v)
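
Each value printed by the script is a single string. The sketch below shows one way to turn such a string into a Python dictionary. It assumes that the value converter from the spark-examples jar emits one JSON object per cell, separated by newlines, with the keys columnFamily, qualifier and value; check this against the actual output of your run. The function name parse_hbase_value and the sample string are hypothetical and were not captured from a real run.

```python
import json


def parse_hbase_value(value):
    """Turn one converter output string into {"family:qualifier": value}.

    Assumes one JSON object per line, each describing a single HBase cell.
    """
    cells = {}
    for line in value.split("\n"):
        if not line.strip():
            continue  # skip empty lines
        cell = json.loads(line)
        column = "{}:{}".format(cell["columnFamily"], cell["qualifier"])
        cells[column] = cell["value"]
    return cells


# Hand-written sample in the assumed format (not real output)
sample = "\n".join([
    json.dumps({"row": "1", "columnFamily": "info",
                "qualifier": "age", "value": "23"}),
    json.dumps({"row": "1", "columnFamily": "info",
                "qualifier": "name", "value": "James"}),
])

print(parse_hbase_value(sample))
```

If the format matches, the function could be applied to every row with hbase_rdd.mapValues(parse_hbase_value) before calling collect().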

Run the program

./spark-submit /usr/local/software/spark/mycode/SparkOperateHbase.py

Copy the HBase jars that Spark needs for reading HBase into Spark's jars directory

cp /usr/local/software/hbase/hbase-2.4.9/lib/hbase*.jar   /usr/local/software/spark/spark-3.0.3-bin-hadoop2.7/jars

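Run it again

This time the jar that converts HBase data into a form Python can read is missing

Create an hbase directory under Spark's jars directory

mkdir hbase

Download the converter jar into Spark's jars/hbase directory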

wget https://repo.typesafe.com/typesafe/maven-releases/org/apache/spark/spark-examples_2.11/1.6.0-typesafe-001/spark-examples_2.11-1.6.0-typesafe-001.jar

Configure spark-env.sh

export SPARK_DIST_CLASSPATH=$(/usr/local/software/hadoop/hadoop-3.3.0/bin/hadoop classpath):$(/usr/local/software/hbase/hbase-2.4.9/bin/hbase classpath):/usr/local/software/spark/spark-3.0.3-bin-hadoop2.7/jars/hbase/*
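
The last classpath entry above only helps if the converter jar really sits in jars/hbase. The following is a small sketch for checking that before rerunning the job; the helper name find_jars and the spark-examples*.jar pattern are assumptions made for illustration, and the directory is the one used in this walkthrough.

```python
import glob
import os


def find_jars(directory, pattern="*.jar"):
    """Return the sorted base names of files in `directory` matching `pattern`."""
    matches = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(path) for path in matches)


# Directory taken from the walkthrough; adjust it to your installation
jars_dir = "/usr/local/software/spark/spark-3.0.3-bin-hadoop2.7/jars/hbase"
found = find_jars(jars_dir, "spark-examples*.jar")
if found:
    print("converter jar(s) found:", found)
else:
    print("no converter jar found in", jars_dir)
```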

Run it again

Start HBase

Run it again

The read succeeds!
