1: Start the Hive metastore
nohup hive --service metastore &
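Step 1 assumes the metastore service address is configured in hive-site.xml. A minimal sketch of that entry (the host is an assumption; 9083 is the metastore's default port, matching the thrift://...:9083 URI used in the Java code below):

```xml
<!-- hive-site.xml: address clients use to reach the metastore service.
     9083 is the default metastore port. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://127.0.0.1:9083</value>
</property>
```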
2: Start the Thrift server in Spark
Note: testing showed that if you start hiveserver2 from Hive first and then start the Thrift server, they conflict; presumably Spark's Thrift server already embeds hiveserver2, so there is no need to start hiveserver2 from Hive. The main purpose is to allow convenient remote access from a client such as DataGrip, as shown below:
./start-thriftserver.sh
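Once the Thrift server is up, any HiveServer2-compatible JDBC client (DataGrip included) can query it. As a sketch of the same remote access in code, assuming the default port 10000, no authentication, and the hive-jdbc dependency declared in the pom below (class name and helper are illustrative, not from the original):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThriftServerJdbcTest {
    // Builds a HiveServer2-style JDBC URL, e.g. jdbc:hive2://127.0.0.1:10000/default
    static String url(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) throws Exception {
        // Assumes the Thrift server from step 2 is listening on the default
        // port 10000 with no authentication; adjust host/port/credentials
        // to match your environment.
        try (Connection conn = DriverManager.getConnection(
                     url("127.0.0.1", 10000, "default"), "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show databases")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```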
3: Start the Spark master and worker, mainly used to run the packaged jar
./start-master.sh -h <local IP address>
./start-slave.sh spark://<local IP address>:7077
Check that both started successfully with jps -m, as shown below.
4: Write the Java test code
public class SparkSqlToHive {
    public static void main(String[] args) {
        SparkSession session = SparkSession.builder().appName("SparkSessionApp")
                .config("hive.metastore.uris", "thrift://127.0.0.1:9083")
                // connect to Hive directly
                .enableHiveSupport()
                .getOrCreate();
        session.sql("show databases").show();
    }
}
Note: if building a jar from your IDE ahead of time is too cumbersome, you can connect to the cluster remotely and debug directly. The code is as follows:
public class SparkSqlToHive {
    public static void main(String[] args) {
        SparkSession session = SparkSession.builder().appName("SparkSessionApp")
                .master("spark://172.10.70.196:7077")
                .config("hive.metastore.uris", "thrift://172.10.70.196:9083")
                // connect to Hive directly
                .enableHiveSupport()
                .getOrCreate();
        session.sql("show databases").show();
    }
}
5: Packaging, sample pom file
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.example</groupId>
    <artifactId>SparkTest</artifactId>
    <version>1.0-SNAPSHOT</version>
    <name>SparkTest</name>
    <url>http://www.example.com</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.7</maven.compiler.source>
        <maven.compiler.target>1.7</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.4.8</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.4.8</version>
        </dependency>
        <dependency>
            <groupId>org.spark-project.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.2.1.spark2</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5.1</version>
                <executions>
                    <execution>
                        <phase>compile</phase>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
6: Upload the packaged jar to the server and run it
spark-submit --class org.example.SparkSqlToHive --master spark://<IP address given to the master in step 3>:7077 /usr/local/src/test/SparkTest-1.0-SNAPSHOT.jar
7: View the execution results