Create the related user and group
groupadd sqoop
useradd -g sqoop -G hadoop sqoop
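To verify the account and its group memberships (a quick check, not in the original notes):
id sqoop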
Unpack the tarball and create a symlink
sqoop -> sqoop-1.4.7.bin__hadoop-2.6.0
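A minimal sketch of that step, assuming the tarball sits in /app (the install prefix used throughout these notes; the tarball filename is inferred from the directory name above):
cd /app
tar -zxf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
ln -s sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
chown -R sqoop:sqoop sqoop-1.4.7.bin__hadoop-2.6.0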
Configure environment variables (e.g. in /etc/profile)
export SQOOP_HOME=/app/sqoop
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$HBASE_HOME/bin:$GEOMESA_HOME/bin:$ZOOKEEPER_HOME/bin:$SQOOP_HOME/bin:$RANGER_HOME:$PATH
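After editing, reload the profile and sanity-check that the binary resolves (a verification step, assuming the variables above are defined):
source /etc/profile
sqoop version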
Java security configuration:
vim $JAVA_HOME/jre/lib/security/java.policy
Add the following inside the grant {} block:
permission javax.management.MBeanTrustPermission "register";
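For context, the edited block in java.policy ends up looking roughly like this (the existing permissions vary by JDK; only the last line is the addition):
grant {
    ...
    permission javax.management.MBeanTrustPermission "register";
};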
Add JDBC driver jars to Sqoop's lib directory:
mysql-connector-java-5.1.47.jar
postgresql-9.4.1212.jar
derby.jar
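A sketch of putting them in place, assuming the three jars have already been downloaded into the current directory:
cp mysql-connector-java-5.1.47.jar postgresql-9.4.1212.jar derby.jar $SQOOP_HOME/lib/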
Copy the Hadoop jars:
common: hadoop-common-2.7.3.jar
hdfs: hadoop-hdfs-2.7.3.jar
mapreduce: hadoop-mapreduce-client-common-2.7.3.jar
Replace the bundled HDFS dependencies (for compatibility), running from Sqoop's lib directory:
cp /app/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar .
cp /app/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.3.jar .
cp /app/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.3.jar .
Jackson dependency jars (Hive compatibility):
cp /app/hive/lib/jackson-databind-2.6.5.jar .
cp /app/hive/lib/jackson-core-2.6.5.jar .
cp /app/hive/lib/jackson-annotations-2.6.0.jar .
Configuration:
1. In $SQOOP_HOME/conf, cp sqoop-env-template.sh sqoop-env.sh and set:
export HADOOP_COMMON_HOME=/app/hadoop
export HADOOP_MAPRED_HOME=/app/hadoop
export HBASE_HOME=/app/hbase
export HIVE_HOME=/app/hive
export ZOOCFGDIR=/app/zookeeper
2. sqoop-site.xml: cp sqoop-site-template.xml sqoop-site.xml and adjust as needed.
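The notes don't record which sqoop-site.xml properties were changed; as one illustrative example, the template ships with a commented-out option that lets the metastore client remember saved-job passwords:
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
</property>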
HDFS directory and permission setup:
hadoop fs -chmod -R 777 /user/hive
su - hdfs
hadoop fs -mkdir -p /tmp/logs/sqoop
hadoop fs -chmod -R 777 /tmp/logs/sqoop
hadoop fs -mkdir -p /user/sqoop
hadoop fs -chown -R sqoop:sqoop /user/sqoop
hadoop fs -chmod -R 777 /user/hive
Test connectivity to the relational database:
sqoop list-databases --connect jdbc:mysql://10.66.24.57:33071 --username hive --password 'hive*#123'
Sync to Hive (incremental)
create database if not exists data;
CREATE TABLE data.pub_trade(id INT, cluster_name STRING, app STRING, create_time TIMESTAMP) STORED AS TEXTFILE;
sqoop import --hive-import \
  --connect "jdbc:mysql://10.69.13.148:33071/hdbs" \
  --username "xxx" --password "xxx" \
  --verbose -m 1 \
  --table "meta_hbase" \
  --hive-table "data.pub_trade" \
  --columns "id,cluster_name,app,create_time" \
  --check-column 'id' --incremental 'append' --last-value '0' \
  --null-string '\N' --null-non-string '\N' \
  --map-column-hive "id=INT,cluster_name=STRING,app=STRING,create_time=TIMESTAMP"
PostgreSQL to Hive
sqoop-import -Dorg.apache.sqoop.splitter.allow_text_splitter=true --hive-import \
  --create-hive-table --hive-database pg --hive-table 't_payment_order_settle_info' \
  --table 't_payment_order_settle_info' \
  --connect "jdbc:postgresql://10.69.17.236:3725/alphapay_order_settle" \
  --username xxx --password xxxx \
  --target-dir "/tmp/`date +%s`"
MySQL to Hive
sqoop-import --hive-import --create-hive-table \
  --hive-database default --hive-table 'pub_trade' \
  --table 'meta_hbase' \
  --connect "jdbc:mysql://xxx-mysql-master.xxx.com.cn:33071/hdbs?useSSL=false" \
  --username dbauser --password xxxx \
  --target-dir "/tmp/`date +%s`"
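To spot-check a finished import from the shell (illustrative; database/table names taken from the command above):
hive -e "SELECT * FROM default.pub_trade LIMIT 5;"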
Error when importing from MySQL to Hive with Sqoop:
ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")
Fix: vim $JAVA_HOME/jre/lib/security/java.policy
Add the following inside the grant {} block (the same change as in the Java security configuration step above):
permission javax.management.MBeanTrustPermission "register";
Fix: copy hive-site.xml into ${SQOOP_HOME}/conf.
--------------------------------------------------------------------------------
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'hive1.deleteme1529656403521' doesn't exist in engine
The fix suggested online is:
1) Import ${HIVE_HOME}/scripts/metastore/upgrade/mysql/hive-schema-0.9.0.mysql.sql into the live metastore database to create any tables that are still missing (for example, we had never used Hive's authorization features, so DataNucleus had not auto-created the DB_PRIVS table);
2) In hive-site.xml, set datanucleus.fixedDatastore=true and datanucleus.autoCreateSchema=false (see the snippet after this list).
The ${HIVE_HOME}/scripts/metastore/upgrade/mysql/ directory holds metastore schema-creation SQL scripts for each version; pick the one matching your Hive version.
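A sketch of the hive-site.xml change from item 2) (the property is spelled datanucleus.fixedDatastore in Hive/DataNucleus):
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>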
ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
Fix:
1) Append to /etc/profile: export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
Then reload the configuration: source /etc/profile
2) Copy the hive-exec* jar from Hive's lib into Sqoop's lib, e.g. cp $HIVE_HOME/lib/hive-exec-*.jar $SQOOP_HOME/lib/



