A data product I am responsible for recently started reporting the error Invalid query handle: xxxx. The full error message is as follows:
ERROR c.a.druid.pool.DruidPooledStatement - clearResultSet error
org.apache.hive.service.cli.HiveSQLException: Invalid query handle: d84d9133d8a6ce9c:9a77cd100000000
    at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:266)
    at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:252)
    at org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:210)
    at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:221)
    at org.apache.hive.jdbc.HiveQueryResultSet.close(HiveQueryResultSet.java:308)
    at com.alibaba.druid.pool.DruidPooledResultSet.close(DruidPooledResultSet.java:86)
    at com.alibaba.druid.pool.DruidPooledStatement.clearResultSet(DruidPooledStatement.java:206)
    at com.alibaba.druid.pool.DruidPooledStatement.close(DruidPooledStatement.java:514)
The code snippet where the error occurs:
finally {
    if (ps != null) {
        ps.close();
    }
    if (con != null) {
        con.close();
    }
}
The exception is thrown at Statement.close().
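One defensive fix, sketched below under the assumption that the server-side query handle may already be gone when cleanup runs: swallow exceptions thrown by close() so that a failing ps.close() (e.g. "Invalid query handle") cannot prevent con.close() from running and leak the pooled connection. The helper name closeQuietly is hypothetical, not from the original code.

```java
public class SafeClose {
    // Close a resource and swallow any exception, so one failing close()
    // (e.g. HiveSQLException: Invalid query handle, thrown after the server
    // has already discarded the query) does not abort the rest of the cleanup.
    static void closeQuietly(AutoCloseable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (Exception e) {
            // Log and continue: the server-side handle is already gone,
            // so there is nothing useful to do with this exception here.
        }
    }
}
```

In the finally block above this would become closeQuietly(ps); closeQuietly(con);. On Java 7+ a try-with-resources statement achieves the same ordering guarantees automatically.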
The task timed out and was killed. Big-data platform resources are limited, so a user's SQL query cannot be allowed to run indefinitely; the platform enforces specific task-kill rules.
The error message is as follows:
java.sql.SQLException: Sender timed out waiting for receiver fragment instance: 394c696029ddcce6:a51b7cab000007cc, dest node: 66
    at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:381)
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:260)
    at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:473)
    at com.alibaba.druid.pool.DruidPooledStatement.executeQuery(DruidPooledStatement.java:308)
Retries sometimes succeed and sometimes fail.
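Since the failure is transient, a bounded retry can paper over it while the root cause is investigated. A minimal sketch, assuming the query can be safely re-executed; the helper name withRetry and the attempt count and backoff values are illustrative, not from the original code:

```java
import java.util.concurrent.Callable;

public class Retry {
    // Run the query up to maxAttempts times, sleeping briefly between
    // attempts; wrap and rethrow the last failure if all attempts fail.
    static <T> T withRetry(Callable<T> query, int maxAttempts) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return query.call();
            } catch (Exception e) {
                last = e; // e.g. SQLException: Sender timed out waiting for receiver ...
                try {
                    Thread.sleep(100L * attempt); // simple linear backoff
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new RuntimeException("query failed after " + maxAttempts + " attempts", last);
    }
}
```

Retries only make sense for read-only queries; a killed-then-retried write could duplicate work.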
Logging in to a Hadoop cluster node and inspecting the task scheduling and execution logs:
I found that this SQL was doing a full scan of a table with 12,000+ partitions, reading 20.1 TB of data.
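The fix for a scan like this is to add a partition predicate so only the needed partitions are read. A minimal sketch of the idea; the partition column name dt and the helper are hypothetical, and real code should use parameterized queries rather than string concatenation:

```java
public class PartitionFilter {
    // Append a partition predicate so the scan touches only the requested
    // day instead of all 12,000+ partitions. Assumes the table is
    // partitioned by a string column named "dt" (hypothetical).
    static String restrictToPartition(String baseSql, String dt) {
        String glue = baseSql.toLowerCase().contains(" where ") ? " AND " : " WHERE ";
        return baseSql + glue + "dt = '" + dt + "'";
    }
}
```

Impala prunes partitions only when the predicate references the partition columns directly, so wrapping dt in a function call can silently fall back to a full scan.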
The task execution failed; the error message is as follows:
java.sql.SQLException: Disk I/O error: Failed to open HDFS file hdfs://ppdhdpha/user/hive/warehouse/test.db/chengzhangquanyi_huolizhiguoqi_chuda/2c43254ab60211d3-cf0e47d200000235_298950249_data.0.
Error(2): No such file or directory
Root cause: RemoteException: File does not exist: /user/hive/warehouse/test.db/chengzhangquanyi_huolizhiguoqi_chuda/2c43254ab60211d3-cf0e47d200000235_298950249_data.0.
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:85)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:75)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:152)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:735)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:415)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
Cause of the error: the table's underlying data file no longer exists, but the cached metadata still references it. Things to try: REFRESH or INVALIDATE METADATA.
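In Impala, REFRESH reloads the file and block metadata for one table (cheap, for data files added or removed on HDFS), while INVALIDATE METADATA discards and re-fetches all metadata for the table (heavier, for tables dropped or recreated). A small sketch that builds these statements; the helper names are illustrative:

```java
public class MetadataRefresh {
    // Build the Impala statement that reloads file/block metadata for one
    // table after its data files changed on HDFS.
    static String refreshSql(String db, String table) {
        return "REFRESH " + db + "." + table;
    }

    // Build the heavier statement that discards and re-fetches all cached
    // metadata for the table (use after DROP/CREATE or schema changes).
    static String invalidateSql(String db, String table) {
        return "INVALIDATE METADATA " + db + "." + table;
    }
}
```

Executed over the same JDBC connection as the failing query, e.g. stmt.execute(MetadataRefresh.refreshSql("test", "chengzhangquanyi_huolizhiguoqi_chuda")), before re-running it.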
SQLException: Error(s) deleting partition directories

java.sql.SQLException: Error(s) deleting partition directories. First error (of 37) was: Hdfs op (DELETE hdfs://ppdhdpha/user/hive/warehouse/cszc.db/zmj_sop_m0_base_snp/1d460f3a4d87ea14-a4c4521100000091_870332460_data.0.) failed, error was: hdfs://ppdhdpha/user/hive/warehouse/cszc.db/zmj_sop_m0_base_snp/1d460f3a4d87ea14-a4c4521100000091_870332460_data.0. Error(5): Input/output error

SQLException: Cancelled

The error message is as follows:

java.sql.SQLException: Cancelled
    at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:381)
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:260)
    at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:473)
    at com.alibaba.druid.pool.DruidPooledStatement.executeQuery(DruidPooledStatement.java:308)

Cause: the query was cancelled from Impala's debug web interface (per the reference I found).
For "Sender timed out waiting for receiver fragment instance: , dest node: 66" and "Invalid query handle", the material I found online did not seem very applicable to this case.



