2022-03-01 10:53:14,868 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
	at org.apache.hadoop.hive.ql.io.orc.OutStream.write(OutStream.java:140)
	at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833)
	at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843)
	at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:80)
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:724)
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1609)
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1991)
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2283)
	at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.abortWriters(FileSinkOperator.java:252)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1026)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:199)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
2022-03-01 10:53:14,893 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: The client is stopped
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1534)
	at org.apache.hadoop.ipc.Client.call(Client.java:1478)
	at org.apache.hadoop.ipc.Client.call(Client.java:1439)
	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:243)
	at com.sun.proxy.$Proxy9.statusUpdate(Unknown Source)
	at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:790)
	at java.lang.Thread.run(Thread.java:748)
-- View the detailed table structure
desc extended `table_name`;
.......
location:hdfs://solway-ha/user/hive/warehouse/xx.db/xxx_table_name, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:date, type:string, comment:null)], parameters:{orc.compress=ZLIB, transient_lastDdlTime=1645002562}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
When writing ORC files to a partition, each compression buffer is allocated in task heap memory, with a default size of 256 KB (`orc.compress.size`). The buffer size can be lowered with `alter table table_name set tblproperties("orc.compress.size"="65536")` to reduce memory usage; here I cut it to a quarter of the default (64 KB) and retest with that setting.

-- Fix
>alter table gdm.gdm_wt_minute set tblproperties("orc.compress.size"="65536");
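For new tables, the same property can also be declared once at creation time instead of altering it afterwards. A minimal sketch, matching the ZLIB compression seen in the `desc extended` output above (the table and column names here are illustrative, not from the original job):

```sql
-- Hypothetical table; only the TBLPROPERTIES values mirror the fix above.
CREATE TABLE xx.example_orc_table (
    id    BIGINT,
    value STRING
)
PARTITIONED BY (`date` STRING)
STORED AS ORC
TBLPROPERTIES (
    "orc.compress"      = "ZLIB",   -- same codec as the problem table
    "orc.compress.size" = "65536"   -- 64 KB buffer instead of the 256 KB default
);
```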
-- After re-running the SQL, the job completes without error
>insert overwrite table xx.xxx_table_name partition(date) select * from xx.xxx_tmp_table;
>
......
MapReduce Jobs Launched:
Stage-Stage-1: Map: 19 Cumulative CPU: 1089.54 sec HDFS Read: 5407892294 HDFS Write: 312056735 SUCCESS
Stage-Stage-3: Map: 139 Cumulative CPU: 308.08 sec HDFS Read: 332061143 HDFS Write: 311133769 SUCCESS
Total MapReduce CPU Time Spent: 23 minutes 17 seconds 620 msec
OK
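Shrinking the ORC compression buffer is one lever; another is simply giving each map task more heap. The properties below are standard MapReduce 2 settings, but the values are illustrative assumptions for this workload, not tuned recommendations:

```sql
-- Illustrative values only: raise the container size and the mapper JVM heap
-- so ORC stripe flushes have headroom. Keep -Xmx below the container size
-- to leave room for non-heap memory.
set mapreduce.map.memory.mb=4096;
set mapreduce.map.java.opts=-Xmx3276m;
```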



