
Implementing CombineFileInputFormat for Hadoop 0.20.205


Here is an implementation I put together for you:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.LineRecordReader;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.CombineFileInputFormat;
import org.apache.hadoop.mapred.lib.CombineFileRecordReader;
import org.apache.hadoop.mapred.lib.CombineFileSplit;

@SuppressWarnings("deprecation")
public class CombinedInputFormat extends CombineFileInputFormat<LongWritable, Text> {

    @SuppressWarnings({ "unchecked", "rawtypes" })
    @Override
    public RecordReader<LongWritable, Text> getRecordReader(InputSplit split, JobConf conf, Reporter reporter) throws IOException {
        // CombineFileRecordReader creates one myCombineFileRecordReader per file chunk in the combined split.
        return new CombineFileRecordReader(conf, (CombineFileSplit) split, reporter, (Class) myCombineFileRecordReader.class);
    }

    // Reads a single chunk of the combined split by delegating to LineRecordReader.
    public static class myCombineFileRecordReader implements RecordReader<LongWritable, Text> {

        private final LineRecordReader linerecord;

        // CombineFileRecordReader requires exactly this constructor signature;
        // index selects which chunk of the combined split this reader handles.
        public myCombineFileRecordReader(CombineFileSplit split, Configuration conf, Reporter reporter, Integer index) throws IOException {
            FileSplit filesplit = new FileSplit(split.getPath(index), split.getOffset(index), split.getLength(index), split.getLocations());
            linerecord = new LineRecordReader(conf, filesplit);
        }

        @Override
        public void close() throws IOException {
            linerecord.close();
        }

        @Override
        public LongWritable createKey() {
            return linerecord.createKey();
        }

        @Override
        public Text createValue() {
            return linerecord.createValue();
        }

        @Override
        public long getPos() throws IOException {
            return linerecord.getPos();
        }

        @Override
        public float getProgress() throws IOException {
            return linerecord.getProgress();
        }

        @Override
        public boolean next(LongWritable key, Text value) throws IOException {
            return linerecord.next(key, value);
        }
    }
}

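In your job, first set the mapred.max.split.size parameter (a value in bytes) according to the size you want the input files combined into. In your run() method, do the following: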

...
if (argument != null) {
    conf.set("mapred.max.split.size", argument);
} else {
    conf.set("mapred.max.split.size", "134217728"); // 128 MB
}
...
conf.setInputFormat(CombinedInputFormat.class);
...
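To get a feel for what the size cap does, here is a rough, Hadoop-free sketch of packing small files into combined splits under a byte limit. It is only an illustration under simplifying assumptions: the class name SplitPackingSketch and the method pack are hypothetical, the grouping is a plain greedy pass, and it ignores the node and rack locality and the block-level splitting that the real CombineFileInputFormat takes into account.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitPackingSketch {

    // Same default as in the snippet above: 128 MB expressed in bytes (134217728).
    static final long DEFAULT_MAX_SPLIT_BYTES = 128L * 1024 * 1024;

    /**
     * Hypothetical helper: greedily groups file lengths (in bytes) into splits
     * so that each split stays at or below maxSplitBytes where possible.
     * A single file larger than the cap ends up alone in its own split here;
     * this sketch does not break files into blocks.
     */
    static List<List<Long>> pack(long[] fileLengths, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long length : fileLengths) {
            // Close the current split if adding this file would exceed the cap.
            if (!current.isEmpty() && currentBytes + length > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(length);
            currentBytes += length;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long[] files = {40 * mb, 50 * mb, 30 * mb, 60 * mb, 10 * mb};
        List<List<Long>> splits = pack(files, DEFAULT_MAX_SPLIT_BYTES);
        // Five small files collapse into two splits: [40, 50, 30] MB and [60, 10] MB.
        System.out.println("combined splits: " + splits.size());
    }
}
```

With a 128 MB cap, the five files above would be handled by two map tasks instead of five. A smaller cap produces more, smaller splits, and a larger cap produces fewer.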

