
From InputFormat to key-value pairs: how the records are generated

Date: 2015-09-01 21:39:22


public abstract class InputFormat<K, V> {

  public abstract
    List<InputSplit> getSplits(JobContext context
                               ) throws IOException, InterruptedException;
 
  public abstract
    RecordReader<K,V> createRecordReader(InputSplit split,
                                         TaskAttemptContext context
                                        ) throws IOException,
                                                 InterruptedException;

}

public abstract class FileInputFormat<K, V> extends InputFormat<K, V> {
  // body omitted
}

public class TextInputFormat extends FileInputFormat<LongWritable, Text> {
  // body omitted
}

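The getSplits method divides the input files into splits and returns them as a list; each split is processed by one map task.
The createRecordReader method creates a RecordReader. That RecordReader receives one of the splits and implements nextKeyValue, getCurrentKey, and getCurrentValue.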
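The two responsibilities above can be sketched without any Hadoop dependency. The following is a simplified model, not Hadoop's implementation: `SplitSketch`, `Split`, `LineReader`, and `readAll` are hypothetical names introduced for illustration, and the splitting rule (fixed-size byte ranges) is an assumption that ignores block sizes and other details of FileInputFormat. The reader mimics the line-oriented convention used with TextInputFormat, where the key is the byte offset of a line and the value is its text: a split that does not begin the file skips its first line, and every reader finishes the line that crosses its end boundary.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SplitSketch {

  /** Hypothetical stand-in for an InputSplit: a byte range of one file. */
  static final class Split {
    final long start;
    final long length;
    Split(long start, long length) { this.start = start; this.length = length; }
  }

  /** Cuts a file of fileLength bytes into ranges of at most splitSize bytes. */
  static List<Split> getSplits(long fileLength, long splitSize) {
    if (splitSize <= 0) throw new IllegalArgumentException("splitSize must be positive");
    List<Split> splits = new ArrayList<>();
    for (long start = 0; start < fileLength; start += splitSize) {
      splits.add(new Split(start, Math.min(splitSize, fileLength - start)));
    }
    return splits;
  }

  /** Hypothetical stand-in for a RecordReader over one split. */
  static final class LineReader {
    private final byte[] data;
    private final long end;   // first byte after this split
    private int pos;          // start of the next line to read
    private long key;
    private String value;

    LineReader(byte[] data, Split split) {
      this.data = data;
      this.end = split.start + split.length;
      this.pos = (int) split.start;
      // A split that does not begin the file skips its first line;
      // the reader of the previous split is responsible for that line.
      if (split.start != 0) pos = nextLineStart(pos);
    }

    /** Index just past the next '\n' at or after from, or data.length if none. */
    private int nextLineStart(int from) {
      int i = from;
      while (i < data.length && data[i] != '\n') i++;
      return Math.min(i + 1, data.length);
    }

    boolean nextKeyValue() {
      // A line belongs to this split if it starts at or before the end boundary.
      if (pos > end || pos >= data.length) return false;
      int next = nextLineStart(pos);
      int lineEnd = (data[next - 1] == '\n') ? next - 1 : next;
      key = pos;
      value = new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8);
      pos = next;
      return true;
    }

    long getCurrentKey() { return key; }
    String getCurrentValue() { return value; }
  }

  /** Reads every split in turn and collects the records as "offset:line". */
  static List<String> readAll(String text, long splitSize) {
    byte[] data = text.getBytes(StandardCharsets.UTF_8);
    List<String> records = new ArrayList<>();
    for (Split split : getSplits(data.length, splitSize)) {
      LineReader reader = new LineReader(data, split);
      while (reader.nextKeyValue()) {
        records.add(reader.getCurrentKey() + ":" + reader.getCurrentValue());
      }
    }
    return records;
  }

  public static void main(String[] args) {
    System.out.println(readAll("alpha\nbeta\ngamma\n", 7));
  }
}
```

With a 17-byte input and a split size of 7, the sketch produces three splits, yet each line is read exactly once: the first reader returns "alpha" and "beta", the second returns "gamma", and the third returns nothing because its only line was already consumed.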

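As shown below, every map class extends the Mapper class. Inside Mapper, the run method obtains each key and value through the context, which reads them from the RecordReader created by the InputFormat.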

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  /**
   * Expert users can override this method for more complete control over the
   * execution of the Mapper.
   * @param context
   * @throws IOException
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    try {
      while (context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
      }
    } finally {
      cleanup(context);
    }
  }
}
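To make the control flow of run concrete, here is a minimal, dependency-free sketch of the same loop. It is an illustration under assumptions, not Hadoop code: `MapperRunSketch`, `MiniMapper`, `LineLengthMapper`, and this stripped-down `Context` are hypothetical names, and the context is backed by an in-memory list instead of a real RecordReader. It records the order of calls so the setup, map, cleanup sequence is visible.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class MapperRunSketch {

  /** Hypothetical, stripped-down counterpart of the Mapper context. */
  static final class Context {
    private final Iterator<Map.Entry<Long, String>> records;
    private Map.Entry<Long, String> current;
    final List<String> output = new ArrayList<>();  // what the mapper wrote
    final List<String> calls = new ArrayList<>();   // order of lifecycle calls

    Context(List<Map.Entry<Long, String>> input) { this.records = input.iterator(); }

    /** Advances to the next record; returns false when the input is exhausted. */
    boolean nextKeyValue() {
      if (!records.hasNext()) return false;
      current = records.next();
      return true;
    }
    Long getCurrentKey() { return current.getKey(); }
    String getCurrentValue() { return current.getValue(); }
    void write(String key, int value) { output.add(key + "=" + value); }
  }

  /** Hypothetical base class whose run method mirrors the loop shown above. */
  static class MiniMapper {
    protected void setup(Context context) { context.calls.add("setup"); }
    protected void map(Long key, String value, Context context) { }
    protected void cleanup(Context context) { context.calls.add("cleanup"); }

    public void run(Context context) {
      setup(context);
      try {
        while (context.nextKeyValue()) {
          map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
      } finally {
        cleanup(context);  // runs even if map throws
      }
    }
  }

  /** Example subclass: emits each line together with its length. */
  static final class LineLengthMapper extends MiniMapper {
    @Override
    protected void map(Long key, String value, Context context) {
      context.calls.add("map@" + key);
      context.write(value, value.length());
    }
  }

  static Context runExample() {
    List<Map.Entry<Long, String>> input = new ArrayList<>();
    input.add(new SimpleEntry<>(0L, "alpha"));
    input.add(new SimpleEntry<>(6L, "beta"));
    Context context = new Context(input);
    new LineLengthMapper().run(context);
    return context;
  }

  public static void main(String[] args) {
    Context context = runExample();
    System.out.println(context.calls);
    System.out.println(context.output);
  }
}
```

The subclass only overrides map; the loop that pulls records lives in the base class, which is why a user-written mapper never calls the record reader directly.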


Original article: http://my.oschina.net/sniperLi/blog/500341
