Implementing WordCount with Hadoop MapReduce

Date: 2017-06-11 15:40:38

1. Create a class WCMapper that extends Mapper

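import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Receive the input value V1: one line of text
        String line = value.toString();
        // Split the line into words on single spaces
        String[] wordsStrings = line.split(" ");
        // Each occurrence counts as one: emit (word, 1)
        for (String w : wordsStrings) {
            context.write(new Text(w), new LongWritable(1));
        }
    }
}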
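To see what the map step emits without a cluster, here is a minimal plain-Java sketch. The class MapSketch and its method mapLine are hypothetical names used only for illustration; they are not part of Hadoop, and pairs are modeled as Map.Entry<String, Long> rather than Text and LongWritable. The delimiter is a parameter so that the single-space split used by WCMapper can be compared with a whitespace-run split.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapSketch {

    // Hypothetical stand-in for WCMapper.map: emit (word, 1) for every token.
    static List<Map.Entry<String, Long>> mapLine(String line, String delimiterRegex) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (String w : line.split(delimiterRegex)) {
            out.add(new SimpleEntry<>(w, 1L));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up input; note the two spaces after the first word.
        String line = "hello  world hello";
        System.out.println(mapLine(line, " "));    // same delimiter as WCMapper
        System.out.println(mapLine(line, "\\s+")); // treats a run of whitespace as one delimiter
    }
}
```

With the single-space delimiter this input yields four pairs, one of which has an empty key, because two consecutive spaces enclose an empty string. With "\\s+" it yields three pairs. If the input may contain repeated spaces or tabs, the second form is probably the safer choice.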

2. Create a class WCReducer that extends Reducer

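import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> v2s, Context context)
            throws IOException, InterruptedException {
        // The input key is reused unchanged as the output key
        // Define a counter for this key
        long counter = 0;
        // Loop over v2s and add up the values
        for (LongWritable i : v2s) {
            counter += i.get();
        }
        // Emit (word, total count)
        context.write(key, new LongWritable(counter));
    }
}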
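Between map and reduce, the framework groups all values that share a key and hands each group to reduce as v2s. The following plain-Java sketch imitates that grouping and the summation. ReduceSketch, shuffle, and reduce are hypothetical names for illustration only, and a TreeMap is used here as an assumption to mimic the sorted key order a reducer sees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSketch {

    // Hypothetical stand-in for the shuffle: group a 1 under each word, keys kept sorted.
    static Map<String, List<Long>> shuffle(List<String> mappedWords) {
        Map<String, List<Long>> grouped = new TreeMap<>();
        for (String w : mappedWords) {
            grouped.computeIfAbsent(w, k -> new ArrayList<>()).add(1L);
        }
        return grouped;
    }

    // Same logic as WCReducer.reduce: sum all values that belong to one key.
    static long reduce(Iterable<Long> values) {
        long counter = 0;
        for (long v : values) {
            counter += v;
        }
        return counter;
    }

    public static void main(String[] args) {
        // Made-up map output keys; every one carries the value 1.
        List<String> mapped = Arrays.asList("hello", "world", "hello");
        for (Map.Entry<String, List<Long>> e : shuffle(mapped).entrySet()) {
            System.out.println(e.getKey() + "\t" + reduce(e.getValue()));
        }
    }
}
```

For the sample input, the group for "hello" holds two values and the group for "world" holds one, so the sums are 2 and 1.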

3. Implement the main method in the WordCount class

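* 1. Analyze the business logic and decide the format of the input and output data.
* 2. Define a class that extends org.apache.hadoop.mapreduce.Mapper;
*    override the map method to implement the business logic and emit the new key-value pairs.
* 3. Define a class that extends org.apache.hadoop.mapreduce.Reducer;
*    override the reduce method to implement the business logic.
* 4. Assemble the custom mapper and reducer through a Job object.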

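import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static void main(String[] args) throws Exception {
        // Build the Job object
        Job job = Job.getInstance(new Configuration());
        // Note: pass the class that contains the main method
        job.setJarByClass(WordCount.class);

        // Mapper settings
        job.setMapperClass(WCMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, new Path("/words.txt"));

        // Reducer settings
        job.setReducerClass(WCReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileOutputFormat.setOutputPath(job, new Path("/wcount619"));

        // Submit the job, wait for completion, and report failure through the exit code
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}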

4. Package the code as wc.jar, upload it to a Linux machine, and run it with Hadoop

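hadoop jar /root/wc.jar

This form relies on the jar's manifest naming WordCount as its main class; if it does not, pass the class name after the jar path. The input file /words.txt must already exist in the cluster's default file system (normally HDFS), and the output directory /wcount619 must not exist yet, otherwise the job fails when it is submitted.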
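Because the driver does not set an output format, the job uses the default text output: the output directory holds part files (for example part-r-00000) in which each line is a key, a tab, and a value. As a rough sketch of how such a file could be read back, the following plain-Java snippet parses that layout. OutputSketch and parse are hypothetical names, and the sample contents are made up rather than taken from a real run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OutputSketch {

    // Parse lines in the "key<TAB>value" layout into word -> count.
    static Map<String, Long> parse(String partFileContents) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : partFileContents.split("\n")) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip blank or malformed lines
            }
            counts.put(line.substring(0, tab), Long.parseLong(line.substring(tab + 1)));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample in the shape of a part file, not real job output.
        String sample = "hello\t2\nworld\t1\n";
        System.out.println(parse(sample));
    }
}
```

Splitting on the last tab keeps the count intact even if a key itself contains a tab, which can happen when the mapper only splits on spaces.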

Original article: http://www.cnblogs.com/dulixiaoqiao/p/6985237.html
