Hadoop Reading Notes (13): The Top Algorithm in MapReduce
Hadoop Reading Notes series: http://blog.csdn.net/caicongyang/article/category/2166855
1. Description:
Find the maximum value in a given input file. Each map task keeps a running local maximum and emits it from cleanup() after all of its records have been processed; a single reduce task then takes the maximum of those per-mapper maxima, which is the global maximum.
2. Code:
TopApp.java
package suanfa;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * <p>
 * Title: TopApp.java
 * Package suanfa
 * </p>
 * <p>
 * Description: find the maximum value among 10 million numbers
 * <p>
 * @author Tom.Cai
 * @created 2014-12-2 10:28:33 PM
 * @version V1.0
 */
public class TopApp {
	private static final String INPUT_PATH = "hdfs://192.168.80.100:9000/top_input";
	private static final String OUT_PATH = "hdfs://192.168.80.100:9000/top_out";

	public static void main(String[] args) throws Exception {
		Configuration conf = new Configuration();
		final FileSystem fileSystem = FileSystem.get(new URI(INPUT_PATH), conf);
		final Path outPath = new Path(OUT_PATH);
		// Delete the output directory if it already exists; otherwise the job fails on startup.
		if (fileSystem.exists(outPath)) {
			fileSystem.delete(outPath, true);
		}

		final Job job = new Job(conf, TopApp.class.getSimpleName());
		job.setJarByClass(TopApp.class);
		FileInputFormat.setInputPaths(job, INPUT_PATH);
		job.setMapperClass(MyMapper.class);
		job.setReducerClass(MyReducer.class);
		job.setOutputKeyClass(LongWritable.class);
		job.setOutputValueClass(NullWritable.class);
		FileOutputFormat.setOutputPath(job, outPath);
		job.waitForCompletion(true);
	}

	static class MyMapper extends Mapper<LongWritable, Text, LongWritable, NullWritable> {
		// Largest value seen so far by this map task. Must start at MIN_VALUE,
		// not MAX_VALUE, or no input value could ever exceed it.
		long max = Long.MIN_VALUE;

		@Override
		protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
			long temp = Long.parseLong(value.toString());
			if (temp > max) {
				max = temp;
			}
		}

		// cleanup() runs once after all records are mapped: emit only the local maximum.
		@Override
		protected void cleanup(Context context) throws IOException, InterruptedException {
			context.write(new LongWritable(max), NullWritable.get());
		}
	}

	static class MyReducer extends Reducer<LongWritable, NullWritable, LongWritable, NullWritable> {
		// Largest value seen across all mappers' local maxima.
		long max = Long.MIN_VALUE;

		@Override
		protected void reduce(LongWritable key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
			// The candidate value is the key itself; the value list is empty NullWritables.
			long temp = key.get();
			if (temp > max) {
				max = temp;
			}
		}

		// Emit the global maximum once all candidate keys have been seen.
		@Override
		protected void cleanup(Context context) throws IOException, InterruptedException {
			context.write(new LongWritable(max), NullWritable.get());
		}
	}
}
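To try the job you need a file of numbers under /top_input in HDFS. Below is a minimal generator sketch; the class name TopInputGenerator and the /top_input/data file name are not from the original post, and the NameNode address is assumed to match the constants in TopApp.
TopInputGenerator.java
package suanfa;

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.net.URI;
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Test-data generator: writes 10,000,000 random non-negative longs,
 * one per line, into the job's HDFS input directory.
 */
public class TopInputGenerator {
	public static void main(String[] args) throws Exception {
		String uri = "hdfs://192.168.80.100:9000/top_input/data";
		FileSystem fs = FileSystem.get(new URI(uri), new Configuration());
		FSDataOutputStream out = fs.create(new Path(uri), true);
		BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(out));
		Random random = new Random();
		for (int i = 0; i < 10000000; i++) {
			// Mask the sign bit so every generated value is non-negative.
			writer.write(Long.toString(random.nextLong() & Long.MAX_VALUE));
			writer.newLine();
		}
		writer.close();
	}
}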
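The same pattern generalizes from Top-1 to Top-N: each mapper keeps its N largest values in a sorted container instead of a single long and emits them all in cleanup(); the reducer applies the same logic to the incoming keys and writes the surviving N values in its own cleanup(). A minimal mapper sketch (not from the original post), assuming N = 5; it drops in alongside MyMapper above:

	static class TopNMapper extends Mapper<LongWritable, Text, LongWritable, NullWritable> {
		private static final int N = 5;
		// The N largest distinct values seen so far by this map task, kept in ascending order.
		private final java.util.TreeSet<Long> topN = new java.util.TreeSet<Long>();

		@Override
		protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
			topN.add(Long.parseLong(value.toString()));
			if (topN.size() > N) {
				topN.pollFirst(); // evict the smallest so only the N largest remain
			}
		}

		@Override
		protected void cleanup(Context context) throws IOException, InterruptedException {
			for (Long v : topN) {
				context.write(new LongWritable(v), NullWritable.get());
			}
		}
	}

Note that a TreeSet collapses duplicates; if repeated values must each count toward the Top-N, a TreeMap from value to occurrence count would be needed instead.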
Everyone is welcome to discuss and learn together!
Save it if you find it useful!
Record and share, and we grow together! Feel free to check out my other posts. My blog: http://blog.csdn.net/caicongyang