MapReduce: Finding Grandparent-Grandchild Relationships in Data
1) Start the environment
start-all.sh
2) Check the status
jps
0613 NameNode
10733 DataNode
3455 NodeManager
15423 Jps
11082 ResourceManager
10913 SecondaryNameNode
3) Write the jar in Eclipse
Test data (one parent-child pair per line):
job-liu
fei-hh
hh-uu
qq-ww
ee-bb
bb-yy
1. Write the MapCal class
package com.mp;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapCal extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable lon, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        // Each input line has the form "parent-child".
        String[] peps = line.split("-");
        // Emit two tagged key/value pairs so the reducer can join on the shared name.
        // key = parent, value = "s" + child
        context.write(new Text(peps[0]), new Text("s" + peps[1]));
        // key = child, value = "g" + parent
        context.write(new Text(peps[1]), new Text("g" + peps[0]));
    }
}
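To make the tagging concrete, here is a minimal plain-Java sketch of what the mapper emits for a single input line. It does not use Hadoop at all; the class name `MapEmitDemo` and the method `emit` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch (no Hadoop needed) of what MapCal emits for one input line.
// MapEmitDemo and emit are illustrative names, not part of the original job.
class MapEmitDemo {

    // Returns the two tagged pairs as {key, value} arrays.
    static List<String[]> emit(String line) {
        String[] peps = line.split("-");
        List<String[]> out = new ArrayList<>();
        // Under the left-hand name: the right-hand name, tagged "s".
        out.add(new String[] {peps[0], "s" + peps[1]});
        // Under the right-hand name: the left-hand name, tagged "g".
        out.add(new String[] {peps[1], "g" + peps[0]});
        return out;
    }

    public static void main(String[] args) {
        for (String[] kv : emit("fei-hh")) {
            System.out.println(kv[0] + " -> " + kv[1]);
        }
        // prints:
        // fei -> shh
        // hh -> gfei
    }
}
```

Because every name appears as a key both for its parent (tag "g") and for its child (tag "s"), the reducer sees both neighbours of a name in one call.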
2. Write the ReduceCal class
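package com.mp;

import java.io.IOException;
import java.util.ArrayList;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ReduceCal extends Reducer<Text, Text, Text, Text> {

    @Override
    protected void reduce(Text arg0, Iterable<Text> arg1, Context context)
            throws IOException, InterruptedException {
        ArrayList<Text> grands = new ArrayList<Text>();
        ArrayList<Text> sons = new ArrayList<Text>();
        // Sort the values into two lists by their tag. Hadoop reuses the Text
        // instance while iterating, so store copies rather than the object itself.
        for (Text text : arg1) {
            String str = text.toString();
            if (str.startsWith("g")) {
                grands.add(new Text(str.substring(1)));
            } else {
                sons.add(new Text(str.substring(1)));
            }
        }
        // Output every (grandparent, grandchild) combination for this key.
        for (int i = 0; i < grands.size(); i++) {
            for (int j = 0; j < sons.size(); j++) {
                context.write(grands.get(i), sons.get(j));
            }
        }
    }
}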
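The whole join can be checked without a cluster. The following is a rough plain-Java simulation of the map, shuffle and reduce steps on the test data, assuming one pair per input line. `GrandchildJoinDemo` and `join` are hypothetical names used only here, and the `TreeMap` merely imitates the sorted key order a single reducer would see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation (no Hadoop needed) of the map, shuffle and reduce steps.
// GrandchildJoinDemo and join are illustrative names, not part of the original job.
class GrandchildJoinDemo {

    static List<String> join(List<String> lines) {
        // Map + shuffle: group the tagged values by key, keys in sorted order.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] peps = line.split("-");
            grouped.computeIfAbsent(peps[0], k -> new ArrayList<>()).add("s" + peps[1]);
            grouped.computeIfAbsent(peps[1], k -> new ArrayList<>()).add("g" + peps[0]);
        }
        // Reduce: for each key, pair every "g" value with every "s" value.
        List<String> result = new ArrayList<>();
        for (List<String> values : grouped.values()) {
            List<String> grands = new ArrayList<>();
            List<String> sons = new ArrayList<>();
            for (String v : values) {
                if (v.startsWith("g")) {
                    grands.add(v.substring(1));
                } else {
                    sons.add(v.substring(1));
                }
            }
            for (String g : grands) {
                for (String s : sons) {
                    result.add(g + "\t" + s);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList(
                "job-liu", "fei-hh", "hh-uu", "qq-ww", "ee-bb", "bb-yy");
        // Prints two tab-separated lines: "ee yy" and "fei uu".
        join(data).forEach(System.out::println);
    }
}
```

Only the keys `bb` and `hh` receive both a "g" value and an "s" value, which is why exactly two pairs come out; every other key has an empty list on one side, so its nested loop produces nothing.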
3. Write the JobRun class
package com.mp;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobRun {

    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            Job job = Job.getInstance(conf);
            job.setJobName("jc11");
            job.setJarByClass(JobRun.class);
            job.setMapperClass(MapCal.class);
            job.setReducerClass(ReduceCal.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);
            // The reducer also writes Text/Text, so declare the final output types too.
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            // Input file on HDFS.
            FileInputFormat.addInputPath(job, new Path("/shuju.txt"));

            // The job fails if the output directory already exists, so remove it first.
            Path out = new Path("/outg");
            if (fs.exists(out)) {
                fs.delete(out, true); // true = delete recursively
            }
            FileOutputFormat.setOutputPath(job, out);

            boolean f = job.waitForCompletion(true);
            if (f) {
                System.out.println("ok");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
4) Export the jar.
5) Upload the jar to a directory on the Linux machine via FTP
6) Run the jar
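hadoop jar shuju.jar com.mp.JobRun
(The main class must match the code above, which is com.mp.JobRun. JobRun hardcodes its input path /shuju.txt and output path /outg, so no path arguments are needed.)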
7) If both map and reduce reach 100% and the output ends with counters like these:
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=45
File Output Format Counters
    Bytes Written=18
then the job ran successfully.
8) View the result
hadoop fs -tail /outg/part-r-00000
ee	yy
fei	uu