MapReduce: deriving grandparent-grandchild relationships from data

2024-09-16 03:59:40, 217 views

1) Start the environment

start-all.sh

2) Check the status

jps

0613 NameNode

10733 DataNode

3455 NodeManager

15423 Jps

11082 ResourceManager

10913 SecondaryNameNode

3) Write the jar in Eclipse

Test data (each line is a parent-child pair):

job-liu

fei-hh

hh-uu

qq-ww

ee-bb

bb-yy
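
Before looking at the Hadoop classes, it may help to see what result this sample should give. The following is a minimal plain-Java sketch without Hadoop. It assumes each line reads `parent-child`, which is inferred from the "s" and "g" tags used in the mapper. The class name `GrandPairsSketch` and the method `grandPairs` are hypothetical and are not part of the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GrandPairsSketch {

    // Returns "grandparent<TAB>grandchild" for every two-step chain a-b, b-c.
    public static List<String> grandPairs(List<String> lines) {
        // parent -> list of direct children
        Map<String, List<String>> children = new HashMap<>();
        for (String line : lines) {
            String[] p = line.split("-");
            if (p.length != 2) continue; // assumption: malformed lines are skipped
            children.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
        }
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] p = line.split("-");
            if (p.length != 2) continue;
            // p[0] is the grandparent of every child of p[1]
            for (String grandchild : children.getOrDefault(p[1], new ArrayList<>())) {
                out.add(p[0] + "\t" + grandchild);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "job-liu", "fei-hh", "hh-uu", "qq-ww", "ee-bb", "bb-yy");
        for (String pair : grandPairs(sample)) {
            System.out.println(pair);
        }
    }
}
```

On the sample this yields the pairs fei/uu and ee/yy. The MapReduce job finds the same pairs but writes them in key order, because reducer keys are sorted.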

1. Write the MapCal class

package com.mp;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapCal extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable lon, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] peps = line.split("-");
        // Skip blank or malformed lines instead of failing on peps[1]
        if (peps.length < 2) {
            return;
        }
        // Key-value pairs: each line "a-b" is emitted under both names
        context.write(new Text(peps[0]), new Text("s" + peps[1])); // b is a's child
        context.write(new Text(peps[1]), new Text("g" + peps[0])); // a is b's parent
    }
}
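
To make the mapper's output concrete, here is a small plain-Java sketch of what the two `context.write` calls produce for a single input line. It does not use Hadoop. The class `MapEmitSketch` and its `emit` method are hypothetical names for illustration, and skipping lines that do not split into two parts is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class MapEmitSketch {

    // Mirrors the two context.write calls: returns {key, value} pairs for one line.
    public static List<String[]> emit(String line) {
        List<String[]> pairs = new ArrayList<>();
        String[] peps = line.split("-");
        if (peps.length != 2) {
            return pairs; // assumption: malformed lines produce nothing
        }
        pairs.add(new String[] {peps[0], "s" + peps[1]}); // under the parent: tagged child
        pairs.add(new String[] {peps[1], "g" + peps[0]}); // under the child: tagged parent
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] kv : emit("fei-hh")) {
            System.out.println(kv[0] + " -> " + kv[1]);
        }
    }
}
```

For the line fei-hh the pairs are (fei, shh) and (hh, gfei). Because every name becomes a key, all of a person's parents and children meet in the same reduce call.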

2. Write the ReduceCal class

package com.mp;

import java.io.IOException;
import java.util.ArrayList;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ReduceCal extends Reducer<Text, Text, Text, Text> {

    @Override
    protected void reduce(Text arg0, Iterable<Text> arg1, Context context)
            throws IOException, InterruptedException {
        ArrayList<Text> grands = new ArrayList<Text>();
        ArrayList<Text> sons = new ArrayList<Text>();
        // Sort the tagged values into the two lists
        for (Text text : arg1) {
            String str = text.toString();
            if (str.startsWith("g")) {
                grands.add(new Text(str.substring(1)));
            } else {
                sons.add(new Text(str.substring(1)));
            }
        }
        // Output every (grandparent, grandchild) pair that meets at this key
        for (int i = 0; i < grands.size(); i++) {
            for (int j = 0; j < sons.size(); j++) {
                context.write(grands.get(i), sons.get(j));
            }
        }
    }
}
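
The reduce step is a cross product of the two tagged lists for one key. A minimal plain-Java sketch of that step, without Hadoop, is shown here. The class `ReduceJoinSketch` and its `join` method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReduceJoinSketch {

    // Splits tagged values into two lists, then pairs every "g" entry with every "s" entry.
    public static List<String> join(List<String> taggedValues) {
        List<String> grands = new ArrayList<>();
        List<String> sons = new ArrayList<>();
        for (String v : taggedValues) {
            if (v.startsWith("g")) {
                grands.add(v.substring(1)); // strip the tag, keep the name
            } else {
                sons.add(v.substring(1));
            }
        }
        List<String> out = new ArrayList<>();
        for (String g : grands) {
            for (String s : sons) {
                out.add(g + "\t" + s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Values that arrive for key "hh" from the lines fei-hh and hh-uu.
        System.out.println(join(Arrays.asList("gfei", "suu")));
        // Key "liu" only has a parent, so the cross product is empty.
        System.out.println(join(Arrays.asList("gjob")));
    }
}
```

A key with only a parent or only a child produces nothing, which is why lines such as job-liu and qq-ww leave no trace in the result.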

3. Write the JobRun class

package com.mp;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobRun {

    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Job job = Job.getInstance(conf);
            job.setJobName("jc11");
            job.setJarByClass(JobRun.class);
            job.setMapperClass(MapCal.class);
            job.setReducerClass(ReduceCal.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);
            // The reducer also writes Text/Text, so declare the final output types
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            // Input and output paths are hard-coded; args is not used
            FileInputFormat.addInputPath(job, new Path("/shuju.txt"));
            // The job fails if the output directory exists, so remove it first
            if (fs.exists(new Path("/outg"))) {
                fs.delete(new Path("/outg"), true); // true = delete recursively
            }
            FileOutputFormat.setOutputPath(job, new Path("/outg"));
            boolean f = job.waitForCompletion(true);
            if (f) {
                System.out.println("ok");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
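
JobRun hard-codes /shuju.txt and /outg and never reads `args`, so paths given on the command line have no effect. If command-line paths were wanted, one possible approach is a small helper that falls back to those defaults. This is only a sketch: the class `PathArgsSketch` and the method `resolve` are hypothetical and do not appear in the original project.

```java
public class PathArgsSketch {

    // Returns {input, output}: the command-line values when both are given,
    // otherwise the defaults that JobRun hard-codes.
    public static String[] resolve(String[] args) {
        if (args != null && args.length >= 2) {
            return new String[] {args[0], args[1]};
        }
        return new String[] {"/shuju.txt", "/outg"};
    }

    public static void main(String[] args) {
        String[] paths = resolve(args);
        System.out.println("input=" + paths[0] + " output=" + paths[1]);
    }
}
```

In JobRun the two returned strings could then be wrapped in `new Path(...)` in place of the literals.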

4) Export the jar.

5) Upload the jar to a directory on the Linux machine via FTP

6) Run the jar

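hadoop jar shuju.jar com.mp.JobRun

(The main class is com.mp.JobRun. Its input /shuju.txt and output /outg are set inside the code, so no path arguments are needed.)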

7) If map and reduce both reach 100% and the job summary ends like this:

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=45

File Output Format Counters

Bytes Written=18

the job ran successfully.

8) View the result

hadoop fs -tail /outg/part-r-00000

ee	yy

fei	uu
