Hadoop Programming Tips (8): Unit Testing

Prerequisites:

The Hadoop jars (the official release download is sufficient);

The junit jar (preferably the latest version);

The mockito jar;

The mrunit jar;

The powermock-mockito jar.

The required jars can be downloaded from: http://download.csdn.net/detail/fansy1990/7690977

[Screenshot: list of the required jars]

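Use case:

When writing ordinary Hadoop MapReduce (MR) programs, you often need to verify the business logic or the data flow. This setup serves that purpose: it does not require a real cluster and only validates the algorithm or code logic, which makes debugging easier.

Example: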

Mapper:

package fz.mrtest;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SMSCDRMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

  private Text status = new Text();
  private final static IntWritable addOne = new IntWritable(1);

  /**
   * Emits (SMS status code, 1) for every SMS CDR record.
   */
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws java.io.IOException, InterruptedException {

    // Sample record format: 655209;1;796764372490213;804422938115889;6
    String[] line = value.toString().split(";");
    // Only SMS CDR records (second field equal to 1) are counted
    if (Integer.parseInt(line[1]) == 1) {
      // The fifth field is the status code
      status.set(line[4]);
      context.write(status, addOne);
    }
  }
}

Reducer:

package fz.mrtest;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;


public class SMSCDRReducer extends
    Reducer<Text, IntWritable, Text, IntWritable> {

  // Sums the counts emitted for each status code
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws java.io.IOException, InterruptedException {
    int sum = 0;
    for (IntWritable value : values) {
      sum += value.get();
    }
    context.write(key, new IntWritable(sum));
  }
}

Test class:

package fz.mrtest;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
 
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;
 
public class SMSCDRMapperReducerTest {
 
  MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
  ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
  MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;
 
  @Before
  public void setUp() {
    SMSCDRMapper mapper = new SMSCDRMapper();
    SMSCDRReducer reducer = new SMSCDRReducer();
    mapDriver = MapDriver.newMapDriver(mapper);
    reduceDriver = ReduceDriver.newReduceDriver(reducer);
    mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
  }
 
  @Test
  public void testMapper() throws IOException {
    mapDriver.withInput(new LongWritable(), new Text(
        "655209;1;796764372490213;804422938115889;6"));
    mapDriver.withOutput(new Text("6"), new IntWritable(1));
    mapDriver.runTest();
  }
 
  @Test
  public void testReducer() throws IOException {
    List<IntWritable> values = new ArrayList<IntWritable>();
    values.add(new IntWritable(1));
    values.add(new IntWritable(1));
    reduceDriver.withInput(new Text("6"), values);
    reduceDriver.withOutput(new Text("6"), new IntWritable(2));
    reduceDriver.runTest();
  }

  @Test
  public void testMR() throws IOException {
    mapReduceDriver.withInput(new LongWritable(), new Text(
        "655209;1;796764372490213;804422938115889;6"));
    mapReduceDriver.withInput(new LongWritable(), new Text(
        "6552092;1;796764372490213;804422938115889;6"));
    mapReduceDriver.withOutput(new Text("6"), new IntWritable(2));
    mapReduceDriver.runTest();
  }
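}

(The code comes from the MRUnit website; I added one test to the test class that covers the whole job.) The test class has three test methods: one for the Mapper, one for the Reducer, and one that tests the Mapper and Reducer together.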
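
To make concrete what testMR asserts, the same data flow can be traced in plain Java with neither Hadoop nor MRUnit on the classpath. This is only an illustrative sketch under stated assumptions: the class name CdrFlowSketch, the method countSmsStatuses, and the third (non-SMS) record are hypothetical and do not appear in the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of the original post.
public class CdrFlowSketch {

  // Mirrors the map and reduce steps in one pass: keep SMS records
  // (second field equal to 1), then count each status code (fifth field).
  public static Map<String, Integer> countSmsStatuses(List<String> records) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String record : records) {
      String[] fields = record.split(";");
      if (Integer.parseInt(fields[1]) == 1) {
        // "map" emits (status, 1); "reduce" sums the ones per status
        counts.merge(fields[4], 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> records = Arrays.asList(
        "655209;1;796764372490213;804422938115889;6",
        "6552092;1;796764372490213;804422938115889;6",
        // Assumed non-SMS record (type 0), added only for illustration
        "6552093;0;796764372490213;804422938115889;3");
    System.out.println(countSmsStatuses(records)); // prints {6=2}
  }
}
```

The two SMS records share status code 6, so the count for key 6 is 2, matching the output expected by testMR; the assumed type-0 record is filtered out and contributes nothing.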


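Summary: Hadoop unit tests make it easy to verify that a program is correct without running it in a real environment, which makes efficient development possible. Some special cases still have to be tested in a real environment and deserve separate consideration, but in general this unit-testing setup works for the MR programs you write.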


Share, grow, enjoy.

When reposting, please credit the blog: http://blog.csdn.net/fansy1990