
A Detailed Look at Hadoop's Helper Classes ToolRunner and Configured

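When you start learning Hadoop, one of the most frustrating things is that it is hard to follow how the programs you write actually execute. Let us start with an example: the test class ToolRunnerTest extends Configured and implements the Tool interface. Once we read the source of the base types it relies on, the execution flow turns out to be quite simple.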

package xml;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ToolRunnerTest extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // Call getConf() from the base class Configured to obtain the Configuration instance
        Configuration conf = getConf();
        // Read and print the property values
        System.out.println("flower is " + conf.get("flower"));
        System.out.println("color is " + conf.get("color"));
        System.out.println("blossom ? " + conf.get("blossom"));
        System.out.println("this is the host default name =" + conf.get("fs.default.name"));
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // Create a Configuration for the current environment
        Configuration conf = new Configuration();
        // Let ToolRunner.run configure and invoke our Tool implementation
        ToolRunner.run(conf, new ToolRunnerTest(), args);
    }
}

The base class Configured implements the Configurable interface, whose source is as follows:

public interface Configurable {
    void setConf(Configuration conf);
    Configuration getConf();
}

Configured therefore has to implement both methods of the Configurable interface. Its source is as follows:

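public class Configured implements Configurable {
    private Configuration conf;

    // No-argument constructor, used by subclasses such as ToolRunnerTest;
    // conf stays null until setConf is called
    public Configured() {
        this(null);
    }

    // Constructor that stores the given configuration
    public Configured(Configuration conf) {
        setConf(conf);
    }

    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    public Configuration getConf() {
        return conf;
    }
}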
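To see why ToolRunnerTest can call getConf() without declaring any field of its own, here is a minimal self-contained sketch of the same pattern. It does not use Hadoop at all: Conf, Configurable, Configured and FlowerTool below are hypothetical stand-ins written only for illustration, so that the snippet runs with a plain JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch only: these nested types imitate the shape of Hadoop's
// Configurable/Configured but are NOT the Hadoop classes.
class ConfiguredSketch {

    // Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); } // null if unset
    }

    interface Configurable {
        void setConf(Conf conf);
        Conf getConf();
    }

    // Base class that stores the configuration on behalf of its subclasses.
    static class Configured implements Configurable {
        private Conf conf; // null until setConf is called
        @Override public void setConf(Conf conf) { this.conf = conf; }
        @Override public Conf getConf() { return conf; }
    }

    // A subclass inherits getConf()/setConf() and only adds its own logic.
    static class FlowerTool extends Configured {
        String describe() {
            return "flower is " + getConf().get("flower");
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("flower", "rose");

        FlowerTool tool = new FlowerTool();
        tool.setConf(conf); // the step a runner performs before calling run()
        System.out.println(tool.describe());
    }
}
```

The point of the design is that the subclass never manages the configuration itself. Whoever drives the tool injects it through setConf, and the tool reads it back through getConf.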

The source of Tool is shown below:

public interface Tool extends Configurable {
    int run(String[] args) throws Exception;
}

That is all there is to it.

The source of the ToolRunner class is as follows:

public class ToolRunner {

    public static int run(Configuration conf, Tool tool, String[] args)
            throws Exception {
        if (conf == null) {
            conf = new Configuration();
        }
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);
        // set the configuration back, so that Tool can configure itself
        tool.setConf(conf);
        String[] toolArgs = parser.getRemainingArgs();
        return tool.run(toolArgs);
    }

    public static int run(Tool tool, String[] args)
            throws Exception {
        return run(tool.getConf(), tool, args);
    }

    public static void printGenericCommandUsage(PrintStream out) {
        GenericOptionsParser.printGenericCommandUsage(out);
    }
}

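Walkthrough: when the program executes ToolRunner.run(conf, new ToolRunnerTest(), args), control passes to the three-argument run method of ToolRunner. Because the Configuration was already instantiated in main, the null check is skipped. GenericOptionsParser then processes the generic options in args against conf, tool.setConf(conf) hands the configuration to the tool, and execution reaches tool.run(toolArgs). Since Tool is an interface that itself declares only the run method, the call dispatches to the run method of the implementing class, ToolRunnerTest, which prints its output. Once you have read the source of these few classes, the execution flow is quite simple.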
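The order of steps in this walkthrough can be reproduced in a small sketch that runs without Hadoop. Everything in it is a hypothetical stand-in: Conf replaces Configuration, parseGenericOptions is a greatly simplified substitute for GenericOptionsParser that understands only "-D key=value", and EchoTool plays the role of ToolRunnerTest. The stand-in Tool.run also omits "throws Exception" to keep the snippet short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch only: mimics the control flow of ToolRunner.run without Hadoop.
class ToolRunnerFlowSketch {

    // Hypothetical stand-in for Configuration.
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); } // null if unset
    }

    // Stand-in for Tool (the real Tool.run declares "throws Exception").
    interface Tool {
        void setConf(Conf conf);
        Conf getConf();
        int run(String[] args);
    }

    // Simplified stand-in for GenericOptionsParser: copies "-D key=value"
    // pairs into conf and returns every other argument unchanged.
    // A "-D" value without '=' is silently dropped in this sketch.
    static String[] parseGenericOptions(Conf conf, String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                if (kv.length == 2) {
                    conf.set(kv[0], kv[1]);
                }
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    // Same order of steps as the ToolRunner.run listing:
    // null check, parse options, setConf, then tool.run.
    static int run(Conf conf, Tool tool, String[] args) {
        if (conf == null) {
            conf = new Conf();
        }
        String[] toolArgs = parseGenericOptions(conf, args);
        tool.setConf(conf);
        return tool.run(toolArgs);
    }

    // Plays the role of ToolRunnerTest.
    static class EchoTool implements Tool {
        private Conf conf;
        String lastOutput; // kept so the result can be inspected

        @Override public void setConf(Conf conf) { this.conf = conf; }
        @Override public Conf getConf() { return conf; }

        @Override public int run(String[] args) {
            lastOutput = "flower is " + getConf().get("flower")
                    + ", remaining args = " + args.length;
            System.out.println(lastOutput);
            return 0;
        }
    }

    public static void main(String[] args) {
        String[] demoArgs = {"-D", "flower=rose", "input.txt"};
        int exitCode = run(new Conf(), new EchoTool(), demoArgs);
        System.out.println("exit code " + exitCode);
    }
}
```

With the demo arguments, the tool sees "rose" for the flower property and receives only input.txt as its own argument. This is the same division of labor as in the real classes: the runner absorbs the generic options, and the tool gets a ready configuration plus whatever arguments are left.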

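The result of running this example is as follows:

[Figure: console output of ToolRunnerTest (image not preserved)]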
