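MapReduce Programming Series, Part 4: Running the MapReduce Example Program
A MapReduce program can be compiled in an ordinary Java environment. Now it is time to run it in a real Hadoop environment.
First, put the log files into an HDFS directory: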
$ hdfs dfs -put *.csv /user/chenshu/share/logs/
14/09/27 17:03:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ hdfs dfs -ls /user/chenshu/share/logs
14/09/27 17:03:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 10 items
-rw-r--r--   3 chenshu chenshu  539026 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-05-10.0.csv
-rw-r--r--   3 chenshu chenshu   33212 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-05-12.0.csv
-rw-r--r--   3 chenshu chenshu 1117191 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-05-13.0.csv
-rw-r--r--   3 chenshu chenshu 2642634 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-05-18.0.csv
-rw-r--r--   3 chenshu chenshu 4676438 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-06-01.1.csv
-rw-r--r--   3 chenshu chenshu  633015 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-06-27.0.csv
-rw-r--r--   3 chenshu chenshu 4749439 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-07-01.1.csv
-rw-r--r--   3 chenshu chenshu 1551312 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-08-01.2.csv
-rw-r--r--   3 chenshu chenshu 2957316 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-08-11.0.csv
-rw-r--r--   3 chenshu chenshu 4032863 2014-09-27 17:03 /user/chenshu/share/logs/sign_2014-09-01.1.csv
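If an output directory is left over from an earlier run, remove it first, because a job that writes through Hadoop's FileOutputFormat fails when its output directory already exists:
$ hdfs dfs -rm -r /user/chenshu/share/output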
Then run the program and watch the output:
[chenshu@hadoopMaster example1]$ hadoop jar target/mr1_example1-1.0-SNAPSHOT.jar org.freebird.LogJob /user/chenshu/share/logs /user/chenshu/share/output
14/09/27 17:57:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/09/27 17:57:42 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/09/27 17:57:42 INFO input.FileInputFormat: Total input paths to process : 10
14/09/27 17:57:46 INFO mapred.JobClient: Running job: job_201404261903_548924
14/09/27 17:57:47 INFO mapred.JobClient:  map 0% reduce 0%
14/09/27 17:58:11 INFO mapred.JobClient:  map 40% reduce 0%
14/09/27 17:58:12 INFO mapred.JobClient:  map 80% reduce 0%
14/09/27 17:58:20 INFO mapred.JobClient:  map 100% reduce 0%
14/09/27 17:58:25 INFO mapred.JobClient:  map 100% reduce 8%
...
14/09/27 17:59:12 INFO mapred.JobClient:     Reduce input records=163643
14/09/27 17:59:12 INFO mapred.JobClient:     Reduce input groups=162631
14/09/27 17:59:12 INFO mapred.JobClient:     Combine output records=163643
14/09/27 17:59:12 INFO mapred.JobClient:     Physical memory (bytes) snapshot=6787436544
14/09/27 17:59:12 INFO mapred.JobClient:     Reduce output records=162631
14/09/27 17:59:12 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=43714277376
14/09/27 17:59:12 INFO mapred.JobClient:     Map output records=354282
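The counters at the end of the log follow a pattern: Map output records (354282) is larger than Combine output records (163643), which equals Reduce input records, and Reduce input groups (162631) equals Reduce output records. The following plain-Java sketch imitates that pipeline locally to show where each number comes from. It does not use Hadoop, and it assumes a count-style job (each map call emits an id with the value 1, and the combiner and reducer sum per id), which the source does not confirm for LogJob. The class name CounterSketch, the method names, and the ids are hypothetical, and a real combiner may run per spill rather than once per map task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CounterSketch {
    /** Sums values per key, as a count-style combiner or reducer would. */
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> record : records) {
            sums.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return sums;
    }

    /**
     * Simulates map -> combine -> reduce over the given input splits.
     * Returns {map output records, combine output records, reduce input groups}.
     */
    static int[] simulate(List<List<String>> splits) {
        int mapOutput = 0;
        int combineOutput = 0;
        List<Map.Entry<String, Integer>> reduceInput = new ArrayList<>();
        for (List<String> split : splits) {
            // Map phase (assumed): emit (id, 1) for every log line in the split.
            List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
            for (String id : split) {
                emitted.add(Map.entry(id, 1));
            }
            mapOutput += emitted.size();
            // Combine phase: pre-aggregate within one map task only.
            Map<String, Integer> combined = sumByKey(emitted);
            combineOutput += combined.size();
            reduceInput.addAll(combined.entrySet());
        }
        // Reduce phase: one group (and one output record) per distinct id.
        int reduceGroups = sumByKey(reduceInput).size();
        return new int[] {mapOutput, combineOutput, reduceGroups};
    }

    public static void main(String[] args) {
        // Two hypothetical input splits; each string stands for the id of one log line.
        int[] c = simulate(List.of(List.of("a", "a", "b"), List.of("a", "c", "c")));
        System.out.println("map=" + c[0] + " combine=" + c[1] + " groups=" + c[2]);
    }
}
```

For the two hypothetical splits this prints map=6 combine=4 groups=3: the id "a" occurs in both splits, so it survives the combiner twice and is merged into a single group only in the reduce phase, just as 163643 combined records collapse into 162631 groups in the real run.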
Now copy the whole output directory to the local disk and look at the result:
hdfs dfs -get /user/chenshu/share/output/
[chenshu@hadoopMaster output]$ ll -alh
total 4.3M
drwxrwxr-x 3 chenshu chenshu 4.0K Sep 27 18:02 .
drwxrwxr-x 6 chenshu chenshu 4.0K Sep 27 18:03 ..
drwxrwxr-x 3 chenshu chenshu 4.0K Sep 27 18:02 _logs
-rw-r--r-- 1 chenshu chenshu 286K Sep 27 18:02 part-r-00000
-rw-r--r-- 1 chenshu chenshu 290K Sep 27 18:02 part-r-00001
-rw-r--r-- 1 chenshu chenshu 284K Sep 27 18:02 part-r-00002
-rw-r--r-- 1 chenshu chenshu 284K Sep 27 18:02 part-r-00003
-rw-r--r-- 1 chenshu chenshu 289K Sep 27 18:02 part-r-00004
-rw-r--r-- 1 chenshu chenshu 284K Sep 27 18:02 part-r-00005
-rw-r--r-- 1 chenshu chenshu 290K Sep 27 18:02 part-r-00006
-rw-r--r-- 1 chenshu chenshu 284K Sep 27 18:02 part-r-00007
-rw-r--r-- 1 chenshu chenshu 285K Sep 27 18:02 part-r-00008
-rw-r--r-- 1 chenshu chenshu 286K Sep 27 18:02 part-r-00009
-rw-r--r-- 1 chenshu chenshu 290K Sep 27 18:02 part-r-00010
-rw-r--r-- 1 chenshu chenshu 288K Sep 27 18:02 part-r-00011
-rw-r--r-- 1 chenshu chenshu 287K Sep 27 18:02 part-r-00012
-rw-r--r-- 1 chenshu chenshu 281K Sep 27 18:02 part-r-00013
-rw-r--r-- 1 chenshu chenshu 286K Sep 27 18:02 part-r-00014
-rw-r--r-- 1 chenshu chenshu    0 Sep 27 18:02 _SUCCESS
Open one of the part-r files and you will see results like the following:
536dbba04700aab274729ce9 2
536dbba14700aab274729cf9 2
536dbba14700aab274729cff 2
536dbba14700aab274729d02 2
536dbba14700aab274729d11 2
536dbba14700aab274729d20 2
536dbba14700aab274729d89 2
536dbba14700aab274729d8f 2
536dbba14700aab274729d98 3
536dbba14700aab274729d9e 3
536dbba14700aab274729de9 2
536dbba14700aab274729def 2
536dbba14700aab274729df8 2
536dbba14700aab274729dfe 2
536dbba14700aab274729e8e 2
536dbba14700aab274729ed9 3
536dbba14700aab274729eee 2
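Since the result is spread over fifteen part-r files, a quick way to sanity-check it is to read them all back and count the records. The sketch below is one possible way to do that with the plain JDK. It assumes each line holds a key and an integer count separated by whitespace (Hadoop's TextOutputFormat uses a tab by default, but the source does not say which output format LogJob uses), and the class name PartFileSummary and the default directory name output are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PartFileSummary {
    /** Parses one "key<whitespace>count" line and returns the count. */
    static long parseCount(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 2) {
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
        return Long.parseLong(fields[1]);
    }

    /** Returns {number of records, sum of counts}; blank lines are skipped. */
    static long[] summarize(List<String> lines) {
        long records = 0;
        long total = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            records++;
            total += parseCount(line);
        }
        return new long[] {records, total};
    }

    /** Reads every part-r-* file in a local copy of the output directory. */
    static List<String> readPartFiles(Path dir) {
        List<String> lines = new ArrayList<>();
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(dir, "part-r-*")) {
            for (Path part : parts) {
                lines.addAll(Files.readAllLines(part));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Directory defaults to ./output (an assumption); pass another path as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "output");
        long[] summary = summarize(readPartFiles(dir));
        System.out.println("records=" + summary[0] + " sum of counts=" + summary[1]);
    }
}
```

If the job writes one line per reduce output record, the reported record count should match the Reduce output records counter from the job log (162631 in this run).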
The program ran successfully.
All the code is on gitlab.com:
git@gitlab.com:hadoop/share.git