Debugging Pig (explain & illustrate)

grunt> cat t.txt
kw1     2
kw3     1
kw2     4
kw1     5
kw2     2

$ cat test.pig
A = LOAD '/user/input/t.txt' as (k:chararray,c:int);
B = group A BY k;
C = foreach B generate group,SUM(A.c);
-- DUMP C;
store C into 'test.output';

$ pig -e 'illustrate -script test.pig'
2014-05-03 17:11:25,182 [main] INFO  org.apache.pig.Main - Logging error messages to: /opt/dataset/pig_1399108285179.log
2014-05-03 17:11:25,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:11:25,514 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
2014-05-03 17:11:26,103 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:11:26,104 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
2014-05-03 17:11:26,291 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,305 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,306 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,315 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,474 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-05-03 17:11:26,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-05-03 17:11:26,513 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,520 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,521 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,521 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,522 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,523 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,531 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
[... the nine log lines above, from MRCompiler through "Setting number of reducers to 1", repeat 15 more times (17:11:26,597 to 17:11:26,980), differing only in timestamps; a lone tuple (kw1,2) is printed after the seventh repetition ...]
-------------------------------------
| A     | k:chararray    | c:int    | 
-------------------------------------
|       | kw1            | 2        | 
|       | kw1            | 5        | 
-------------------------------------
-----------------------------------------------------------------------------
| B     | group:chararray    | A:bag{:tuple(k:chararray,c:int)}             | 
-----------------------------------------------------------------------------
|       | kw1                | {(kw1, 2), (kw1, 5)}                         | 
-----------------------------------------------------------------------------
-----------------------------------------
| C     | group:chararray    | :long    | 
-----------------------------------------
|       | kw1                | 7        | 
-----------------------------------------
-------------------------------------------------
| Store : C     | group:chararray    | :long    | 
-------------------------------------------------
|               | kw1                | 7        | 
-------------------------------------------------
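
The illustrate tables above only sample the kw1 rows. As a cross-check, here is a minimal plain-Python sketch of what the script computes over all five rows of t.txt. It is not Pig code; the function name `group_and_sum` and the sorting of the result are assumptions made for this illustration.

```python
from collections import defaultdict

# The five rows of t.txt, as printed by `cat t.txt` at the top of the post.
ROWS = [("kw1", 2), ("kw3", 1), ("kw2", 4), ("kw1", 5), ("kw2", 2)]


def group_and_sum(rows):
    """Mimic `B = group A BY k; C = foreach B generate group,SUM(A.c);`."""
    bags = defaultdict(list)  # relation B: key -> bag of (k, c) tuples
    for k, c in rows:
        bags[k].append((k, c))
    # Relation C: one (group, sum) tuple per key.
    # Sorting is only for stable output; Pig does not promise an order here.
    return sorted((k, sum(c for _, c in bag)) for k, bag in bags.items())


if __name__ == "__main__":
    for record in group_and_sum(ROWS):
        print(record)
```

For kw1 this gives 7, matching the C and Store tables produced by illustrate.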

$ pig -e 'explain -script test.pig'
2014-05-03 17:19:59,359 [main] INFO  org.apache.pig.Main - Logging error messages to: /opt/dataset/pig_1399108799355.log
2014-05-03 17:19:59,497 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:19:59,685 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
C: (Name: LOStore Schema: group#19:chararray,#34:long)
|
|---C: (Name: LOForEach Schema: group#19:chararray,#34:long)
    |   |
    |   (Name: LOGenerate[false,false] Schema: group#19:chararray,#34:long)ColumnPrune:InputUids=[19, 30]ColumnPrune:OutputUids=[34, 19]
    |   |   |
    |   |   group:(Name: Project Type: chararray Uid: 19 Input: 0 Column: (*))
    |   |   |
    |   |   (Name: UserFunc(org.apache.pig.builtin.IntSum) Type: long Uid: 34)
    |   |   |
    |   |   |---(Name: Dereference Type: bag Uid: 33 Column:[1])
    |   |       |
    |   |       |---A:(Name: Project Type: bag Uid: 30 Input: 1 Column: (*))
    |   |
    |   |---(Name: LOInnerLoad[0] Schema: group#19:chararray)
    |   |
    |   |---A: (Name: LOInnerLoad[1] Schema: k#19:chararray,c#20:int)
    |
    |---B: (Name: LOCogroup Schema: group#19:chararray,A#30:bag{#37:tuple(k#19:chararray,c#20:int)})
        |   |
        |   k:(Name: Project Type: chararray Uid: 19 Input: 0 Column: 0)
        |
        |---A: (Name: LOForEach Schema: k#19:chararray,c#20:int)
            |   |
            |   (Name: LOGenerate[false,false] Schema: k#19:chararray,c#20:int)ColumnPrune:InputUids=[19, 20]ColumnPrune:OutputUids=[19, 20]
            |   |   |
            |   |   (Name: Cast Type: chararray Uid: 19)
            |   |   |
            |   |   |---k:(Name: Project Type: bytearray Uid: 19 Input: 0 Column: (*))
            |   |   |
            |   |   (Name: Cast Type: int Uid: 20)
            |   |   |
            |   |   |---c:(Name: Project Type: bytearray Uid: 20 Input: 1 Column: (*))
            |   |
            |   |---(Name: LOInnerLoad[0] Schema: k#19:bytearray)
            |   |
            |   |---(Name: LOInnerLoad[1] Schema: c#20:bytearray)
            |
            |---A: (Name: LOLoad Schema: k#19:bytearray,c#20:bytearray)RequiredFields:null
#-----------------------------------------------
# Physical Plan:
#-----------------------------------------------
C: Store(hdfs://namenode:9000/user/deve_test_user/test.output:org.apache.pig.builtin.PigStorage) - scope-19
|
|---C: New For Each(false,false)[bag] - scope-18
    |   |
    |   Project[chararray][0] - scope-12
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum)[long] - scope-16
    |   |
    |   |---Project[bag][1] - scope-15
    |       |
    |       |---Project[bag][1] - scope-14
    |
    |---B: Package[tuple]{chararray} - scope-9
        |
        |---B: Global Rearrange[tuple] - scope-8
            |
            |---B: Local Rearrange[tuple]{chararray}(false) - scope-10
                |   |
                |   Project[chararray][0] - scope-11
                |
                |---A: New For Each(false,false)[bag] - scope-7
                    |   |
                    |   Cast[chararray] - scope-2
                    |   |
                    |   |---Project[bytearray][0] - scope-1
                    |   |
                    |   Cast[int] - scope-5
                    |   |
                    |   |---Project[bytearray][1] - scope-4
                    |
                    |---A: Load(/user/input/t.txt:org.apache.pig.builtin.PigStorage) - scope-0
2014-05-03 17:20:00,316 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:20:00,326 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2014-05-03 17:20:00,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:20:00,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-20
Map Plan
B: Local Rearrange[tuple]{chararray}(false) - scope-33
|   |
|   Project[chararray][0] - scope-34
|
|---C: New For Each(false,false)[bag] - scope-21
    |   |
    |   Project[chararray][0] - scope-22
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum$Initial)[tuple] - scope-23
    |   |
    |   |---Project[bag][1] - scope-24
    |       |
    |       |---Project[bag][1] - scope-25
    |
    |---Pre Combiner Local Rearrange[tuple]{Unknown} - scope-35
        |
        |---A: New For Each(false,false)[bag] - scope-7
            |   |
            |   Cast[chararray] - scope-2
            |   |
            |   |---Project[bytearray][0] - scope-1
            |   |
            |   Cast[int] - scope-5
            |   |
            |   |---Project[bytearray][1] - scope-4
            |
            |---A: Load(/user/input/t.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
Combine Plan
B: Local Rearrange[tuple]{chararray}(false) - scope-37
|   |
|   Project[chararray][0] - scope-38
|
|---C: New For Each(false,false)[bag] - scope-26
    |   |
    |   Project[chararray][0] - scope-27
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum$Intermediate)[tuple] - scope-28
    |   |
    |   |---Project[bag][1] - scope-29
    |
    |---POCombinerPackage[tuple]{chararray} - scope-31--------
Reduce Plan
C: Store(hdfs://namenode:9000/user/deve_test_user/test.output:org.apache.pig.builtin.PigStorage) - scope-19
|
|---C: New For Each(false,false)[bag] - scope-18
    |   |
    |   Project[chararray][0] - scope-12
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum$Final)[long] - scope-16
    |   |
    |   |---Project[bag][1] - scope-30
    |
    |---POCombinerPackage[tuple]{chararray} - scope-39--------
Global sort: false
----------------
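
In the Map Reduce Plan above, SUM is split into three stages: IntSum$Initial in the map plan, IntSum$Intermediate in the combine plan and IntSum$Final in the reduce plan (the CombinerOptimizer log line "Choosing to move algebraic foreach to combiner" announces this). The Python sketch below only illustrates the idea of such a staged sum. It is not Pig's implementation, and the function names and the split of t.txt across two map tasks are hypothetical (a 30-byte input would normally go to a single map task).

```python
from collections import defaultdict


def initial(rows):
    """Map side (the IntSum$Initial role): each (k, c) row becomes a partial sum."""
    return [(k, c) for k, c in rows]


def intermediate(partials):
    """Combiner (the IntSum$Intermediate role): merge partial sums per key."""
    merged = defaultdict(int)
    for k, s in partials:
        merged[k] += s
    return list(merged.items())


def final(partials):
    """Reducer (the IntSum$Final role): merge whatever the combiners emitted."""
    return dict(intermediate(partials))


# Hypothetical split of the five t.txt rows across two map tasks.
split_1 = [("kw1", 2), ("kw3", 1), ("kw2", 4)]
split_2 = [("kw1", 5), ("kw2", 2)]

# Each map task combines locally, then the reducer merges the partial sums.
combined = intermediate(initial(split_1)) + intermediate(initial(split_2))
result = final(combined)

if __name__ == "__main__":
    print(result)
```

Because addition is associative, the staged result equals a plain sum per key, which is why the combiner can pre-aggregate on the map side without changing the output.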