Installing and Using Pig on Linux
程序员文章站
2022-03-20 22:06:34
0. Prerequisites: Hadoop servers
10.156.50.35 yanfabu2-35.base.app.dev.yf zk1 hadoop1 master1 master
10.156.50.36 yanfabu2-36.base.app.dev.yf zk2 hadoop2 master2
10.156.50.37 yanfabu2-37.base.app.dev.yf zk3 hadoop3 slaver1
2. Extract Pig
tar xf pig-0.17.0.tar.gz
mv pig-0.17.0 pig
vim ~/.bash_profile
Add these two lines to the file:
export PIG_HOME=/home/zkkafka/pig
export PATH=$PATH:$PIG_HOME/bin
Then reload the profile and copy it to the other node:
source ~/.bash_profile
scp -r ~/.bash_profile zkkafka@10.156.50.36:/home/zkkafka/
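After reloading the profile, it can be worth confirming that the two variables took effect. The function below is a minimal sketch; check_pig_env is a hypothetical helper name, not something shipped with Pig, and it only inspects PIG_HOME and PATH.

```shell
# Hypothetical helper (not part of Pig): report whether PIG_HOME and PATH look right.
check_pig_env() {
    # PIG_HOME must be set to a non-empty value
    if [ -z "${PIG_HOME:-}" ]; then
        echo "PIG_HOME is not set"
        return 1
    fi
    # The launcher script is expected at $PIG_HOME/bin/pig
    if [ ! -x "$PIG_HOME/bin/pig" ]; then
        echo "no executable at $PIG_HOME/bin/pig"
        return 1
    fi
    # $PIG_HOME/bin must appear as one of the PATH entries
    case ":$PATH:" in
        *":$PIG_HOME/bin:"*) echo "ok" ;;
        *) echo "$PIG_HOME/bin is not on PATH"; return 1 ;;
    esac
}
# Usage, after "source ~/.bash_profile":  check_pig_env
```

If the function prints anything other than ok, the export lines in ~/.bash_profile are the first place to look.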
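3. Edit the configuration file
vim pig.properties
Add the following entries. Each comment sits on its own line, because a # after a value in a properties file becomes part of the value:
# core-site setting
fs.default.name=hdfs://master
# mapred-site setting (job history)
mapred.job.tracker=master1:10020
Copy the configuration directory to the other nodes:
scp -r ../conf/ zkkafka@10.156.50.36:/home/zkkafka/pig/conf/
scp -r ../conf/ zkkafka@10.156.50.37:/home/zkkafka/pig/conf/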
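The log output in steps 4 and 6 reports both keys as deprecated and names their replacements: fs.defaultFS and mapreduce.jobtracker.address. As a rough sketch, the rename could be previewed with sed before editing the file. modernize_keys is a hypothetical helper name, and whether the renamed keys suit a given cluster is an assumption that should be checked against its Hadoop configuration.

```shell
# Hypothetical helper: print a properties file with the two deprecated keys
# replaced by the names suggested in the deprecation messages.
modernize_keys() {
    sed -e 's/^fs\.default\.name=/fs.defaultFS=/' \
        -e 's/^mapred\.job\.tracker=/mapreduce.jobtracker.address=/' "$1"
}

# Demo on a throwaway copy of the two settings from this step.
demo=$(mktemp)
printf 'fs.default.name=hdfs://master\nmapred.job.tracker=master1:10020\n' > "$demo"
modernize_keys "$demo"
rm -f "$demo"
```

The function only writes to standard output, so the original pig.properties stays untouched until the result has been reviewed.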
4. Check the Pig version
pig -version
[zkkafka@yanfabu2-35 pig]$ pig -version
19/06/05 19:58:19 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Apache Pig version 0.17.0 (r1797386)
compiled Jun 02 2017, 15:41:58
5. Prepare the data
vim tel.txt
1363157985066 13726230503 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 24 27 2481 24681 200
hdfs dfs -mkdir -p /hdfs/pig/
hdfs dfs -put /home/zkkafka/pig/data/tel.txt /hdfs/pig/
hdfs dfs -lsr /hdfs/pig
[zkkafka@yanfabu2-35 conf]$ hdfs dfs -lsr /hdfs/pig
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r-- 2 zkkafka supergroup 2546 2019-06-05 21:03 /hdfs/pig/tel.txt
6. Start the Pig shell
[zkkafka@yanfabu2-37 ~]$ pig
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2019-06-06 16:44:27,558 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2019-06-06 16:44:27,558 [main] INFO org.apache.pig.Main - Logging error messages to: /home/zkkafka/pig_1559810667556.log
2019-06-06 16:44:27,605 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/zkkafka/.pigbootup not found
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-06-06 16:44:28,312 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-06-06 16:44:28,312 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master/
2019-06-06 16:44:28,859 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-3d2427ca-7fdf-4252-ab78-cfb6ed2be36e
2019-06-06 16:44:28,859 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
7. Using Pig
7.1 Load the data into a Pig relation
t_wlan = LOAD '/hdfs/pig/tel.txt' USING PigStorage('\t') AS (t0:long, msisdn:chararray, t2:chararray, t3:chararray, t4:chararray, t5:chararray, t6:long, t7:long, t8:long, t9:long, t10:chararray);
7.2 Query the relation t_wlan
dump t_wlan;
grunt> dump t_wlan;
2019-06-06 16:59:05,805 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2019-06-06 16:59:05,840 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 16:59:05,840 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 16:59:05,847 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 16:59:05,848 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 16:59:05,848 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 16:59:05,880 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 16:59:05,881 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 16:59:05,883 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 16:59:06,472 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-489322267/pig-0.17.0-core-h2.jar
2019-06-06 16:59:06,598 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp1532488090/automaton-1.11-8.jar
2019-06-06 16:59:07,094 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp731737639/antlr-runtime-3.4.jar
2019-06-06 16:59:07,190 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp-2081706505/joda-time-2.9.3.jar
2019-06-06 16:59:07,192 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 16:59:07,192 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 16:59:07,193 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 16:59:07,193 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 16:59:07,202 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 16:59:07,264 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-06-06 16:59:07,286 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 16:59:07,289 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 16:59:07,289 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 16:59:07,291 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 16:59:07,487 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 16:59:07,590 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0014
2019-06-06 16:59:07,598 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 16:59:07,856 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0014
2019-06-06 16:59:07,862 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0014/
2019-06-06 16:59:07,862 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0014
2019-06-06 16:59:07,862 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan
2019-06-06 16:59:07,862 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan[-1,-1] C: R:
2019-06-06 16:59:07,872 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 16:59:07,873 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0014]
2019-06-06 16:59:20,161 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 16:59:20,161 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0014]
2019-06-06 16:59:23,200 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,409 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,505 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,573 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 16:59:23,574 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.5 0.17.0 zkkafka 2019-06-06 16:59:05 2019-06-06 16:59:23 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1559370613628_0014 1 0 4 4 4 4 0 0 0 0 t_wlan MAP_ONLY hdfs://master/tmp/temp-1906860032/tmp1645766804,
Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"
Output(s):
Successfully stored 1 records (106 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp1645766804"
Counters:
Total records written : 1
Total bytes written : 106
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1559370613628_0014
2019-06-06 16:59:23,582 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,639 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,698 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,753 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 1 time(s).
2019-06-06 16:59:23,753 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 16:59:23,755 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 16:59:23,764 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 16:59:23,764 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1363157985066,13726230503,00-FD-07-A4-72-B8:CMCC,120.196.100.82,i02.c.aliimg.com,24,27,2481,24681,200,)
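The dumped tuple ends with an empty field, and the log reports ACCESSING_NON_EXISTENT_FIELD once. A likely explanation, judging from the sample row, is that the LOAD schema declares 11 fields (t0 through t10) while the row carries only 10 values, so Pig fills the last one with null. Counting the fields per line before loading makes this visible. count_fields is a hypothetical helper name, and the sketch assumes the file is tab-separated, as the PigStorage('\t') call implies.

```shell
# Hypothetical helper: print the number of tab-separated fields on each line.
count_fields() {
    awk -F'\t' '{ print NF }' "$1"
}

# Demo with the sample row from step 5, written with explicit tabs.
sample=$(mktemp)
printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200\n' > "$sample"
count_fields "$sample"
rm -f "$sample"
```

If every line reports 10, dropping t10 from the schema should remove the warning and the trailing empty field.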
7.2 Extract columns from relation A into relation B
t_wlan_simple = FOREACH t_wlan GENERATE msisdn, t6, t7, t8, t9;
dump t_wlan_simple;
grunt> dump t_wlan_simple;
2019-06-06 17:03:42,827 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2019-06-06 17:03:42,869 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 17:03:42,870 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:03:42,884 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:03:42,891 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:03:42,893 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:03:42,893 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:03:42,923 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:03:42,923 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:03:42,924 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:03:43,081 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp1408006038/pig-0.17.0-core-h2.jar
2019-06-06 17:03:43,178 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp1149486211/automaton-1.11-8.jar
2019-06-06 17:03:43,281 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp1835019327/antlr-runtime-3.4.jar
2019-06-06 17:03:43,378 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp2065709292/joda-time-2.9.3.jar
2019-06-06 17:03:43,382 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:03:43,383 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:03:43,383 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:03:43,383 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:03:43,399 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:03:43,481 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:03:43,510 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:03:43,519 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:03:43,519 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:03:43,522 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:03:44,131 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:03:44,228 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0015
2019-06-06 17:03:44,232 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:03:44,471 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0015
2019-06-06 17:03:44,475 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0015/
2019-06-06 17:03:44,475 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0015
2019-06-06 17:03:44,475 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple
2019-06-06 17:03:44,475 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1] C: R:
2019-06-06 17:03:44,480 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:03:44,480 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0015]
2019-06-06 17:03:58,648 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:03:58,649 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0015]
2019-06-06 17:04:04,679 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:04,910 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:04,977 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,043 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:04:05,044 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.5 0.17.0 zkkafka 2019-06-06 17:03:42 2019-06-06 17:04:05 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1559370613628_0015 1 0 4 4 4 4 0 0 0 0 t_wlan,t_wlan_simple MAP_ONLY hdfs://master/tmp/temp-1906860032/tmp1236017200,
Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"
Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp1236017200"
Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1559370613628_0015
2019-06-06 17:04:05,058 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,137 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,223 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,335 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:04:05,337 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:04:05,382 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:04:05,382 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
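To make the projection concrete, the same column selection can be reproduced locally with awk, outside of Pig. This is only an illustration, not how Pig executes the statement. project_wlan is a hypothetical name, and the sketch assumes tab-separated input where msisdn is the 2nd field and t6 through t9 are the 7th through 10th.

```shell
# Hypothetical helper: keep msisdn (field 2) and t6..t9 (fields 7..10),
# mirroring FOREACH t_wlan GENERATE msisdn, t6, t7, t8, t9.
project_wlan() {
    awk -F'\t' -v OFS=',' '{ print $2, $7, $8, $9, $10 }' "$1"
}

# Demo with the sample row from step 5, written with explicit tabs.
sample=$(mktemp)
printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200\n' > "$sample"
project_wlan "$sample"   # prints 13726230503,27,2481,24681,200
rm -f "$sample"
```

The printed values match the tuple that dump t_wlan_simple returns, which is a quick way to confirm the field positions in the schema.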
7.3 Group the data
t_wlan_simple_group = GROUP t_wlan_simple BY msisdn;
dump t_wlan_simple_group;
grunt> dump t_wlan_simple_group;
2019-06-06 17:06:28,589 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2019-06-06 17:06:28,640 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 17:06:28,641 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:06:28,646 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:06:28,661 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:06:28,674 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:06:28,674 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:06:28,715 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:06:28,716 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:06:28,717 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-06 17:06:28,723 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-06 17:06:28,729 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-06 17:06:28,729 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-06 17:06:28,730 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:06:28,929 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-412980928/pig-0.17.0-core-h2.jar
2019-06-06 17:06:29,039 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp-1182557529/automaton-1.11-8.jar
2019-06-06 17:06:29,543 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp-1112811524/antlr-runtime-3.4.jar
2019-06-06 17:06:30,043 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp432932811/joda-time-2.9.3.jar
2019-06-06 17:06:30,046 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:06:30,047 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:06:30,047 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:06:30,047 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:06:30,111 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:06:30,174 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:06:30,189 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:06:30,191 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:06:30,191 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:06:30,193 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:06:30,391 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:06:30,488 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0016
2019-06-06 17:06:30,492 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:06:30,734 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0016
2019-06-06 17:06:30,738 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0016/
2019-06-06 17:06:30,738 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0016
2019-06-06 17:06:30,738 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group
2019-06-06 17:06:30,738 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1],t_wlan_simple_group[6,22] C: R:
2019-06-06 17:06:30,745 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:06:30,745 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:44,943 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:06:44,943 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:50,964 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:55,983 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,181 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,283 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,335 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:06:56,335 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.5 0.17.0 zkkafka 2019-06-06 17:06:28 2019-06-06 17:06:56 GROUP_BY
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1559370613628_0016 1 1 4 4 4 4 4 4 4 4 t_wlan,t_wlan_simple,t_wlan_simple_group GROUP_BY hdfs://master/tmp/temp-1906860032/tmp912427234,
Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"
Output(s):
Successfully stored 1 records (46 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp912427234"
Counters:
Total records written : 1
Total bytes written : 46
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1559370613628_0016
2019-06-06 17:06:56,345 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,403 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,474 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,554 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:06:56,556 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:06:56,568 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:06:56,568 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,{(13726230503,27,2481,24681,200)})
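The statements typed into grunt in 7.1 through 7.3 can also be saved to a file and run in batch mode by passing the file to pig. The sketch below writes such a file and runs it only when pig is on the PATH. The file name wlan.pig and the helper write_wlan_script are hypothetical, and the HDFS path is the one used in step 5.

```shell
# Hypothetical helper: write the statements from 7.1-7.3 to a Pig script file.
write_wlan_script() {
    # The quoted heredoc delimiter keeps '\t' and other characters literal.
    cat > "$1" <<'EOF'
t_wlan = LOAD '/hdfs/pig/tel.txt' USING PigStorage('\t') AS (t0:long, msisdn:chararray, t2:chararray, t3:chararray, t4:chararray, t5:chararray, t6:long, t7:long, t8:long, t9:long, t10:chararray);
t_wlan_simple = FOREACH t_wlan GENERATE msisdn, t6, t7, t8, t9;
t_wlan_simple_group = GROUP t_wlan_simple BY msisdn;
DUMP t_wlan_simple_group;
EOF
}

script="${TMPDIR:-/tmp}/wlan.pig"
write_wlan_script "$script"

# Run the script only if the pig launcher is available on this machine.
if command -v pig >/dev/null 2>&1; then
    pig "$script"
else
    echo "pig not found; script written to $script"
fi
```

Because Pig evaluates lazily, only the final DUMP triggers a MapReduce job, so the batch run submits one job instead of one per dump.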
7.4 Sum the traffic
t_wlan_simple_group_sum = FOREACH t_wlan_simple_group GENERATE group, SUM(t_wlan_simple.t6), SUM(t_wlan_simple.t7), SUM(t_wlan_simple.t8), SUM(t_wlan_simple.t9);
dump t_wlan_simple_group_sum;
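As a cross-check on what the GROUP plus SUM combination computes, the same per-msisdn totals can be produced locally with awk. This is a sketch for small files only and is not how Pig runs the job. sum_by_msisdn is a hypothetical name, and the second input row in the demo is invented so that the sums differ from a single row.

```shell
# Hypothetical helper: total fields 7..10 (t6..t9) per msisdn (field 2).
sum_by_msisdn() {
    awk -F'\t' -v OFS=',' '
        { s6[$2] += $7; s7[$2] += $8; s8[$2] += $9; s9[$2] += $10 }
        END { for (k in s6) print k, s6[k], s7[k], s8[k], s9[k] }
    ' "$1"
}

# Demo: the sample row from step 5 plus one made-up row for the same msisdn.
sample=$(mktemp)
printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200\n' > "$sample"
printf '1363157985067\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t1\t1\t2\t3\t4\n' >> "$sample"
sum_by_msisdn "$sample"   # prints 13726230503,28,2483,24684,204
rm -f "$sample"
```

With several distinct msisdn values the output order is unspecified, as it is for awk's for-in loop.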
grunt> dump t_wlan_simple_group_sum
2019-06-06 17:15:39,824 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2019-06-06 17:15:39,877 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:15:39,878 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:15:39,885 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:15:39,904 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:15:39,908 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2019-06-06 17:15:39,972 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:15:39,972 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:15:40,000 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:15:40,001 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:15:40,002 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-06 17:15:40,002 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-06 17:15:40,005 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-06 17:15:40,005 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-06 17:15:40,005 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:15:40,602 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-784677978/pig-0.17.0-core-h2.jar
2019-06-06 17:15:40,699 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp-1113714067/automaton-1.11-8.jar
2019-06-06 17:15:40,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp-1701171835/antlr-runtime-3.4.jar
2019-06-06 17:15:40,910 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp-725132195/joda-time-2.9.3.jar
2019-06-06 17:15:40,914 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:15:40,915 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:15:40,915 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:15:40,915 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:15:40,968 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:15:41,035 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:15:41,055 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:15:41,057 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:15:41,057 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:15:41,060 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:15:41,282 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:15:41,432 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0018
2019-06-06 17:15:41,438 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:15:41,686 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0018 2019-06-06 17:15:41,691 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0018/ 2019-06-06 17:15:41,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0018 2019-06-06 17:15:41,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum 2019-06-06 17:15:41,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1],t_wlan_simple_group_sum[7,26],t_wlan_simple_group[6,22] C: t_wlan_simple_group_sum[7,26],t_wlan_simple_group[6,22] R: t_wlan_simple_group_sum[7,26] 2019-06-06 17:15:41,698 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 2019-06-06 17:15:41,698 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018] 2019-06-06 17:15:55,903 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete 2019-06-06 17:15:55,903 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018] 2019-06-06 17:16:00,962 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018] 2019-06-06 17:16:06,981 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. 
Redirecting to job history server 2019-06-06 17:16:07,185 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-06 17:16:07,257 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-06 17:16:07,332 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete 2019-06-06 17:16:07,333 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: HadoopVersion PigVersion UserId StartedAt FinishedAt Features 2.6.5 0.17.0 zkkafka 2019-06-06 17:15:39 2019-06-06 17:16:07 GROUP_BY Success! Job Stats (time in seconds): JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs job_1559370613628_0018 1 1 3 3 3 3 3 3 3 3 t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum GROUP_BY,COMBINER hdfs://master/tmp/temp-1906860032/tmp2100428296, Input(s): Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt" Output(s): Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp2100428296" Counters: Total records written : 1 Total bytes written : 29 Spillable Memory Manager spill count : 0 Total bags proactively spilled: 0 Total records proactively spilled: 0 Job DAG: job_1559370613628_0018 2019-06-06 17:16:07,343 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-06 17:16:07,402 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. 
Redirecting to job history server
2019-06-06 17:16:07,456 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,512 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:16:07,513 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:16:07,529 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:16:07,529 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
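The grouped sum can be checked outside the cluster. Below is a minimal Python sketch of what the GROUP plus SUM step computes, not how Pig runs it. It assumes the record is tab-separated, that the phone number is field 1, and that t6 to t9 are fields 6 to 9 (this matches the "Columns pruned" log line and the tuple printed by DUMP). The function name group_and_sum is hypothetical.

```python
from collections import defaultdict

# Sample record from tel.txt; the tab delimiter is an assumption.
SAMPLE = (
    "1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\t"
    "i02.c.aliimg.com\t24\t27\t2481\t24681\t200"
)


def group_and_sum(lines, key_index=1, value_indexes=(6, 7, 8, 9)):
    """Group records by one field and sum the selected numeric fields per group."""
    totals = defaultdict(lambda: [0] * len(value_indexes))
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        key = fields[key_index]
        for slot, index in enumerate(value_indexes):
            totals[key][slot] += int(fields[index])
    # Freeze the running totals into tuples, one per group key.
    return {key: tuple(values) for key, values in totals.items()}


print(group_and_sum([SAMPLE]))
```

With a single input record each sum equals the record's own value, which is why the DUMP result repeats the numbers from tel.txt.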
7.5 Storing the result in HDFS
STORE t_wlan_simple_group_sum INTO '/hdfs/pig/wlan_result';
[zkkafka@yanfabu2-36 ~]$ hdfs dfs -text /hdfs/pig/wlan_result/part-r-00000
13726230503 27 2481 24681 200
[zkkafka@yanfabu2-36 ~]$
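STORE without a USING clause writes through PigStorage, whose default field delimiter is a tab, so the part file can be read back by any tool that splits on tabs. The following Python sketch parses such text (for example the output of the hdfs dfs -text command above) into tuples. The function name parse_pig_output is hypothetical, and the sketch assumes every column after the first is an integer, as in this result.

```python
def parse_pig_output(text, delimiter="\t"):
    """Turn PigStorage text output into (group, sum, sum, ...) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        group, *sums = line.split(delimiter)
        rows.append((group, *(int(value) for value in sums)))
    return rows


# One line as stored in part-r-00000, assuming tab-separated fields.
SAMPLE_OUTPUT = "13726230503\t27\t2481\t24681\t200\n"
print(parse_pig_output(SAMPLE_OUTPUT))
```

Note that STORE normally fails if the target directory (here /hdfs/pig/wlan_result) already exists, so remove it before re-running the statement.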
7.6 Sorting
t_wlan_simple_group_sum_group = ORDER t_wlan_simple_group_sum BY group;
DUMP t_wlan_simple_group_sum_group;
grunt> DUMP t_wlan_simple_group_sum_group; 2019-06-12 15:35:33,188 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,ORDER_BY 2019-06-12 15:35:33,235 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code. 2019-06-12 15:35:33,236 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]} 2019-06-12 15:35:33,242 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10 2019-06-12 15:35:33,255 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? 
false 2019-06-12 15:35:33,280 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner 2019-06-12 15:35:33,291 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SecondaryKeyOptimizerMR - Using Secondary Key Optimization for MapReduce node scope-283 2019-06-12 15:35:33,292 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3 2019-06-12 15:35:33,292 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3 2019-06-12 15:35:33,328 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job 2019-06-12 15:35:33,329 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3 2019-06-12 15:35:33,330 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers. 
2019-06-12 15:35:33,330 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator 2019-06-12 15:35:33,332 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102 2019-06-12 15:35:33,333 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1 2019-06-12 15:35:33,333 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process 2019-06-12 15:35:33,510 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-955805369/pig-0.17.0-core-h2.jar 2019-06-12 15:35:33,595 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp712002240/automaton-1.11-8.jar 2019-06-12 15:35:34,074 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp1938988919/antlr-runtime-3.4.jar 2019-06-12 15:35:34,154 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp1704097364/joda-time-2.9.3.jar 2019-06-12 15:35:34,157 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job 2019-06-12 15:35:34,158 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key 
[pig.schematuple] is false, will not generate code. 2019-06-12 15:35:34,158 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche 2019-06-12 15:35:34,158 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize [] 2019-06-12 15:35:34,193 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission. 2019-06-12 15:35:34,256 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String). 2019-06-12 15:35:34,277 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat 2019-06-12 15:35:34,288 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1 2019-06-12 15:35:34,289 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1 2019-06-12 15:35:34,291 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1 2019-06-12 15:35:34,450 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1 2019-06-12 15:35:34,952 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0024 2019-06-12 15:35:34,960 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources. 
2019-06-12 15:35:35,211 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0024 2019-06-12 15:35:35,216 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0024/ 2019-06-12 15:35:35,216 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0024 2019-06-12 15:35:35,216 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum 2019-06-12 15:35:35,216 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[1,9],t_wlan_simple[-1,-1],t_wlan_simple_group_sum[4,26],t_wlan_simple_group[3,22] C: t_wlan_simple_group_sum[4,26],t_wlan_simple_group[3,22] R: t_wlan_simple_group_sum[4,26] 2019-06-12 15:35:35,231 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 2019-06-12 15:35:35,231 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024] 2019-06-12 15:35:47,386 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete 2019-06-12 15:35:47,386 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024] 2019-06-12 15:35:54,902 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete 2019-06-12 15:35:54,902 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024] 2019-06-12 15:36:00,424 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. 
FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:00,596 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:00,651 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:00,688 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job 2019-06-12 15:36:00,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3 2019-06-12 15:36:00,689 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers. 2019-06-12 15:36:00,689 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator 2019-06-12 15:36:00,698 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=29 2019-06-12 15:36:00,699 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1 2019-06-12 15:36:00,699 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process 2019-06-12 15:36:01,245 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-87045202/pig-0.17.0-core-h2.jar 2019-06-12 15:36:01,308 [main] INFO 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp568012746/automaton-1.11-8.jar 2019-06-12 15:36:01,405 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp780878190/antlr-runtime-3.4.jar 2019-06-12 15:36:01,485 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp772462384/joda-time-2.9.3.jar 2019-06-12 15:36:01,487 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job 2019-06-12 15:36:01,487 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code. 2019-06-12 15:36:01,487 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche 2019-06-12 15:36:01,487 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize [] 2019-06-12 15:36:01,508 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission. 2019-06-12 15:36:01,559 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String). 
2019-06-12 15:36:01,582 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1 2019-06-12 15:36:01,582 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1 2019-06-12 15:36:01,582 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1 2019-06-12 15:36:01,749 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1 2019-06-12 15:36:02,233 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0025 2019-06-12 15:36:02,237 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources. 2019-06-12 15:36:02,472 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0025 2019-06-12 15:36:02,476 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0025/ 2019-06-12 15:36:02,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0025 2019-06-12 15:36:02,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan_simple_group_sum_group 2019-06-12 15:36:02,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan_simple_group_sum_group[6,32] C: R: 2019-06-12 15:36:16,558 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete 2019-06-12 15:36:16,558 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0025] 2019-06-12 15:36:24,572 [main] INFO 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete 2019-06-12 15:36:24,572 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0025] 2019-06-12 15:36:27,589 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:27,756 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:27,814 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:27,850 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job 2019-06-12 15:36:27,850 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3 2019-06-12 15:36:27,854 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers. 
2019-06-12 15:36:27,854 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1 2019-06-12 15:36:27,854 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process 2019-06-12 15:36:27,995 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-1238945561/pig-0.17.0-core-h2.jar 2019-06-12 15:36:28,103 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp1385874378/automaton-1.11-8.jar 2019-06-12 15:36:28,223 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp2107107107/antlr-runtime-3.4.jar 2019-06-12 15:36:28,297 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp-637573401/joda-time-2.9.3.jar 2019-06-12 15:36:28,301 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job 2019-06-12 15:36:28,302 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code. 
2019-06-12 15:36:28,302 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche 2019-06-12 15:36:28,302 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize [] 2019-06-12 15:36:28,374 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission. 2019-06-12 15:36:28,445 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String). 2019-06-12 15:36:28,465 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1 2019-06-12 15:36:28,465 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1 2019-06-12 15:36:28,465 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1 2019-06-12 15:36:28,599 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1 2019-06-12 15:36:28,675 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0026 2019-06-12 15:36:28,679 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources. 
2019-06-12 15:36:28,918 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0026 2019-06-12 15:36:28,921 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0026/ 2019-06-12 15:36:28,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0026 2019-06-12 15:36:28,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan_simple_group_sum_group 2019-06-12 15:36:28,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan_simple_group_sum_group[6,32] C: R: 2019-06-12 15:36:44,145 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete 2019-06-12 15:36:44,146 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0026] 2019-06-12 15:36:51,164 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0026] 2019-06-12 15:36:54,180 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:54,330 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:54,369 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. 
Redirecting to job history server 2019-06-12 15:36:54,401 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete 2019-06-12 15:36:54,527 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: HadoopVersion PigVersion UserId StartedAt FinishedAt Features 2.6.5 0.17.0 zkkafka 2019-06-12 15:35:33 2019-06-12 15:36:54 GROUP_BY,ORDER_BY Success! Job Stats (time in seconds): JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs job_1559370613628_0024 1 1 3 3 3 3 4 4 4 4 t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum GROUP_BY,COMBINER job_1559370613628_0025 1 1 5 5 5 5 5 5 5 5 t_wlan_simple_group_sum_group SAMPLER job_1559370613628_0026 1 1 3 3 3 3 4 4 4 4 t_wlan_simple_group_sum_group ORDER_BY hdfs://master/tmp/temp1544583298/tmp-717585849, Input(s): Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt" Output(s): Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp1544583298/tmp-717585849" Counters: Total records written : 1 Total bytes written : 29 Spillable Memory Manager spill count : 0 Total bags proactively spilled: 0 Total records proactively spilled: 0 Job DAG: job_1559370613628_0024 -> job_1559370613628_0025, job_1559370613628_0025 -> job_1559370613628_0026, job_1559370613628_0026 2019-06-12 15:36:54,532 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:54,584 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 2019-06-12 15:36:54,623 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. 
Redirecting to job history server
2019-06-12 15:36:54,664 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,702 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,735 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,776 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,836 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,871 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,928 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-12 15:36:54,929 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-12 15:36:54,934 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:54,934 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
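The job statistics for this DUMP list three MapReduce jobs: the GROUP_BY job, a SAMPLER job, and the ORDER_BY job. The Python sketch below illustrates the general idea behind a sample-then-sort plan: sample the keys, derive range boundaries, route each row to a range partition, sort every partition, and concatenate. It is a simplified illustration and not Pig's actual implementation; all function names and the partition count are hypothetical.

```python
import bisect
import random


def pick_boundaries(keys, num_partitions, sample_size=100, seed=0):
    """Sample the keys and pick num_partitions - 1 cut points from the sample."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    return [sample[(i * len(sample)) // num_partitions] for i in range(1, num_partitions)]


def range_partition_sort(rows, num_partitions=3, key=lambda row: row[0]):
    """Globally sort rows by sorting range partitions independently."""
    if not rows:
        return []
    boundaries = pick_boundaries([key(row) for row in rows], num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # bisect_right sends every key to the partition covering its range.
        partitions[bisect.bisect_right(boundaries, key(row))].append(row)
    ordered = []
    for partition in partitions:
        ordered.extend(sorted(partition, key=key))
    return ordered


# Made-up rows in the (group, sum, sum, sum, sum) shape of the result above.
rows = [
    ("13926435656", 2, 4, 132, 1512),
    ("13726230503", 27, 2481, 24681, 200),
    ("13826544101", 4, 0, 264, 0),
]
print(range_partition_sort(rows))
```

Because the partitions cover disjoint, increasing key ranges, each one can be sorted by a separate reducer and the concatenated output is still globally ordered.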
8. Running as a script
pig -x mapreduce t_wlan.pig
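When the script is launched from another program, the command line can be assembled and run as in the following Python sketch. It assumes the pig launcher is on PATH (as configured in ~/.bash_profile earlier) and that t_wlan.pig contains the statements from section 7. The -x and -param options are standard Pig command-line options; the helper names and the parameter name input are hypothetical, and -param only has an effect if the script references $input.

```python
import subprocess


def build_pig_command(script, exec_type="mapreduce", params=None):
    """Build the argument list for the pig launcher."""
    command = ["pig", "-x", exec_type]
    for name, value in (params or {}).items():
        command += ["-param", f"{name}={value}"]
    command.append(script)
    return command


def run_pig_script(script, **kwargs):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_pig_command(script, **kwargs), check=True)


# Same command as above; nothing is executed until run_pig_script is called.
print(" ".join(build_pig_command("t_wlan.pig")))
```

Passing exec_type="local" runs the same script against the local file system, which is convenient for trying it out before submitting to the cluster.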
Author's homepage: http://knight-black-bob.iteye.com/