
Installing and using Pig on Linux

程序员文章站 2022-03-20 22:06:34

 

0. Prerequisites: Hadoop servers

10.156.50.35 yanfabu2-35.base.app.dev.yf zk1  hadoop1 master1 master
10.156.50.36 yanfabu2-36.base.app.dev.yf zk2  hadoop2 master2
10.156.50.37 yanfabu2-37.base.app.dev.yf zk3  hadoop3 slaver1

 

2. Extract Pig

 tar xf pig-0.17.0.tar.gz 
 mv pig-0.17.0 pig

vim ~/.bash_profile

export PIG_HOME=/home/zkkafka/pig
export PATH=$PATH:$PIG_HOME/bin

source ~/.bash_profile
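
A quick way to confirm the profile change took effect is to check that $PIG_HOME/bin really is on PATH. The snippet below is a minimal sketch: it sets the two variables itself so it can run on its own, and the on_path variable is a name chosen here for illustration. In a real session, drop the two assignments and run only the check after sourcing the profile.

```shell
# Same values as the two export lines above (set here only to keep the sketch self-contained).
PIG_HOME=/home/zkkafka/pig
PATH="$PATH:$PIG_HOME/bin"

# Wrap PATH in colons so the pattern also matches the first and last entries.
case ":$PATH:" in
  *":$PIG_HOME/bin:"*) on_path=yes ;;
  *)                   on_path=no  ;;
esac
echo "PIG_HOME/bin on PATH: $on_path"
```

If the check reports no in a real session, the profile was most likely not re-read in the current shell.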

scp -r ~/.bash_profile  zkkafka@10.156.50.36:/home/zkkafka/

 

3. Edit the configuration file

vim pig.properties

# core-site.xml setting
fs.default.name=hdfs://master
# mapred-site.xml setting (jobhistory)
mapred.job.tracker=master1:10020

scp -r ../conf/  zkkafka@10.156.50.36:/home/zkkafka/pig/conf/
scp -r ../conf/  zkkafka@10.156.50.37:/home/zkkafka/pig/conf/
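
The startup logs later in this walkthrough flag both keys as deprecated and name their replacements (fs.defaultFS and mapreduce.jobtracker.address). The sketch below writes an equivalent pig.properties using those replacement keys. It is an assumption that this file behaves identically on this cluster, so it writes to a scratch file (props_file, a name chosen here) rather than touching the real conf directory. Comments sit on their own lines because the Java properties format does not treat a trailing # as a comment.

```shell
# Write to a scratch file so the real conf/pig.properties is left untouched.
props_file=$(mktemp)

cat > "$props_file" <<'EOF'
# core-site.xml setting (replacement for the deprecated fs.default.name)
fs.defaultFS=hdfs://master
# replacement for the deprecated mapred.job.tracker; value copied from the original file
mapreduce.jobtracker.address=master1:10020
EOF

cat "$props_file"
```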

 

4. Check the Pig version

pig -version
[zkkafka@yanfabu2-35 pig]$ pig -version
19/06/05 19:58:19 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Apache Pig version 0.17.0 (r1797386) 
compiled Jun 02 2017, 15:41:58

 

 

5. Prepare the data

vim tel.txt

1363157985066	13726230503	00-FD-07-A4-72-B8:CMCC	120.196.100.82	i02.c.aliimg.com	24	27	2481	24681	200

 

hdfs dfs -mkdir -p /hdfs/pig/
hdfs dfs -put /home/zkkafka/pig/data/tel.txt  /hdfs/pig/
hdfs dfs -lsr /hdfs/pig

 

[zkkafka@yanfabu2-35 conf]$ hdfs dfs -lsr /hdfs/pig
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r--   2 zkkafka supergroup       2546 2019-06-05 21:03 /hdfs/pig/tel.txt

The -lsr form still works here but is deprecated; hdfs dfs -ls -R /hdfs/pig is the replacement named in the warning.

 

6. Start the Pig shell (grunt)

 

[zkkafka@yanfabu2-37 ~]$ pig
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2019-06-06 16:44:27,558 [main] INFO  org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2019-06-06 16:44:27,558 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/zkkafka/pig_1559810667556.log
2019-06-06 16:44:27,605 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/zkkafka/.pigbootup not found
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-06-06 16:44:28,312 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-06-06 16:44:28,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master/
2019-06-06 16:44:28,859 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-3d2427ca-7fdf-4252-ab78-cfb6ed2be36e
2019-06-06 16:44:28,859 [main] WARN  org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false

 

7. Using Pig

7.1 Load the data into a Pig relation

 t_wlan = LOAD '/hdfs/pig/tel.txt' USING PigStorage('\t')   AS (t0:long, msisdn:chararray, t2:chararray, t3:chararray, t4:chararray, t5:chararray, t6:long, t7:long, t8:long, t9:long, t10:chararray);

 

7.2 Query the relation t_wlan

dump t_wlan;

grunt> dump t_wlan;
2019-06-06 16:59:05,805 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2019-06-06 16:59:05,840 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 16:59:05,840 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 16:59:05,847 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 16:59:05,848 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 16:59:05,848 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 16:59:05,880 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 16:59:05,881 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 16:59:05,883 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 16:59:06,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-489322267/pig-0.17.0-core-h2.jar
2019-06-06 16:59:06,598 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp1532488090/automaton-1.11-8.jar
2019-06-06 16:59:07,094 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp731737639/antlr-runtime-3.4.jar
2019-06-06 16:59:07,190 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp-2081706505/joda-time-2.9.3.jar
2019-06-06 16:59:07,192 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 16:59:07,192 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 16:59:07,193 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 16:59:07,193 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 16:59:07,202 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 16:59:07,264 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 16:59:07,286 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 16:59:07,289 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 16:59:07,289 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 16:59:07,291 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 16:59:07,487 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 16:59:07,590 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0014
2019-06-06 16:59:07,598 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 16:59:07,856 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0014
2019-06-06 16:59:07,862 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0014/
2019-06-06 16:59:07,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0014
2019-06-06 16:59:07,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan
2019-06-06 16:59:07,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan[-1,-1] C:  R: 
2019-06-06 16:59:07,872 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 16:59:07,873 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0014]
2019-06-06 16:59:20,161 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 16:59:20,161 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0014]
2019-06-06 16:59:23,200 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,409 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,505 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,573 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 16:59:23,574 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 16:59:05	2019-06-06 16:59:23	UNKNOWN

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0014	1	0	4	4	4	4	0	0	0	0	t_wlan	MAP_ONLY	hdfs://master/tmp/temp-1906860032/tmp1645766804,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (106 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp1645766804"

Counters:
Total records written : 1
Total bytes written : 106
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0014


2019-06-06 16:59:23,582 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,639 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,698 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,753 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 1 time(s).
2019-06-06 16:59:23,753 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 16:59:23,755 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 16:59:23,764 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 16:59:23,764 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1363157985066,13726230503,00-FD-07-A4-72-B8:CMCC,120.196.100.82,i02.c.aliimg.com,24,27,2481,24681,200,)
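
The trailing empty field in the tuple above, together with the ACCESSING_NON_EXISTENT_FIELD warning, suggests the record has fewer columns than the LOAD schema declares. The sketch below counts the tab-separated fields of the sample record from step 5 and compares that with the 11 names (t0 to t10) in the AS clause. It assumes the record is exactly as printed in this article; the variable names are chosen here for illustration.

```shell
# Sample record from tel.txt, rebuilt with explicit tab separators.
line=$(printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200')

# Count the tab-separated fields in the record.
field_count=$(printf '%s\n' "$line" | awk -F'\t' '{ print NF }')

# Number of fields declared in the AS (...) clause of the LOAD statement (t0 .. t10).
declared=11

echo "fields in record: $field_count, fields declared: $declared"
```

With one field fewer in the data than in the schema, Pig fills the last declared field (t10) with null, which is printed as the empty slot before the closing parenthesis.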

 

7.2 Project columns from relation A into relation B

 

t_wlan_simple = FOREACH t_wlan GENERATE msisdn, t6, t7, t8, t9;
dump t_wlan_simple;

 

grunt> dump t_wlan_simple;
2019-06-06 17:03:42,827 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2019-06-06 17:03:42,869 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 17:03:42,870 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:03:42,884 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:03:42,891 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:03:42,893 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:03:42,893 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:03:42,923 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:03:42,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:03:42,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:03:43,081 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp1408006038/pig-0.17.0-core-h2.jar
2019-06-06 17:03:43,178 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp1149486211/automaton-1.11-8.jar
2019-06-06 17:03:43,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp1835019327/antlr-runtime-3.4.jar
2019-06-06 17:03:43,378 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp2065709292/joda-time-2.9.3.jar
2019-06-06 17:03:43,382 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:03:43,383 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:03:43,383 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:03:43,383 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:03:43,399 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:03:43,481 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:03:43,510 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:03:43,519 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:03:43,519 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:03:43,522 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:03:44,131 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:03:44,228 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0015
2019-06-06 17:03:44,232 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:03:44,471 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0015
2019-06-06 17:03:44,475 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0015/
2019-06-06 17:03:44,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0015
2019-06-06 17:03:44,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple
2019-06-06 17:03:44,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1] C:  R: 
2019-06-06 17:03:44,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:03:44,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0015]
2019-06-06 17:03:58,648 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:03:58,649 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0015]
2019-06-06 17:04:04,679 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:04,910 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:04,977 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,043 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:04:05,044 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 17:03:42	2019-06-06 17:04:05	UNKNOWN

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0015	1	0	4	4	4	4	0	0	0	0	t_wlan,t_wlan_simple	MAP_ONLY	hdfs://master/tmp/temp-1906860032/tmp1236017200,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp1236017200"

Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0015


2019-06-06 17:04:05,058 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,137 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,223 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:04:05,337 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:04:05,382 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:04:05,382 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
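
For a local sanity check of the projection, the same column selection can be reproduced on the sample record without a cluster. This is only a sketch using cut, not what Pig runs internally; it assumes the record from step 5 as printed, and the variable names are chosen here for illustration.

```shell
# Sample record from tel.txt, rebuilt with explicit tab separators.
line=$(printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200')

# cut numbers fields from 1, so msisdn ($1 in Pig) is field 2
# and t6..t9 ($6..$9 in Pig) are fields 7..10.
projected=$(printf '%s\n' "$line" | cut -f2,7-10 | tr '\t' ',')

echo "($projected)"
```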

 

7.3 Group the data

 

t_wlan_simple_group = GROUP t_wlan_simple BY msisdn;	
dump t_wlan_simple_group;

 

grunt> dump t_wlan_simple_group;
2019-06-06 17:06:28,589 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2019-06-06 17:06:28,640 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 17:06:28,641 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:06:28,646 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:06:28,661 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:06:28,674 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:06:28,674 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:06:28,715 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:06:28,716 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:06:28,717 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-06 17:06:28,723 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-06 17:06:28,729 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-06 17:06:28,729 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-06 17:06:28,730 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:06:28,929 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-412980928/pig-0.17.0-core-h2.jar
2019-06-06 17:06:29,039 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp-1182557529/automaton-1.11-8.jar
2019-06-06 17:06:29,543 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp-1112811524/antlr-runtime-3.4.jar
2019-06-06 17:06:30,043 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp432932811/joda-time-2.9.3.jar
2019-06-06 17:06:30,046 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:06:30,047 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:06:30,047 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:06:30,047 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:06:30,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:06:30,174 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:06:30,189 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:06:30,191 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:06:30,191 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:06:30,193 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:06:30,391 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:06:30,488 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0016
2019-06-06 17:06:30,492 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:06:30,734 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0016
2019-06-06 17:06:30,738 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0016/
2019-06-06 17:06:30,738 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0016
2019-06-06 17:06:30,738 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group
2019-06-06 17:06:30,738 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1],t_wlan_simple_group[6,22] C:  R: 
2019-06-06 17:06:30,745 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:06:30,745 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:44,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:06:44,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:50,964 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:55,983 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,181 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,283 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:06:56,335 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 17:06:28	2019-06-06 17:06:56	GROUP_BY

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0016	1	1	4	4	4	4	4	4	4	4	t_wlan,t_wlan_simple,t_wlan_simple_group	GROUP_BY	hdfs://master/tmp/temp-1906860032/tmp912427234,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (46 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp912427234"

Counters:
Total records written : 1
Total bytes written : 46
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0016


2019-06-06 17:06:56,345 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,403 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,474 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,554 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:06:56,556 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:06:56,568 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:06:56,568 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,{(13726230503,27,2481,24681,200)})

 

7.4 Sum the traffic

 

t_wlan_simple_group_sum = FOREACH t_wlan_simple_group GENERATE group, SUM(t_wlan_simple.t6), SUM(t_wlan_simple.t7), SUM(t_wlan_simple.t8), SUM(t_wlan_simple.t9);
dump t_wlan_simple_group_sum;
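
With a single input record the sums simply repeat that record's values, so the effect of GROUP plus SUM is easier to see with two rows for the same msisdn. The sketch below does the equivalent aggregation locally with awk. The first row is the projected sample record; the second row is hypothetical data invented for this illustration, as are the variable names.

```shell
# Columns: msisdn, t6, t7, t8, t9.
# Row 1 is the projected sample record; row 2 is hypothetical.
rows=$(printf '13726230503\t27\t2481\t24681\t200\n13726230503\t3\t19\t319\t200\n')

# Group by msisdn (column 1) and sum the four numeric columns per group.
sums=$(printf '%s\n' "$rows" | awk -F'\t' '
  { s6[$1] += $2; s7[$1] += $3; s8[$1] += $4; s9[$1] += $5 }
  END { for (k in s6) printf "(%s,%d,%d,%d,%d)\n", k, s6[k], s7[k], s8[k], s9[k] }
')

echo "$sums"
```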

 

grunt> dump t_wlan_simple_group_sum
2019-06-06 17:15:39,824 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2019-06-06 17:15:39,877 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:15:39,878 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:15:39,885 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:15:39,904 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:15:39,908 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2019-06-06 17:15:39,972 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:15:39,972 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:15:40,000 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:15:40,001 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:15:40,002 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-06 17:15:40,002 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-06 17:15:40,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-06 17:15:40,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-06 17:15:40,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:15:40,602 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-784677978/pig-0.17.0-core-h2.jar
2019-06-06 17:15:40,699 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp-1113714067/automaton-1.11-8.jar
2019-06-06 17:15:40,796 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp-1701171835/antlr-runtime-3.4.jar
2019-06-06 17:15:40,910 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp-725132195/joda-time-2.9.3.jar
2019-06-06 17:15:40,914 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:15:40,915 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:15:40,915 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:15:40,915 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:15:40,968 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:15:41,035 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:15:41,055 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:15:41,057 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:15:41,057 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:15:41,060 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:15:41,282 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:15:41,432 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0018
2019-06-06 17:15:41,438 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:15:41,686 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0018
2019-06-06 17:15:41,691 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0018/
2019-06-06 17:15:41,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0018
2019-06-06 17:15:41,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum
2019-06-06 17:15:41,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1],t_wlan_simple_group_sum[7,26],t_wlan_simple_group[6,22] C: t_wlan_simple_group_sum[7,26],t_wlan_simple_group[6,22] R: t_wlan_simple_group_sum[7,26]
2019-06-06 17:15:41,698 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:15:41,698 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018]
2019-06-06 17:15:55,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:15:55,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018]
2019-06-06 17:16:00,962 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018]
2019-06-06 17:16:06,981 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,185 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,257 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,332 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:16:07,333 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 17:15:39	2019-06-06 17:16:07	GROUP_BY

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0018	1	1	3	3	3	3	3	3	3	3	t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum	GROUP_BY,COMBINER	hdfs://master/tmp/temp-1906860032/tmp2100428296,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp2100428296"

Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0018


2019-06-06 17:16:07,343 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,402 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,456 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,512 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:16:07,513 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:16:07,529 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:16:07,529 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)

 

7.5 Storing the result in HDFS

STORE t_wlan_simple_group_sum INTO '/hdfs/pig/wlan_result';

 

[zkkafka@yanfabu2-36 ~]$ hdfs dfs -text /hdfs/pig/wlan_result/part-r-00000
13726230503	27	2481	24681	200
[zkkafka@yanfabu2-36 ~]$ 
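
The stored line can be cross-checked without the cluster. The sketch below reproduces the group-and-sum from 7.4 with awk on the single sample record from step 5. The file names tel_sample.txt and local_sum.txt are hypothetical, and the field positions assume the tab-separated layout of that sample line (phone number in field 2, the four counters in fields 7 to 10, counting from 1 as awk does).

```shell
# Sample record from step 5, written as one tab-separated line.
printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200\n' > tel_sample.txt

# Group by phone number (field 2) and sum the four counters,
# which is what GROUP ... BY followed by SUM(...) does in Pig.
awk -F'\t' -v OFS='\t' '
  { a[$2] += $7; b[$2] += $8; c[$2] += $9; d[$2] += $10 }
  END { for (k in a) print k, a[k], b[k], c[k], d[k] }
' tel_sample.txt > local_sum.txt

cat local_sum.txt
```

With more than one phone number in the input, awk prints the groups in no particular order, much as Pig does before an ORDER step.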

 

7.6 Sorting

t_wlan_simple_group_sum_group = ORDER t_wlan_simple_group_sum BY group;
DUMP t_wlan_simple_group_sum_group;

 

grunt> DUMP t_wlan_simple_group_sum_group;
2019-06-12 15:35:33,188 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,ORDER_BY
2019-06-12 15:35:33,235 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-12 15:35:33,236 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-12 15:35:33,242 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-12 15:35:33,255 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-12 15:35:33,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2019-06-12 15:35:33,291 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SecondaryKeyOptimizerMR - Using Secondary Key Optimization for MapReduce node scope-283
2019-06-12 15:35:33,292 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2019-06-12 15:35:33,292 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
2019-06-12 15:35:33,328 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-12 15:35:33,329 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-12 15:35:33,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-12 15:35:33,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-12 15:35:33,332 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-12 15:35:33,333 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-12 15:35:33,333 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-12 15:35:33,510 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-955805369/pig-0.17.0-core-h2.jar
2019-06-12 15:35:33,595 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp712002240/automaton-1.11-8.jar
2019-06-12 15:35:34,074 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp1938988919/antlr-runtime-3.4.jar
2019-06-12 15:35:34,154 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp1704097364/joda-time-2.9.3.jar
2019-06-12 15:35:34,157 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-12 15:35:34,158 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-12 15:35:34,158 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-12 15:35:34,158 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-12 15:35:34,193 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-12 15:35:34,256 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-12 15:35:34,277 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-12 15:35:34,288 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:35:34,289 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-12 15:35:34,291 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-12 15:35:34,450 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-12 15:35:34,952 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0024
2019-06-12 15:35:34,960 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-12 15:35:35,211 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0024
2019-06-12 15:35:35,216 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0024/
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0024
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[1,9],t_wlan_simple[-1,-1],t_wlan_simple_group_sum[4,26],t_wlan_simple_group[3,22] C: t_wlan_simple_group_sum[4,26],t_wlan_simple_group[3,22] R: t_wlan_simple_group_sum[4,26]
2019-06-12 15:35:35,231 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-12 15:35:35,231 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024]
2019-06-12 15:35:47,386 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
2019-06-12 15:35:47,386 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024]
2019-06-12 15:35:54,902 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2019-06-12 15:35:54,902 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024]
2019-06-12 15:36:00,424 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:00,596 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:00,651 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:00,688 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-12 15:36:00,688 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-12 15:36:00,689 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-12 15:36:00,689 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-12 15:36:00,698 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=29
2019-06-12 15:36:00,699 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-12 15:36:00,699 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-12 15:36:01,245 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-87045202/pig-0.17.0-core-h2.jar
2019-06-12 15:36:01,308 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp568012746/automaton-1.11-8.jar
2019-06-12 15:36:01,405 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp780878190/antlr-runtime-3.4.jar
2019-06-12 15:36:01,485 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp772462384/joda-time-2.9.3.jar
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-12 15:36:01,508 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-12 15:36:01,559 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-12 15:36:01,582 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:01,582 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-12 15:36:01,582 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-12 15:36:01,749 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-12 15:36:02,233 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0025
2019-06-12 15:36:02,237 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-12 15:36:02,472 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0025
2019-06-12 15:36:02,476 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0025/
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0025
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan_simple_group_sum_group
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan_simple_group_sum_group[6,32] C:  R: 
2019-06-12 15:36:16,558 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-12 15:36:16,558 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0025]
2019-06-12 15:36:24,572 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
2019-06-12 15:36:24,572 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0025]
2019-06-12 15:36:27,589 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:27,756 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:27,814 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:27,850 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-12 15:36:27,850 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-12 15:36:27,854 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-12 15:36:27,854 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-12 15:36:27,854 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-12 15:36:27,995 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-1238945561/pig-0.17.0-core-h2.jar
2019-06-12 15:36:28,103 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp1385874378/automaton-1.11-8.jar
2019-06-12 15:36:28,223 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp2107107107/antlr-runtime-3.4.jar
2019-06-12 15:36:28,297 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp-637573401/joda-time-2.9.3.jar
2019-06-12 15:36:28,301 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-12 15:36:28,302 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-12 15:36:28,302 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-12 15:36:28,302 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-12 15:36:28,374 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-12 15:36:28,445 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-12 15:36:28,465 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:28,465 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-12 15:36:28,465 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-12 15:36:28,599 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-12 15:36:28,675 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0026
2019-06-12 15:36:28,679 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-12 15:36:28,918 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0026
2019-06-12 15:36:28,921 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0026/
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0026
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan_simple_group_sum_group
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan_simple_group_sum_group[6,32] C:  R: 
2019-06-12 15:36:44,145 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2019-06-12 15:36:44,146 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0026]
2019-06-12 15:36:51,164 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0026]
2019-06-12 15:36:54,180 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,330 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,369 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,401 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-12 15:36:54,527 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-12 15:35:33	2019-06-12 15:36:54	GROUP_BY,ORDER_BY

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0024	1	1	3	3	3	3	4	4	4	4	t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum	GROUP_BY,COMBINER	
job_1559370613628_0025	1	1	5	5	5	5	5	5	5	5	t_wlan_simple_group_sum_group	SAMPLER	
job_1559370613628_0026	1	1	3	3	3	3	4	4	4	4	t_wlan_simple_group_sum_group	ORDER_BY	hdfs://master/tmp/temp1544583298/tmp-717585849,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp1544583298/tmp-717585849"

Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0024	->	job_1559370613628_0025,
job_1559370613628_0025	->	job_1559370613628_0026,
job_1559370613628_0026


2019-06-12 15:36:54,532 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,584 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,623 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,664 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,702 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,735 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,776 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,836 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,871 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,928 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-12 15:36:54,929 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-12 15:36:54,934 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:54,934 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
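
The console output above is long, but the key point is in the Job Stats table: the ORDER statement ran as three chained MapReduce jobs (the GROUP_BY,COMBINER aggregation, a SAMPLER job, and the ORDER_BY job). A small sketch for pulling the job IDs out of saved console output follows. The file names order_by_sample.log and job_ids.txt are hypothetical; the three log lines are copied from the output above.

```shell
# Three "HadoopJobId" lines copied from the DUMP output, saved as a sample log.
cat > order_by_sample.log <<'EOF'
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0024
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0025
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0026
EOF

# Keep only the job IDs, one per line, in submission order.
grep -o 'HadoopJobId: job_[0-9_]*' order_by_sample.log | awk '{print $2}' > job_ids.txt

cat job_ids.txt
```

Each ID can then be looked up in the YARN web UI at the tracking URL printed in the log.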

 

 

8. Script
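
The article does not show the contents of t_wlan.pig, so the following is only a sketch of what such a script might contain, built from the aliases and field names that appear in the statements and logs of section 7. The LOAD schema is an assumption: the names t0..t10 follow the t6..t9 naming used in 7.4 and the eleven columns the optimizer log mentions, and the types are guesses chosen so that SUM returns whole numbers as in the output above.

```shell
# Write the Pig Latin statements into a script file in the current directory.
cat > t_wlan.pig <<'EOF'
-- ASSUMPTION: schema names and types are illustrative, not taken from the article.
t_wlan = LOAD '/hdfs/pig/tel.txt' USING PigStorage('\t')
         AS (t0:long, t1:chararray, t2:chararray, t3:chararray, t4:chararray,
             t5:long, t6:long, t7:long, t8:long, t9:long, t10:chararray);

-- Keep the phone number and the four traffic counters.
t_wlan_simple = FOREACH t_wlan GENERATE t1, t6, t7, t8, t9;

-- Group by phone number, then sum each counter per group (as in 7.4).
t_wlan_simple_group = GROUP t_wlan_simple BY t1;
t_wlan_simple_group_sum = FOREACH t_wlan_simple_group GENERATE group, SUM(t_wlan_simple.t6), SUM(t_wlan_simple.t7), SUM(t_wlan_simple.t8), SUM(t_wlan_simple.t9);

-- Write the result to HDFS (as in 7.5).
STORE t_wlan_simple_group_sum INTO '/hdfs/pig/wlan_result';
EOF

# Show the script that will be submitted.
cat t_wlan.pig
```

STORE fails if /hdfs/pig/wlan_result already exists, so remove the directory left over from 7.5 or pick a different output path before running the script with the command below.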

pig -x mapreduce  t_wlan.pig

Author's homepage: http://knight-black-bob.iteye.com/

