【若泽大数据实战第十八天】Hive - 创建测试表dual的使用

程序员文章站 2024-03-22 16:13:58

...

-- 创建 dual 表（只有在测试的时候用insert）

hive> create table dual(x string);
OK
Time taken: 0.282 seconds
hive> insert into table dual values('');
Query ID = hadoop_20180611233030_645e070e-77f9-4ea4-8b32-ee306424c16b
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1528730871092_0001, Tracking URL = http://hadoop000:8088/proxy/application_1528730871092_0001/
Kill Command = /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin/hadoop job  -kill job_1528730871092_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-06-11 23:31:32,290 Stage-1 map = 0%,  reduce = 0%
2018-06-11 23:31:37,712 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.54 sec
MapReduce Total cumulative CPU time: 1 seconds 540 msec
Ended Job = job_1528730871092_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://hadoop000:9000/user/hive/warehouse/hive3.db/dual/.hive-staging_hive_2018-06-11_23-31-19_987_4145860992197710987-1/-ext-10000
Loading data to table hive3.dual
Table hive3.dual stats: [numFiles=1, numRows=1, totalSize=1, rawDataSize=0]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 1.54 sec   HDFS Read: 2917 HDFS Write: 67 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 540 msec
OK
Time taken: 20.257 seconds

-- 查看 dual是否有数据

hive> select * from dual;
OK

Time taken: 0.114 seconds, Fetched: 1 row(s)

-- 测试时间戳

https://tool.lu/timestamp/

hive> select unix_timestamp() from dual;
OK
1528731989

-- 测试当前时间

hive> select current_date from dual;
OK
2018-06-11

-- 测试当前时间戳

hive> select current_timestamp from dual;
OK
2018-06-11 23:51:11.681
Time taken: 0.07 seconds, Fetched: 1 row(s)

--测试年月日时分钟秒

hive> select year("2018-08-08 20:08:08") from dual;
OK
2018
Time taken: 0.098 seconds, Fetched: 1 row(s)
hive> select month("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.066 seconds, Fetched: 1 row(s)
hive> select day("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.124 seconds, Fetched: 1 row(s)
hive> select hour("2018-08-08 20:08:08") from dual;
OK
20
Time taken: 0.075 seconds, Fetched: 1 row(s)
hive> select minute("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.115 seconds, Fetched: 1 row(s)
hive> select second("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.067 seconds, Fetched: 1 row(s)

--测试天数的加减

hive> select date_add("2018-08-08",10) from dual;
OK
2018-08-18
Time taken: 0.242 seconds, Fetched: 1 row(s)
hive> select date_sub("2018-08-08",10) from dual;
OK
2018-07-29
Time taken: 0.06 seconds, Fetched: 1 row(s)

--测试类型转换

hive> select cast("5" as int) from dual;
OK
5
Time taken: 0.108 seconds, Fetched: 1 row(s)
hive> select cast("5" as date) from dual;
OK
NULL
Time taken: 0.104 seconds, Fetched: 1 row(s)

-- 测试当前时间戳转换成日期

hive> select cast(current_timestamp as date) from dual;
OK
2018-06-12
Time taken: 0.089 seconds, Fetched: 1 row(s)

--测试截取字符串

hive> desc function extended substr;
OK
substr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
Synonyms: substring
pos is a 1-based index. If pos<0 the starting position is determined by counting backwards from the end of str.
Example:
   > SELECT substr('Facebook', 5) FROM src LIMIT 1; --从左边开始截取5个字符
  'book'
  > SELECT substr('Facebook', -5) FROM src LIMIT 1; --从最后开始截取5个字符
  'ebook'
  > SELECT substr('Facebook', 5, 1) FROM src LIMIT 1; --从左边第5个开始，截取1个字符
  'b'
Time taken: 0.019 seconds, Fetched: 10 row(s)

-- 测试concat将字符连接起来

hive> desc function extended concat_ws;
OK
concat_ws(separator, [string | array(string)]+) - returns the concatenation of the strings separated by the separator.
Example:
  > SELECT concat_ws('.', 'www', array('facebook', 'com')) FROM src LIMIT 1;
  'www.facebook.com'
Time taken: 0.03 seconds, Fetched: 4 row(s)
hive> select concat_ws(".","192","168","199","151") from dual;
OK
192.168.199.151
Time taken: 0.085 seconds, Fetched: 1 row(s)

-- 测试长度

hive> select length("192.168.199.151") from dual;
OK
15
Time taken: 0.061 seconds, Fetched: 1 row(s)

-- 测试拆分（Hive实现wordcunt拆分）

hive> desc function extended split;
OK
split(str, regex) - Splits str around occurances that match regex
Example:
  > SELECT split('oneAtwoBthreeC', '[ABC]') FROM src LIMIT 1;
  ["one", "two", "three"]
Time taken: 0.012 seconds, Fetched: 4 row(s)
hive> select split("192.168.199.151","\\.") from dual;
OK
["192","168","199","151"]
Time taken: 0.088 seconds, Fetched: 1 row(s)
hive>

大数据课程推荐：

【若泽大数据实战第十八天】Hive - 创建测试表dual的使用

上一篇： Quartz定时任务嵌套，创建不同的Scheduler实例问题

下一篇：微信小程序-引入类model