【若泽大数据实战第十八天】Hive - 创建测试表dual的使用
程序员文章站
2024-03-22 16:13:58
...
-- 创建 dual 表(只有在测试的时候用insert)
hive> create table dual(x string);
OK
Time taken: 0.282 seconds
hive> insert into table dual values('');
Query ID = hadoop_20180611233030_645e070e-77f9-4ea4-8b32-ee306424c16b
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1528730871092_0001, Tracking URL = http://hadoop000:8088/proxy/application_1528730871092_0001/
Kill Command = /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin/hadoop job -kill job_1528730871092_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-06-11 23:31:32,290 Stage-1 map = 0%, reduce = 0%
2018-06-11 23:31:37,712 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.54 sec
MapReduce Total cumulative CPU time: 1 seconds 540 msec
Ended Job = job_1528730871092_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://hadoop000:9000/user/hive/warehouse/hive3.db/dual/.hive-staging_hive_2018-06-11_23-31-19_987_4145860992197710987-1/-ext-10000
Loading data to table hive3.dual
Table hive3.dual stats: [numFiles=1, numRows=1, totalSize=1, rawDataSize=0]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.54 sec HDFS Read: 2917 HDFS Write: 67 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 540 msec
OK
Time taken: 20.257 seconds
-- 查看 dual是否有数据
hive> select * from dual;
OK
Time taken: 0.114 seconds, Fetched: 1 row(s)
-- 测试时间戳
https://tool.lu/timestamp/
hive> select unix_timestamp() from dual;
OK
1528731989
-- 测试当前时间
hive> select current_date from dual;
OK
2018-06-11
-- 测试当前时间戳
hive> select current_timestamp from dual;
OK
2018-06-11 23:51:11.681
Time taken: 0.07 seconds, Fetched: 1 row(s)
--测试年月日时分钟秒
hive> select year("2018-08-08 20:08:08") from dual;
OK
2018
Time taken: 0.098 seconds, Fetched: 1 row(s)
hive> select month("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.066 seconds, Fetched: 1 row(s)
hive> select day("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.124 seconds, Fetched: 1 row(s)
hive> select hour("2018-08-08 20:08:08") from dual;
OK
20
Time taken: 0.075 seconds, Fetched: 1 row(s)
hive> select minute("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.115 seconds, Fetched: 1 row(s)
hive> select second("2018-08-08 20:08:08") from dual;
OK
8
Time taken: 0.067 seconds, Fetched: 1 row(s)
--测试天数的加减
hive> select date_add("2018-08-08",10) from dual;
OK
2018-08-18
Time taken: 0.242 seconds, Fetched: 1 row(s)
hive> select date_sub("2018-08-08",10) from dual;
OK
2018-07-29
Time taken: 0.06 seconds, Fetched: 1 row(s)
--测试类型转换
hive> select cast("5" as int) from dual;
OK
5
Time taken: 0.108 seconds, Fetched: 1 row(s)
hive> select cast("5" as date) from dual;
OK
NULL
Time taken: 0.104 seconds, Fetched: 1 row(s)
-- 测试当前时间戳转换成日期
hive> select cast(current_timestamp as date) from dual;
OK
2018-06-12
Time taken: 0.089 seconds, Fetched: 1 row(s)
--测试截取字符串
hive> desc function extended substr;
OK
substr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
Synonyms: substring
pos is a 1-based index. If pos<0 the starting position is determined by counting backwards from the end of str.
Example:
> SELECT substr('Facebook', 5) FROM src LIMIT 1; --从左边开始截取5个字符
'book'
> SELECT substr('Facebook', -5) FROM src LIMIT 1; --从最后开始截取5个字符
'ebook'
> SELECT substr('Facebook', 5, 1) FROM src LIMIT 1; --从左边第5个开始,截取1个字符
'b'
Time taken: 0.019 seconds, Fetched: 10 row(s)
-- 测试concat将字符连接起来hive> desc function extended concat_ws;
OK
concat_ws(separator, [string | array(string)]+) - returns the concatenation of the strings separated by the separator.
Example:
> SELECT concat_ws('.', 'www', array('facebook', 'com')) FROM src LIMIT 1;
'www.facebook.com'
Time taken: 0.03 seconds, Fetched: 4 row(s)
hive> select concat_ws(".","192","168","199","151") from dual;
OK
192.168.199.151
Time taken: 0.085 seconds, Fetched: 1 row(s)
-- 测试长度hive> select length("192.168.199.151") from dual;
OK
15
Time taken: 0.061 seconds, Fetched: 1 row(s)
-- 测试拆分 (Hive实现wordcunt拆分)
hive> desc function extended split;
OK
split(str, regex) - Splits str around occurances that match regex
Example:
> SELECT split('oneAtwoBthreeC', '[ABC]') FROM src LIMIT 1;
["one", "two", "three"]
Time taken: 0.012 seconds, Fetched: 4 row(s)
hive> select split("192.168.199.151","\\.") from dual;
OK
["192","168","199","151"]
Time taken: 0.088 seconds, Fetched: 1 row(s)
hive>
大数据课程推荐: