shell-sort-wc-uniq

程序员文章站 2024-03-02 10:42:46


1. Prepare the data
Format (one domain name per line, stored in domain.txt):
pgj.trade.baidu.com
chy.guoji.baidu.com
ndd.trade.baidu.com
cmt.trade.baidu.com
....

2. Split fields with cut

-d, --delimiter=DELIM use DELIM instead of TAB for field delimiter

-f, --fields=LIST select only these fields; also print any line that contains no delimiter character, unless the -s option is specified

 cut -d. -f2 domain.txt
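As a minimal sketch, the command above can be tried on the four sample lines from section 1. The temporary file stands in for domain.txt and is only an assumption for illustration.

```shell
# Create a throwaway copy of the sample data (stand-in for domain.txt).
tmpdir=$(mktemp -d)
cat > "$tmpdir/domain.txt" <<'EOF'
pgj.trade.baidu.com
chy.guoji.baidu.com
ndd.trade.baidu.com
cmt.trade.baidu.com
EOF

# Split each line on "." and keep only the second field.
fields=$(cut -d. -f2 "$tmpdir/domain.txt")
printf '%s\n' "$fields"
# prints:
# trade
# guoji
# trade
# trade

rm -rf "$tmpdir"
```

A list such as -f2,3 or a range such as -f2- selects several fields at once.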

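3. Sort with sort

-b   Ignore leading blanks on each line.
-c   Check whether the file is already sorted.
-f   Ignore case when sorting.
-M   Sort by month, using the three-letter month abbreviation.
-n   Sort by numeric value.
-o<output file>   Write the sorted result to the given file.
-r   Sort in reverse order.
-t<separator>   Set the field separator used for sorting.
-k   Select the field range (key) to sort by.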

cut -d. -f2 domain.txt | sort
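The difference between the default comparison and -n is easy to miss, so here is a small sketch using made-up numbers (not taken from domain.txt). LC_ALL=C is set only to make the ordering independent of the local locale.

```shell
# Default sort compares lines as text, so "10" comes before "2".
lexical=$(printf '10\n9\n2\n' | LC_ALL=C sort)
# -n compares by numeric value.
numeric=$(printf '10\n9\n2\n' | LC_ALL=C sort -n)
# -r reverses whichever comparison is in effect.
reverse=$(printf '10\n9\n2\n' | LC_ALL=C sort -nr)

printf '%s\n' "$lexical" | paste -sd' ' -   # prints: 10 2 9
printf '%s\n' "$numeric" | paste -sd' ' -   # prints: 2 9 10
printf '%s\n' "$reverse" | paste -sd' ' -   # prints: 10 9 2
```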

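4. Unique values with uniq

By default, uniq prints each distinct line once. It only compares adjacent lines, so the input must be sorted first.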

-c, --count prefix lines by the number of occurrences
-d, --repeated only print duplicate lines
-u, --unique only print unique lines

  cut -d. -f2 domain.txt | sort | uniq -c
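The sketch below shows why sort comes before uniq, using the second fields of the four sample lines. The trailing awk is an addition for illustration: it only strips the padding that uniq -c puts before each count, which varies between implementations.

```shell
# Second fields of the sample data, in file order.
labels=$(printf 'trade\nguoji\ntrade\ntrade\n')

# Without sort, the two runs of "trade" are counted separately.
unsorted=$(printf '%s\n' "$labels" | uniq -c | awk '{print $1, $2}')
# With sort, identical lines are adjacent and are counted together.
sorted=$(printf '%s\n' "$labels" | LC_ALL=C sort | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$unsorted"
# prints:
# 1 trade
# 1 guoji
# 2 trade
printf '%s\n' "$sorted"
# prints:
# 1 guoji
# 3 trade
```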

5. Sort again with sort

-t, --field-separator=SEP use SEP as the field separator instead of non-blank to blank transition
-n, --numeric-sort compare according to string numerical value
-f, --ignore-case fold lower case to upper case characters (ignore case)
-r, --reverse reverse the result of comparisons

-g, --general-numeric-sort compare according to general numerical value
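For the counts produced by uniq -c, -n and -g give the same order. They differ on input such as scientific notation, as this sketch with made-up values shows. It assumes GNU sort, since -g is not part of POSIX.

```shell
# -n reads only the leading digits, so "1e3" is compared as 1.
by_n=$(printf '1e3\n20\n' | LC_ALL=C sort -n)
# -g parses the whole token as a floating-point number, so "1e3" is 1000.
by_g=$(printf '1e3\n20\n' | LC_ALL=C sort -g)

printf '%s\n' "$by_n" | paste -sd' ' -   # prints: 1e3 20
printf '%s\n' "$by_g" | paste -sd' ' -   # prints: 20 1e3
```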

cut -d. -f2 domain.txt | sort | uniq -c | sort -n
cut -d. -f2 domain.txt | sort | uniq -c | sort -nr
cut -d. -f2 domain.txt | sort | uniq -c | sort -g
cut -d. -f2 domain.txt | sort | uniq -c | sort -nr | head -2
sort -t. -k2 domain.txt
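The ranking pipeline above can be wrapped in a small function. This is a sketch only: top_fields is a hypothetical helper name, the temporary file stands in for domain.txt, and the final awk just normalizes the padding of uniq -c.

```shell
# top_fields FILE N (hypothetical helper): print the N most frequent
# second fields of FILE as "count value" lines, most frequent first.
top_fields() {
    cut -d. -f2 "$1" | LC_ALL=C sort | uniq -c | LC_ALL=C sort -nr |
        head -n "$2" | awk '{print $1, $2}'
}

tmpdir=$(mktemp -d)
printf '%s\n' pgj.trade.baidu.com chy.guoji.baidu.com \
              ndd.trade.baidu.com cmt.trade.baidu.com > "$tmpdir/domain.txt"

top=$(top_fields "$tmpdir/domain.txt" 1)
printf '%s\n' "$top"   # prints: 3 trade

rm -rf "$tmpdir"
```

head -n 2 is the portable spelling of head -2.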

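-k: specifies where the sort key starts and ends. Fields are numbered from 1; if POS2 is omitted, the key runs from POS1 to the end of the line.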
-k, --key=POS1[,POS2] start a key at POS1, end it at POS2 (origin 1)

data:
12.12.45.4
36.415.545.45
9.45.15.15
154.45.45

 sort -t. -k2 a.txt

output:
12.12.45.4
36.415.545.45
9.45.15.15
154.45.45

 sort -t. -k1 a.txt

output:
12.12.45.4
154.45.45
36.415.545.45
9.45.15.15
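Both outputs above come from text comparison of a key that runs to the end of the line. As a sketch of how POS2 and a type modifier change this, the following compares -k2 with -k2,2n on the same four lines. The temporary file stands in for a.txt, and LC_ALL=C is an assumption to keep the byte ordering predictable.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/a.txt" <<'EOF'
12.12.45.4
36.415.545.45
9.45.15.15
154.45.45
EOF

# -k2: the key is field 2 through the end of the line, compared as text.
to_eol=$(LC_ALL=C sort -t. -k2 "$tmpdir/a.txt")
# -k2,2n: the key is field 2 only, compared as a number.
# Lines with equal keys fall back to a comparison of the whole line.
field2_num=$(LC_ALL=C sort -t. -k2,2n "$tmpdir/a.txt")

printf '%s\n' "$field2_num"
# prints:
# 12.12.45.4
# 154.45.45
# 9.45.15.15
# 36.415.545.45

rm -rf "$tmpdir"
```

With -k2,2n the value 415 sorts after 45, whereas the text comparison used by -k2 places "415..." before "45...".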

Example:

atnodes "zgrep validateOrder.jsp /server/tts/logs/tts.log.2014-10-30-1*.gz" l-ttsi[1-10].f.cn1 |grep "RequestError"|awk -F '&id=' '{print $2}'|awk -F'&' '{print $1}'|sort|uniq|wc
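The extraction part of this pipeline can be tried locally without atnodes or the real logs. The log lines below are invented for illustration and only assume that each request carries an id parameter written as &id=...&. The real log format is not shown in the original.

```shell
tmpdir=$(mktemp -d)
# Hypothetical log lines, not real tts.log content.
cat > "$tmpdir/tts.log" <<'EOF'
/validateOrder.jsp?type=a&id=1001&src=web RequestError
/validateOrder.jsp?type=a&id=1002&src=app RequestError
/validateOrder.jsp?type=b&id=1001&src=web RequestError
/validateOrder.jsp?type=b&id=1003&src=web OK
EOF

# Keep error lines, take the text after "&id=", cut it at the next "&",
# then reduce to the distinct ids.
ids=$(grep "RequestError" "$tmpdir/tts.log" |
      awk -F '&id=' '{print $2}' |
      awk -F '&' '{print $1}' |
      LC_ALL=C sort | uniq)
# wc -l counts lines only; tr removes the padding some wc versions add.
count=$(printf '%s\n' "$ids" | wc -l | tr -d ' ')

printf '%s\n' "$ids"
# prints:
# 1001
# 1002
echo "$count"   # prints: 2

rm -rf "$tmpdir"
```

Plain wc, as in the original command, reports lines, words and bytes. wc -l gives the number of distinct ids directly.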