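awk Command Examples Explained
awk options program file
awk is a programming-language tool for text processing.
The options argument commonly takes the following forms:
- -F fs: sets the field separator
- -f file: reads the awk program from a script file
- -v var=value: defines a variable
Using field variables
- $0: the entire line
- $1: the first data field
- $2: the second data field
- $n: the nth data field
Suppose we have a file named myfile with the following contents:
$ cat myfile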
this is a test
this is the second test.
Run the command:
awk '{print $1}' myfile
This produces the following output:
$ awk '{print $1}' myfile
this
this
If the fields in a file are separated by something other than spaces or tabs, the -F option specifies the field separator:
awk -F: '{print $1}' /etc/passwd
This produces the following output:
$ awk -F: '{print $1}' /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
backup
list
irc
gnats
nobody
systemd-timesync
systemd-network
systemd-resolve
syslog
_apt
messagebus
uuidd
lightdm
whoopsie
avahi-autoipd
avahi
dnsmasq
colord
speech-dispatcher
hplip
kernoops
pulse
rtkit
saned
usbmux
icbc
mysql
cups-pk-helper
geoclue
gdm
gnome-initial-setup
redis
jenkins
sshd
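Since /etc/passwd differs from machine to machine, here is a small self-contained sketch of the same -F technique. The record below is made up for illustration, and the variable names record and login_and_shell are assumptions, not part of the original examples.

```shell
# Hypothetical passwd-style record; the user name and paths are made up.
record='alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# -F: splits on colons, so $1 is the login name and $7 is the shell.
login_and_shell=$(printf '%s\n' "$record" | awk -F: '{ print $1, $7 }')
printf '%s\n' "$login_and_shell"
```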
Using multiple commands
echo "Hello Tom" | awk '{$2="Adam"; print $0}'
The command above sets $2 to Adam and then prints the whole line, giving the following result:
$ echo "Hello Tom" | awk '{$2="Adam"; print $0}'
Hello Adam
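One detail worth sketching here: assigning to a field makes awk rebuild $0 from its fields, joined by the output field separator OFS. The inputs and the variable names default_ofs and custom_ofs below are illustrative assumptions.

```shell
# Assigning to a field makes awk rebuild $0, joining the fields with OFS.
# With the default OFS the run of spaces in the input collapses to one space.
default_ofs=$(echo "Hello   Tom" | awk '{ $2 = "Adam"; print $0 }')

# Setting OFS in BEGIN changes the joiner used for the rebuilt line.
custom_ofs=$(echo "Hello Tom" | awk 'BEGIN { OFS = "-" } { $2 = "Adam"; print $0 }')

printf '%s\n%s\n' "$default_ofs" "$custom_ofs"
```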
Reading the program from a script file
Suppose testfile has the following contents:
$ cat testfile
{print $1 " home at " $6}
Calling the script file produces the following:
$ awk -F: -f testfile /etc/passwd
root home at /root
daemon home at /usr/sbin
bin home at /bin
sys home at /dev
sync home at /bin
games home at /usr/games
man home at /var/cache/man
lp home at /var/spool/lpd
mail home at /var/mail
news home at /var/spool/news
uucp home at /var/spool/uucp
proxy home at /bin
www-data home at /var/www
backup home at /var/backups
list home at /var/list
irc home at /var/run/ircd
gnats home at /var/lib/gnats
nobody home at /nonexistent
systemd-timesync home at /run/systemd
systemd-network home at /run/systemd/netif
systemd-resolve home at /run/systemd/resolve
syslog home at /home/syslog
_apt home at /nonexistent
messagebus home at /var/run/dbus
uuidd home at /run/uuidd
lightdm home at /var/lib/lightdm
whoopsie home at /nonexistent
avahi-autoipd home at /var/lib/avahi-autoipd
avahi home at /var/run/avahi-daemon
dnsmasq home at /var/lib/misc
colord home at /var/lib/colord
speech-dispatcher home at /var/run/speech-dispatcher
hplip home at /var/run/hplip
kernoops home at /
pulse home at /var/run/pulse
rtkit home at /proc
saned home at /var/lib/saned
usbmux home at /var/lib/usbmux
icbc home at /home/icbc
mysql home at /nonexistent
cups-pk-helper home at /home/cups-pk-helper
geoclue home at /var/lib/geoclue
gdm home at /var/lib/gdm3
gnome-initial-setup home at /run/gnome-initial-setup/
redis home at /var/lib/redis
jenkins home at /var/lib/jenkins
sshd home at /run/sshd
Now we change the script file as follows:
$ cat testfile2
{
text = " home at "
print $1 $6
}
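Running it again produces the result below. Note that the print statement never uses the text variable, so $1 and $6 are concatenated directly; writing print $1 text $6 would insert the string between them.
$ awk -F: -f testfile2 /etc/passwd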
root/root
daemon/usr/sbin
bin/bin
sys/dev
sync/bin
games/usr/games
man/var/cache/man
lp/var/spool/lpd
mail/var/mail
news/var/spool/news
uucp/var/spool/uucp
proxy/bin
www-data/var/www
backup/var/backups
list/var/list
irc/var/run/ircd
gnats/var/lib/gnats
nobody/nonexistent
systemd-timesync/run/systemd
systemd-network/run/systemd/netif
systemd-resolve/run/systemd/resolve
syslog/home/syslog
_apt/nonexistent
messagebus/var/run/dbus
uuidd/run/uuidd
lightdm/var/lib/lightdm
whoopsie/nonexistent
avahi-autoipd/var/lib/avahi-autoipd
avahi/var/run/avahi-daemon
dnsmasq/var/lib/misc
colord/var/lib/colord
speech-dispatcher/var/run/speech-dispatcher
hplip/var/run/hplip
kernoops/
pulse/var/run/pulse
rtkit/proc
saned/var/lib/saned
usbmux/var/lib/usbmux
icbc/home/icbc
mysql/nonexistent
cups-pk-helper/home/cups-pk-helper
geoclue/var/lib/geoclue
gdm/var/lib/gdm3
gnome-initial-setup/run/gnome-initial-setup/
redis/var/lib/redis
jenkins/var/lib/jenkins
sshd/run/sshd
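For comparison, here is a minimal sketch of a variant in which the text variable actually appears in the print statement. The record is invented, and record and with_text are hypothetical names used only for this illustration.

```shell
# Hypothetical passwd-style record; not taken from a real system.
record='alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# Placing text between $1 and $6 concatenates all three strings.
with_text=$(printf '%s\n' "$record" |
  awk -F: '{ text = " home at "; print $1 text $6 }')
printf '%s\n' "$with_text"
```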
awk preprocessing
To add a title or header before the results, use the BEGIN keyword; the BEGIN block runs once, before any input is processed.
awk 'BEGIN {print "The File Contents:"}
{print $0}' myfile
This produces the following output:
$ awk 'BEGIN {print "The File Contents:"}
> {print $0}' myfile
The File Contents:
this is a test
this is the second test.
awk post-processing
The END keyword introduces a block that runs once, after all input has been processed.
awk 'BEGIN {print "The File Contents:"}
{print $0}
END {print "File footer"}' myfile
This produces the following output:
$ awk 'BEGIN {print "The File Contents:"}
> {print $0}
> END {print "File footer"}' myfile
The File Contents:
this is a test
this is the second test.
File footer
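A common use of END is to report something accumulated while reading the input. The sketch below counts lines; the sample input and the names report and count are assumptions made for the example.

```shell
# BEGIN runs before the first record, END after the last one.
# The middle rule runs once per input line and increments a counter.
report=$(printf 'first\nsecond\nthird\n' |
  awk 'BEGIN { print "Counting lines" }
       { count++ }
       END { print "Total:", count }')
printf '%s\n' "$report"
```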
Combining them
$ cat testfile3
BEGIN{
print "USERS and their corresponding home"
print "UserName \t HomePath"
print "---\t---"
FS=":"
}
{
print $1 "\t" $6
}
END {
print "The end"
}
Running it produces the following result:
$ awk -f testfile3 /etc/passwd
USERS and their corresponding home
UserName HomePath
--- ---
root /root
daemon /usr/sbin
bin /bin
sys /dev
sync /bin
games /usr/games
man /var/cache/man
lp /var/spool/lpd
mail /var/mail
news /var/spool/news
uucp /var/spool/uucp
proxy /bin
www-data /var/www
backup /var/backups
list /var/list
irc /var/run/ircd
gnats /var/lib/gnats
nobody /nonexistent
systemd-timesync /run/systemd
systemd-network /run/systemd/netif
systemd-resolve /run/systemd/resolve
syslog /home/syslog
_apt /nonexistent
messagebus /var/run/dbus
uuidd /run/uuidd
lightdm /var/lib/lightdm
whoopsie /nonexistent
avahi-autoipd /var/lib/avahi-autoipd
avahi /var/run/avahi-daemon
dnsmasq /var/lib/misc
colord /var/lib/colord
speech-dispatcher /var/run/speech-dispatcher
hplip /var/run/hplip
kernoops /
pulse /var/run/pulse
rtkit /proc
saned /var/lib/saned
usbmux /var/lib/usbmux
icbc /home/icbc
mysql /nonexistent
cups-pk-helper /home/cups-pk-helper
geoclue /var/lib/geoclue
gdm /var/lib/gdm3
gnome-initial-setup /run/gnome-initial-setup/
redis /var/lib/redis
jenkins /var/lib/jenkins
sshd /run/sshd
The end
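Using built-in variables
Besides the field variables $1, $2, and so on mentioned earlier, awk supports several other built-in variables:
- FIELDWIDTHS: specifies fixed field widths (a gawk extension)
- RS: the input record separator
- FS: the input field separator
- OFS: the output field separator
- ORS: the output record separator
OFS defaults to a single space, but it can be set to any other string.
$ awk 'BEGIN{FS=":"; OFS="-"} {print $1,$6,$7}' /etc/passwd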
root-/root-/bin/bash
daemon-/usr/sbin-/usr/sbin/nologin
bin-/bin-/usr/sbin/nologin
sys-/dev-/usr/sbin/nologin
sync-/bin-/bin/sync
games-/usr/games-/usr/sbin/nologin
man-/var/cache/man-/usr/sbin/nologin
lp-/var/spool/lpd-/usr/sbin/nologin
mail-/var/mail-/usr/sbin/nologin
news-/var/spool/news-/usr/sbin/nologin
uucp-/var/spool/uucp-/usr/sbin/nologin
proxy-/bin-/usr/sbin/nologin
www-data-/var/www-/usr/sbin/nologin
backup-/var/backups-/usr/sbin/nologin
list-/var/list-/usr/sbin/nologin
irc-/var/run/ircd-/usr/sbin/nologin
gnats-/var/lib/gnats-/usr/sbin/nologin
nobody-/nonexistent-/usr/sbin/nologin
systemd-timesync-/run/systemd-/bin/false
systemd-network-/run/systemd/netif-/bin/false
systemd-resolve-/run/systemd/resolve-/bin/false
syslog-/home/syslog-/bin/false
_apt-/nonexistent-/bin/false
messagebus-/var/run/dbus-/bin/false
uuidd-/run/uuidd-/bin/false
lightdm-/var/lib/lightdm-/bin/false
whoopsie-/nonexistent-/bin/false
avahi-autoipd-/var/lib/avahi-autoipd-/bin/false
avahi-/var/run/avahi-daemon-/bin/false
dnsmasq-/var/lib/misc-/bin/false
colord-/var/lib/colord-/bin/false
speech-dispatcher-/var/run/speech-dispatcher-/bin/false
hplip-/var/run/hplip-/bin/false
kernoops-/-/bin/false
pulse-/var/run/pulse-/bin/false
rtkit-/proc-/bin/false
saned-/var/lib/saned-/bin/false
usbmux-/var/lib/usbmux-/bin/false
icbc-/home/icbc-/bin/bash
mysql-/nonexistent-/bin/false
cups-pk-helper-/home/cups-pk-helper-/usr/sbin/nologin
geoclue-/var/lib/geoclue-/usr/sbin/nologin
gdm-/var/lib/gdm3-/bin/false
gnome-initial-setup-/run/gnome-initial-setup/-/bin/false
redis-/var/lib/redis-/usr/sbin/nologin
jenkins-/var/lib/jenkins-/bin/bash
sshd-/run/sshd-/usr/sbin/nologin
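To see OFS and ORS side by side, here is a small sketch on inline input. The sample data and the variable name joined are illustrative assumptions.

```shell
# OFS is placed between comma-separated print arguments;
# ORS is written after each print statement instead of a plain newline.
joined=$(printf 'a b c\nd e f\n' |
  awk 'BEGIN { OFS = "-"; ORS = ";\n" } { print $1, $3 }')
printf '%s\n' "$joined"
```

Note that OFS only applies where print arguments are separated by commas; print $1 $3 would concatenate the two fields with nothing in between.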
Suppose we have a file with the following contents:
$ cat cash
1235.96521
Using the FIELDWIDTHS variable gives the following result:
$ awk 'BEGIN{FIELDWIDTHS="3 4 3"}{print $1,$2,$3}' cash
123 5.96 521
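FIELDWIDTHS is specific to gawk, so on other awk implementations the same split can be approximated with substr(). This is a sketch of an alternative technique, not the method used above; the variable name pieces is an assumption.

```shell
# substr(s, start, length) uses 1-based positions, so the three calls
# cut the line into widths 3, 4 and 3, mirroring FIELDWIDTHS="3 4 3".
pieces=$(echo "1235.96521" |
  awk '{ print substr($0, 1, 3), substr($0, 4, 4), substr($0, 8, 3) }')
printf '%s\n' "$pieces"
```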
Suppose we have the following contents, with a blank line between the two records:
$ cat person
Person Name
123 High Street
(222)466-1234

Another person
487 High Street
(523)643-8754
Each field is on its own line, so we set FS to a newline and RS to the empty string, which makes awk treat blank lines as record separators:
$ awk 'BEGIN{FS="\n"; RS=""} {print $1,$3}' person
Person Name (222)466-1234
Another person (523)643-8754
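The same multi-line record technique can be tried without creating a file. The contact data below is invented, and contacts is a hypothetical variable name.

```shell
# Invented contact data; the empty line (\n\n) separates the two records.
# RS="" selects blank-line-separated records, FS="\n" makes each line a field.
contacts=$(printf 'Ann Lee\n1 Main St\n555-0100\n\nBob Ray\n2 Oak Ave\n555-0101\n' |
  awk 'BEGIN { FS = "\n"; RS = "" } { print $1 ": " $3 }')
printf '%s\n' "$contacts"
```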
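Other variables
- ARGC: the number of command-line arguments
- ARGV: the array of command-line arguments
- ENVIRON: an associative array of environment variables
- FILENAME: the name of the file awk is currently processing
- NF: the number of fields in the current record
- NR: the total number of records processed so far
- FNR: the record number within the current file
- IGNORECASE: when set to a non-zero value, makes matching case-insensitive (a gawk extension)
A quick test:
$ awk 'BEGIN{print ARGC,ARGV[1]}' myfile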
2 myfile
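ARGV can also be walked in a loop. In this sketch the arguments one and two are placeholders rather than real files, and arguments is a hypothetical variable name; the loop starts at 1 because ARGV[0] holds the name of the awk program itself.

```shell
# A BEGIN-only program never reads input, so "one" and "two"
# need not exist as files.
arguments=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) print i, ARGV[i] }' one two)
printf '%s\n' "$arguments"
```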
Using environment variables
$ awk '
> BEGIN{
> print ENVIRON["PATH"]
> }'
/usr/lib/jvm/java-8-openjdk-amd64/bin:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin:/home/icbc/software/go/bin:/home/icbc/software/node-v9.4.0-linux-x64/bin:/home/icbc/bin:/home/icbc/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
A shell variable can also be passed to awk with the -v option:
$ echo | awk -v home=$HOME '{print "My home is " home}'
My home is /home/icbc
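When the shell value may contain spaces, quoting it in the -v assignment keeps it as one argument. A small sketch, where name, who and greeting are names chosen only for this example:

```shell
# The double quotes keep the value intact even though it contains a space.
name="Ada Lovelace"
greeting=$(awk -v who="$name" 'BEGIN { print "Hello, " who }')
printf '%s\n' "$greeting"
```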
NF holds the number of fields in the current record, so $NF refers to the last field:
$ awk 'BEGIN{FS=":"; OFS=":"} {print $1,$NF}' /etc/passwd
root:/bin/bash
daemon:/usr/sbin/nologin
bin:/usr/sbin/nologin
sys:/usr/sbin/nologin
sync:/bin/sync
games:/usr/sbin/nologin
man:/usr/sbin/nologin
lp:/usr/sbin/nologin
mail:/usr/sbin/nologin
news:/usr/sbin/nologin
uucp:/usr/sbin/nologin
proxy:/usr/sbin/nologin
www-data:/usr/sbin/nologin
backup:/usr/sbin/nologin
list:/usr/sbin/nologin
irc:/usr/sbin/nologin
gnats:/usr/sbin/nologin
nobody:/usr/sbin/nologin
systemd-timesync:/bin/false
systemd-network:/bin/false
systemd-resolve:/bin/false
syslog:/bin/false
_apt:/bin/false
messagebus:/bin/false
uuidd:/bin/false
lightdm:/bin/false
whoopsie:/bin/false
avahi-autoipd:/bin/false
avahi:/bin/false
dnsmasq:/bin/false
colord:/bin/false
speech-dispatcher:/bin/false
hplip:/bin/false
kernoops:/bin/false
pulse:/bin/false
rtkit:/bin/false
saned:/bin/false
usbmux:/bin/false
icbc:/bin/bash
mysql:/bin/false
cups-pk-helper:/us
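The difference between NF and $NF is easy to check on lines of different lengths. The input here is made up and counts is a hypothetical variable name.

```shell
# NF is the field count of the current line; $NF is the value of its last field.
counts=$(printf 'a b c\nd e\n' | awk '{ print NF, $NF }')
printf '%s\n' "$counts"
```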
Now let's look at the difference between NR and FNR:
$ awk 'BEGIN{FS=","}{print $1,"FNR="FNR}' myfile myfile
this is a test FNR=1
this is the second test. FNR=2
this is a test FNR=1
this is the second test. FNR=2
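In the example above we pass two input files, so the same file is processed twice. The output is the first field followed by FNR; because myfile contains no commas, FS="," leaves each whole line in $1. The difference between NR and FNR is that FNR resets to 1 when awk starts a new file, while NR keeps counting across files.
$ awk '
> BEGIN {FS=","}
> {print $1,"FNR="FNR,"NR="NR}
> END{print "Total",NR,"processed lines"}' myfile myfile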
this is a test FNR=1 NR=1
this is the second test. FNR=2 NR=2
this is a test FNR=1 NR=3
this is the second test. FNR=2 NR=4
Total 4 processed lines
Using user-defined variables
$ awk '
> BEGIN{
> test="Welcome to LikeGeeks website"
> print test
> }'
Welcome to LikeGeeks website
Structured commands
Suppose we have a file with the following contents:
$ cat numbers
10
15
6
33
45
Using an if condition
$ awk '{if ($1 > 30) print $1}' numbers
33
45
Braces let you run several statements in the if branch:
$ awk '{
> if ($1 > 30)
> {
> x = $1 * 3
> print x
> }
> }' numbers
99
135
An else branch can be added as well:
$ awk '{
> if ($1 > 30)
> {
> x = $1 * 3
> print x
> } else
> {
> x = $1 / 2
> print x
> }}' numbers
5
7.5
3
99
135
The else clause can also go on the same line, with a semicolon ending the preceding statement:
$ awk '{if ($1 > 20) print $1 *2;else print $1 /2 }' numbers
5
7.5
3
66
90
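The same branching can often be written with awk's pattern-action rules instead of an explicit if. This is a sketch of that alternative style using the numbers from the file above supplied inline and a threshold of 30; scaled is a hypothetical variable name.

```shell
# The first rule fires only when $1 > 30; "next" then skips the second rule,
# so the second rule behaves like an else branch.
scaled=$(printf '10\n15\n6\n33\n45\n' |
  awk '$1 > 30 { print $1 * 3; next } { print $1 / 2 }')
printf '%s\n' "$scaled"
```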
While loop
Suppose we have a file with the following contents:
$ cat numbers2
124 127 130
112 142 135
175 158 245
118 231 147
Computing the average of each line with a loop:
$ awk '{
> sum = 0
> i = 1
> while (i < 4)
> {
> sum += $i
> i++
> }
> average = sum / 3
> print "Average:",average
> }' numbers2
Average: 127
Average: 129.667
Average: 192.667
Average: 165.333
Using break to exit a loop
$ awk '{
tot = 0
i = 1
while (i < 5)
{
tot += $i
if (i == 3)
break
i++
}
average = tot / 3
print "Average is:",average
}' numbers2
Average is: 127
Average is: 129.667
Average is: 192.667
Average is: 165.333
For loop
$ awk '{
total = 0
for (var = 1; var < 4; var++)
{
total += $var
}
avg = total / 3
print "Average:",avg
}' numbers2
Average: 127
Average: 129.667
Average: 192.667
Average: 165.333
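The loops above hard-code three fields per line. A sketch of a variant that uses NF as the loop bound and divisor, so it adapts to however many fields each line has; the two input lines are copied from numbers2, and averages is a hypothetical variable name.

```shell
# Loop over every field on the line and divide by the actual field count.
# printf with %.3f fixes the output at three decimal places.
averages=$(printf '124 127 130\n112 142 135\n' |
  awk '{
    total = 0
    for (i = 1; i <= NF; i++)
      total += $i
    printf "Average: %.3f\n", total / NF
  }')
printf '%s\n' "$averages"
```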
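Formatted printing
The printf statement takes format specifiers of the following form:
%[modifier]control-letter
- c: prints a number as the character with that code
- d: prints an integer
- e: prints a number in scientific notation
- f: prints a floating-point number
- o: prints a number in octal
- s: prints a string
$ awk 'BEGIN{
> x = 100 * 100
> printf "The result is: %e\n", x
> }'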
The result is: 1.000000e+04
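A brief sketch exercising several of the control letters at once, including a width and precision modifier. The values are arbitrary and formatted is a hypothetical variable name.

```shell
# %d truncates to an integer, %5.2f pads to width 5 with 2 decimals,
# %o prints octal, and %c prints the character whose code is 65.
formatted=$(awk 'BEGIN { printf "%d|%5.2f|%s|%o|%c\n", 42.9, 3.14159, "awk", 8, 65 }')
printf '%s\n' "$formatted"
```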
Built-in functions
Math functions
sin(x) | cos(x) | sqrt(x) | exp(x) | log(x) | rand()
$ awk 'BEGIN{x=exp(5); print x}'
148.413
String functions
$ awk 'BEGIN{x = "likegeeks"; print toupper(x)}'
LIKEGEEKS
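A few more standard string functions, sketched on the same sample string: length(), substr() and index(). The variable name info is an assumption for this example.

```shell
# length(s) counts characters, substr(s, 1, 4) takes the first four,
# index(s, "geek") gives the 1-based position of the substring,
# and substr(s, 5) returns everything from position 5 onward.
info=$(awk 'BEGIN {
  s = "likegeeks"
  print length(s), substr(s, 1, 4), index(s, "geek"), toupper(substr(s, 5))
}')
printf '%s\n' "$info"
```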
Using user-defined functions
$ awk '
> function myfunc()
> {
> printf "The user %s has home path at %s\n", $1,$6
> }
> BEGIN{FS=":"}
> {
> myfunc()
> }' /etc/passwd
The user root has home path at /root
The user daemon has home path at /usr/sbin
The user bin has home path at /bin
The user sys has home path at /dev
The user sync has home path at /bin
The user games has home path at /usr/games
The user man has home path at /var/cache/man
The user lp has home path at /var/spool/lpd
The user mail has home path at /var/mail
The user news has home path at /var/spool/news
The user uucp has home path at /var/spool/uucp
The user proxy has home path at /bin
The user www-data has home path at /var/www
The user backup has home path at /var/backups
The user list has home path at /var/list
The user irc has home path at /var/run/ircd
The user gnats has home path at /var/lib/gnats
The user nobody has home path at /nonexistent
The user systemd-timesync has home path at /run/systemd
The user systemd-network has home path at /run/systemd/netif
The user systemd-resolve has home path at /run/systemd/resolve
The user syslog has home path at /home/syslog
The user _apt has home path at /nonexistent
The user messagebus has home path at /var/run/dbus
The user uuidd has home path at /run/uuidd
The user lightdm has home path at /var/lib/lightdm
The user whoopsie has home path at /nonexistent
The user avahi-autoipd has home path at /var/lib/avahi-autoipd
The user avahi has home path at /var/run/avahi-daemon
The user dnsmasq has home path at /var/lib/misc
The user colord has home path at /var/lib/colord
The user speech-dispatcher has home path at /var/run/speech-dispatcher
The user hplip has home path at /var/run/hplip
The user kernoops has home path at /
The user pulse has home path at /var/run/pulse
The user rtkit has home path at /proc
The user saned has home path at /var/lib/saned
The user usbmux has home path at /var/lib/usbmux
The user icbc has home path at /home/icbc
The user mysql has home path at /nonexistent
The user cups-pk-helper has home path at /home/cups-pk-helper
The user geoclue has home path at /var/lib/geoclue
The user gdm has home path at /var/lib/gdm3
The user gnome-initial-setup has home path at /run/gnome-initial-setup/
The user redis has home path at /var/lib/redis
The user jenkins has home path at /var/lib/jenkins
The user sshd has home path at /run/sshd
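Functions can also take parameters and return a value instead of reading $1 and $6 directly. The following is a sketch of that style on an invented record; describe, user, home and described are hypothetical names.

```shell
# Hypothetical passwd-style record; not taken from a real system.
record='alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# describe() builds the message with sprintf and returns it to the caller.
described=$(printf '%s\n' "$record" |
  awk -F: '
    function describe(user, home) {
      return sprintf("The user %s has home path at %s", user, home)
    }
    { print describe($1, $6) }')
printf '%s\n' "$described"
```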