
A tool for analyzing formatted text

程序员文章站 2022-06-04 09:08:58
A note on q, a Python tool that can analyze formatted text with SQL queries:

[url]https://raw.githubusercontent.com/harelba/q/master/bin/q[/url]

Examples:
[url]http://harelba.github.io/q/examples.html[/url]

Installation and usage:


curl https://raw.githubusercontent.com/harelba/q/master/bin/q > q
chmod +x q
./q

Output:
q version 1.6.0notreleasedyet
Copyright (C) 2012-2014 Harel Ben-Attia ([email protected], @harelba on twitter)
http://harelba.github.io/q/

Must provide at least one query in the command line, or through a file with the -q parameter

Example:

python q 'select * from data.txt'


Use -d to specify the delimiter. Columns without a header are referred to as c1, c2, and so on:

./q -d"," "select c1,c2 from a.txt where c2>2"
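
As a rough illustration of what a query like the one above does, the following Python sketch loads comma-separated text into an in-memory SQLite table whose columns are named c1, c2, ... and runs the SQL against it. The function name query_delimited, the table name t, and the sample data are made up for this example; it approximates the idea and is not q's actual code.

```python
import csv
import io
import sqlite3


def query_delimited(text, sql, delimiter=","):
    """Load delimited text into an in-memory table t (columns c1..cN) and run sql.

    Hypothetical helper for illustration only; not part of q.
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    width = max(len(r) for r in rows)
    # NUMERIC affinity lets SQLite store "3" as the integer 3, so that
    # a condition such as c2 > 2 compares numbers rather than strings.
    cols = ", ".join(f"c{i} NUMERIC" for i in range(1, width + 1))
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE t ({cols})")
    marks = ", ".join("?" for _ in range(width))
    for r in rows:
        # Pad short rows with NULLs so every row has the same width.
        conn.execute(f"INSERT INTO t VALUES ({marks})", r + [None] * (width - len(r)))
    return conn.execute(sql).fetchall()


sample = "a,1\nb,3\nc,5\n"  # made-up stand-in for the contents of a.txt
result = query_delimited(sample, "SELECT c1, c2 FROM t WHERE c2 > 2")
print(result)
```

Only the rows whose second column is greater than 2 are returned, which is the same filtering the q command expresses with where c2>2.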


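Use - as the table name to read input piped from another command (standard input). With -H, the first line is treated as a header that supplies the column names: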

cal|grep -v "2016"|./q -H "select * from - where Sa != ''"
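
In the command above, grep -v removes the title line of the cal output, so the weekday row becomes the header and Sa can be used as a column name. The Python sketch below imitates that behaviour under simplifying assumptions: it reads whitespace-separated lines from any stream (sys.stdin would work the same way), takes the first line as column names, and queries an in-memory SQLite table. The names query_stream and t, and the sample input, are hypothetical, and splitting on runs of whitespace is a simplification rather than q's exact parsing.

```python
import io
import sqlite3


def query_stream(stream, sql):
    """Use the first line of stream as column names and run sql against table t.

    Hypothetical helper for illustration only; not part of q.
    """
    lines = [line.split() for line in stream if line.strip()]
    header, rows = lines[0], lines[1:]
    conn = sqlite3.connect(":memory:")
    # Quote the names so headers such as UID are accepted verbatim.
    cols = ", ".join(f'"{name}" NUMERIC' for name in header)
    conn.execute(f"CREATE TABLE t ({cols})")
    marks = ", ".join("?" for _ in header)
    for row in rows:
        # Pad or truncate each row to the header width.
        padded = (row + [None] * len(header))[: len(header)]
        conn.execute(f"INSERT INTO t VALUES ({marks})", padded)
    return conn.execute(sql).fetchall()


# Made-up input standing in for piped command output.
piped = io.StringIO("UID PID CMD\nroot 1 init\nroot 2 kthreadd\nalice 30 bash\n")
top = query_stream(piped, "SELECT UID, COUNT(*) cnt FROM t GROUP BY UID ORDER BY cnt DESC LIMIT 1")
print(top)
```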



Other examples:

q "SELECT myfiles.c8,emails.c2 FROM exampledatafile myfiles JOIN group-emails-example emails ON (myfiles.c4 = emails.c1) WHERE myfiles.c8 = 'ppp'"

ps -ef | q -H "SELECT UID,COUNT(*) cnt FROM - GROUP BY UID ORDER BY cnt DESC LIMIT 3"

sudo find /tmp -ls | q "SELECT c5,c6,sum(c7)/1024.0/1024 AS total FROM - GROUP BY c5,c6 ORDER BY total desc"

q -t -H "SELECT strftime('%H:%M',date_time) hour_and_minute,count(*) FROM ./clicks.csv GROUP BY hour_and_minute"

q -t -H "SELECT hashed_source_machine,count(*) FROM ./clicks.csv GROUP BY hashed_source_machine"


q -H -t "SELECT request_id,score FROM ./clicks.csv WHERE score > 0.7 ORDER BY score DESC LIMIT 5"

