javascript - Can someone help me write a multi-line regex extraction rule?
程序员文章站
2022-03-09 09:41:12
Below is a log. I would like to extract the content of each field with a regular expression; a regex in any language is fine.
=====================[2016-03-03 14:56:36]==================
IP: 127.0.0.1
Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; jacinto6evm Build/LMY48P)
URL: http://127.0.0.1/report?power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
POST: power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
COOKIE: C_TOKEN=98f1706e92ab-e3195b005d65c4aa7df00566be939841
ErrorCode: 0
Result: {"error_code":0,"error_msg":"","data":null,"time":1456988196}
=====================[2016-03-03 14:56:36]==================
IP:127.0.0.1
Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; jacinto6evm Build/LMY48P)
URL: http://127.0.0.1/report?power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
POST: power={"charge_state":0,"battery":0.0}&location={"lat":0.0,"lng":0.0}&env={"air":{"pm25":0.0},"humidity":0.0}
COOKIE: C_TOKEN=98f1706e92ab-e3195b005d65c4aa7df00566be939841
ErrorCode: 0
Result: {"error_code":0,"error_msg":"","data":null,"time":1456988196}
Thanks. Whatever I write myself keeps failing to extract the fields.
This is what I wrote:
[^\[]+\[([^]]+)][^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+:\s([^\n]+)[^:]+.*
Aliyun shows it as fine, but it just does not extract anything. I am probably overlooking some detail.
Replies:
Here is a Perl 5 version. I am not very familiar with Perl 5, so it is not great:
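# Match either a "====[timestamp]====" header or a "Key: value" line.
# Note: in Perl 5, [IP|Agent|...] is a character class, so the alternation needs (?:...).
if ($line =~ m/^(?:=+\[(\d+-\d+-\d+ \d+:\d+:\d+)\]=+|(?:IP|Agent|URL|POST|COOKIE|ErrorCode|Result):\s*(.*))$/) {
    my $value = defined $1 ? $1 : $2;
    print "$value\n";
}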
And here is a Perl 6 one:
if $line ~~ /[
[\=]+\[(.*)\][\=]+ ||
[IP|Agent|URL|POST|COOKIE|ErrorCode|Result]\:(.*)
]/ {
say $/;
}
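Since the question is tagged javascript, here is a rough sketch of a single multi-line pattern in JavaScript. It assumes every record has exactly the seven fields from the sample log, in that order; `FIELDS`, `RECORD_RE`, and `parseLog` are names made up for this example. It also sidesteps three things that may be tripping up the pattern in the question: in JavaScript `[^]]+` is not read as "anything but `]`" (`[^]` is a class that matches any character), so the bracket is escaped as `[^\]]+`; the second record has `IP:127.0.0.1` with no space after the colon, which `:\s` rejects; and the trailing `[^:]+.*` runs on into the next record's header line.

```javascript
// Field names taken from the sample log, in the order they appear.
const FIELDS = ['IP', 'Agent', 'URL', 'POST', 'COOKIE', 'ErrorCode', 'Result'];

// Header line "====[timestamp]====" followed by one "Name: value" line per field.
// [ \t]* makes the space after the colon optional; \r?\n tolerates CRLF files.
const RECORD_RE = new RegExp(
  String.raw`^=+\[([^\]]+)\]=+\r?\n` +
    FIELDS.map((name) => String.raw`${name}:[ \t]*(.*)`).join(String.raw`\r?\n`),
  'gm'
);

// Hypothetical helper: returns one plain object per record found in the text.
function parseLog(text) {
  const records = [];
  for (const m of text.matchAll(RECORD_RE)) {
    const record = { time: m[1] }; // group 1 is the timestamp
    FIELDS.forEach((name, i) => {
      record[name] = m[i + 2]; // groups 2..8 are the field values
    });
    records.push(record);
  }
  return records;
}
```

Calling `parseLog(logText)` on the sample should give two objects keyed by `time`, `IP`, `Agent`, and so on. A record with a missing or reordered field is skipped by this pattern, which is one reason the line-by-line approach may be more forgiving.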
The log file is too large for a regular expression to be a good idea. You can instead read it line by line and split each line as a string:
PHP code:
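$fp = fopen('xx.log', 'r');
$data = array();
while (($line = fgets($fp)) !== false) {
    $line = trim($line);
    // Skip blank lines
    if ($line === '') {
        continue;
    }
    // A line starting with ==== begins a new record
    if (strpos($line, '====') === 0) {
        if ($data) {
            // Process the previous record here
        }
        // Start a new record, keeping the timestamp between the brackets
        // (the 'time' key is a name chosen for this example)
        $data = array('time' => trim($line, '=[]'));
        continue;
    }
    // Split into key and value at the first colon
    $parts = explode(':', $line, 2);
    if (count($parts) === 2) {
        // Store in the array
        $data[trim($parts[0])] = trim($parts[1]);
    }
}
if ($data) {
    // Process the last record here
}
fclose($fp);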
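The same read-and-split idea can be sketched in JavaScript. This is only an illustration: `parseLogLines` is a made-up name, it takes any iterable of lines instead of opening a file itself, and the `time` key for the header timestamp is an assumption.

```javascript
// Hypothetical helper: builds records from lines without one big regex.
function parseLogLines(lines) {
  const records = [];
  let current = null;
  for (const raw of lines) {
    const line = raw.trim();
    if (line === '') continue; // skip blank lines
    if (line.startsWith('====')) {
      // A separator line starts a new record; keep the text between [ and ].
      const open = line.indexOf('[');
      const close = line.indexOf(']', open + 1);
      const time = open >= 0 && close > open ? line.slice(open + 1, close) : '';
      current = { time };
      records.push(current);
      continue;
    }
    // Split at the first colon only, so URLs and JSON values stay intact.
    const colon = line.indexOf(':');
    if (colon === -1 || current === null) continue; // not a "Key: value" line
    current[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return records;
}
```

For a small file, `parseLogLines(text.split(/\r?\n/))` is enough. For a very large file, the same loop body could be fed by a streaming line reader so the whole log never sits in memory at once.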