欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

Postgres的日志实用功能

程序员文章站 2022-06-17 14:00:43
...

Postgres的日志实用功能 不得不说,Postgres的日志(pg_log,类似oracle的alter文件,非pg_xlog)确实是很灵活,功能也很丰富的,下面是借用postgres的日志来实现一些管理功能,下面涉及的参数都在文件$PGDATA/postgresql.conf里面。 OS:CentOS 6.2 DB:Postgre

Postgres的日志实用功能

不得不说,Postgres的日志(pg_log,类似oracle的alter文件,非pg_xlog)确实是很灵活,功能也很丰富的,下面是借用postgres的日志来实现一些管理功能,下面涉及的参数都在文件$PGDATA/postgresql.conf里面。

OS:CentOS 6.2

DB:Postgres 9.2.3

1.日志审计

审计是值记录用户的登陆退出以及登陆后在数据库里的行为操作,可以根据安全等级不一样设置不一样级别的审计,

此处涉及的参数文件有:

logging_collector --是否开启日志收集开关,默认off,开启要重启DB

log_destination --日志记录类型,默认是stderr,只记录错误输出

log_directory --日志路径,默认是$PGDATA/pg_log

log_filename --日志名称,默认是postgresql-%Y-%m-%d_%H%M%S.log

log_connections --用户session登陆时是否写入日志,默认off

log_disconnections --用户session退出时是否写入日志,默认off

log_rotation_age --保留单个文件的最大时长,默认是1d,也有1h,1min,1s,个人觉得不实用

log_rotation_size --保留单个文件的最大尺寸,默认是10MB

配置值:

logging_collector = on

log_destination = 'csvlog'

log_directory = '/home/postgres/pg_log'

log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'

log_connections = on

log_disconnections = on

log_rotation_age = 1d

log_rotation_size = 20MB

配置完重启DB,检查日志情况

[postgres@localhost pg_log]$ ls -l

total 4

-rw-------. 1 postgres postgres 672 Mar 29 08:25 postgresql-2013-03-29_000000.csv

-rw-------. 1 postgres postgres 0 Mar 29 00:00 postgresql-2013-03-29_000000.log

[postgres@localhost pg_log]$

--登陆并退出,日志内容有访问的IP(local),访问用户,登陆和退出时间等信息,对检查超级用户的登陆退出是很有效的

[postgres@localhost pg_log]$ psql

psql (9.2.3)

Type "help" for help.

postgres=# \q

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 10:38:36.934 PDT,,,2236,"",5155d19c.8bc,1,"",2013-03-29 10:38:36 PDT,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""

2013-03-29 10:38:36.938 PDT,"postgres","postgres",2236,"[local]",5155d19c.8bc,2,"authentication",2013-03-29 10:38:36 PDT,2/11858,0,LOG,00000,"connection authorized: user=postgres database=postgres",,,,,,,,,""

2013-03-29 10:38:42.365 PDT,"postgres","postgres",2236,"[local]",5155d19c.8bc,3,"idle",2013-03-29 10:38:36 PDT,,0,LOG,00000,"disconnection: session time: 0:00:05.431 user=postgres database=postgres host=[local]",,,,,,,,,"psql"

记录用户登陆数据库后的各种操作,postgres日志里分成了3类,通过参数pg_statement来控制,默认的pg_statement参数值是none,即不记录,可以设置ddl(记录create,drop和alter)、mod(记录ddl+insert,delete,update和truncate)和all(mod+select)。

示例:

[postgres@localhost ~]$ vi $PGDATA/postgresql.conf

log_statement = ddl

postgres=# show log_statement;

log_statement

---------------

ddl

(1 row)

postgres=# create table t_ken_yon(id int);

CREATE TABLE

postgres=# drop table t_ken_yon ;

DROP TABLE

postgres=#

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 11:01:29.048 PDT,"postgres","postgres",2324,"[local]",5155d681.914,3,"idle",2013-03-29 10:59:29 PDT,2/11945,0,LOG,00000,"statement: create table t_ken_yon(id int);",,,,,,,,,"psql"

2013-03-29 11:01:36.087 PDT,"postgres","postgres",2324,"[local]",5155d681.914,4,"idle",2013-03-29 10:59:29 PDT,2/11948,0,LOG,00000,"statement: drop table t_ken_yon ;",,,,,,,,,"psql"

--修改为mod级别,并reload

postgres=# show log_statement;

log_statement

---------------

mod

(1 row)

postgres=# insert into t_ken_yon values(1),(2);

INSERT 0 2

postgres=# delete from t_ken_yon where id =1;

DELETE 1

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 11:04:08.148 PDT,,,5554,,514933a6.15b2,42,,2013-03-19 20:57:26 PDT,,0,LOG,00000,"received SIGHUP, reloading configuration files",,,,,,,,,""

2013-03-29 11:04:08.151 PDT,,,5554,,514933a6.15b2,43,,2013-03-19 20:57:26 PDT,,0,LOG,00000,"parameter ""log_statement"" changed to ""mod""",,,,,,,,,""

2013-03-29 11:05:33.346 PDT,"postgres","postgres",2324,"[local]",5155d681.914,6,"idle",2013-03-29 10:59:29 PDT,2/11952,0,LOG,00000,"statement: insert into t_ken_yon values(1),(2);",,,,,,,,,"psql"

2013-03-29 11:05:52.033 PDT,"postgres","postgres",2324,"[local]",5155d681.914,7,"idle",2013-03-29 10:59:29 PDT,2/11953,0,LOG,00000,"statement: delete from t_ken_yon where id =1;",,,,,,,,,"psql"

--修改为all级别,并reload

postgres=# show log_statement;

log_statement

---------------

all

(1 row)

postgres=# select * from t_ken_yon;

id

----

2

(1 row)

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 11:07:14.820 PDT,,,5554,,514933a6.15b2,44,,2013-03-19 20:57:26 PDT,,0,LOG,00000,"received SIGHUP, reloading configuration files",,,,,,,,,""

2013-03-29 11:07:14.821 PDT,,,5554,,514933a6.15b2,45,,2013-03-19 20:57:26 PDT,,0,LOG,00000,"parameter ""log_statement"" changed to ""all""",,,,,,,,,""

2013-03-29 11:07:19.784 PDT,"postgres","postgres",2324,"[local]",5155d681.914,8,"idle",2013-03-29 10:59:29 PDT,2/11954,0,LOG,00000,"statement: show log_statement;",,,,,,,,,"psql"

2013-03-29 11:07:28.631 PDT,"postgres","postgres",2324,"[local]",5155d681.914,9,"idle",2013-03-29 10:59:29 PDT,2/11955,0,LOG,00000,"statement: select * from t_ken_yon;",,,,,,,,,"psql"

一般的OLTP系统审计级别设置为ddl就够了,因为记录输出各种SQL对性能的影响还是蛮大的,安全级别高一点的也可以设置mod模式,有条件也可以不在数据库层面做,而是购买设备放在网络层监控解析。

2.定位慢查询SQL

可以设置一定时长的参数(log_min_duration_statement),来记录超过该时长的所有SQL,对找出当前数据库的慢查询很有效。 比如log_min_duration_statement = 2s,记录超过2秒的SQL,改完需要reload

示例:

postgres=# show log_min_duration_statement ;

log_min_duration_statement

----------------------------

2s

(1 row)

postgres=# \timing

Timing is on.

postgres=# select now(),pg_sleep(1);

now | pg_sleep

------------------------------+----------

2013-03-29 12:36:48.13353-07 |

(1 row)

Time: 1001.844 ms

postgres=# select now(),pg_sleep(4);

now | pg_sleep

-------------------------------+----------

2013-03-29 12:36:28.309595-07 |

(1 row)

Time: 4002.273 ms

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 12:36:19.265 PDT,"postgres","postgres",2324,"[local]",5155d681.914,10,"SELECT",2013-03-29 10:59:29 PDT,2/0,0,LOG,00000,"duration: 4027.183 ms statement: select now(),pg_sleep(4);",,,,,,,,,"psql"

可以看到只记录了4秒的那个SQL,而没有记录1秒的SQL。

3.监控数据库的checkpoint

当数据库进行一项大更新操作时,如果参数设置不当,会在日志里留下大量的告警信息,频繁的做checkpoint会导致系统变慢,如:

2013-03-28 17:01:39.523 CST,,,10350,,50bd676b.286e,1,,2012-12-04 11:00:59 CST,,0,LOG,00000,"checkpoints are occurring too frequently (8 seconds apart)",,"Consider increasing the configuration parameter ""checkpoint_segments"".",,,,,,,""

2013-03-28 17:01:50.427 CST,,,10350,,50bd676b.286e,2,,2012-12-04 11:00:59 CST,,0,LOG,00000,"checkpoints are occurring too frequently (11 seconds apart)",,"Consider increasing the configuration parameter ""checkpoint_segments"".",,,,,,,""

但是不会记录系统正常的checkpoint,如果你想看系统一天之类发生了多少次checkpoint,以及每次checkpoint的一些详细信息,比如buffer,sync等,就可以通过设置log_checkpoints,该参数默认值是off,修改log_checkpoints = on 示例:

postgres=# show log_checkpoints ;

log_checkpoints

-----------------

on

(1 row)

postgres=# checkpoint;

CHECKPOINT

postgres=#

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 12:43:38.900 PDT,,,5557,,514933a7.15b5,45,,2013-03-19 20:57:27 PDT,,0,LOG,00000,"checkpoint starting: immediate force wait",,,,,,,,,""

2013-03-29 12:43:38.941 PDT,,,5557,,514933a7.15b5,46,,2013-03-19 20:57:27 PDT,,0,LOG,00000,"checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.009 s, sync=0.000 s, total=0.040 s; sync files=0, longest=0.000 s, average=0.000 s",,,,,,,,,""

4.监控数据库的锁

数据库的锁通常可以在pg_locks这个系统表里找,但这只是当前的锁表/行信息,如果你想看一天内有多少个超过死锁时间的锁发生,可以在日志里设置并查看,log_lock_waits 默认是off,可以设置开启。这个可以区分SQL慢是资源紧张还是锁等待的问题。 示例:

postgres=# show log_lock_waits ;

log_lock_waits

----------------

on

(1 row)

postgres=# show deadlock_timeout ;

deadlock_timeout

------------------

1s

(1 row)

--模拟锁

postgres=# begin;

BEGIN

postgres=# SELECT * FROM t_ken_yon ;

id

----

11

(1 row)

postgres=# delete from t_ken_yon ;

DELETE 1

--另一个session

postgres=# begin;

BEGIN

postgres=# delete from t_ken_yon;

--查看日志

[postgres@localhost pg_log]$ tail -f postgresql-2013-03-29_000000.csv

2013-03-29 14:01:02.673 PDT,"postgres","postgres",3056,"[local]",5155f4d9.bf0,6,"DELETE waiting",2013-03-29 13:08:57 PDT,5/12502,2659,LOG,00000,"process 3056 still waiting for ShareLock on transaction 2658 after 1000.398 ms",,,,,,"delete from t_ken_yon;",,,"psql"

2013-03-29 14:02:06.208 PDT,"postgres","postgres",3056,"[local]",5155f4d9.bf0,7,"DELETE waiting",2013-03-29 13:08:57 PDT,5/12502,2659,LOG,00000,"process 3056 acquired ShareLock on transaction 2658 after 64535.339 ms",,,,,,"delete from t_ken_yon;",,,"psql"

2013-03-29 14:02:06.209 PDT,"postgres","postgres",3056,"[local]",5155f4d9.bf0,8,"DELETE",2013-03-29 13:08:57 PDT,5/12502,2659,LOG,00000,"duration: 64536.118 ms statement: delete from t_ken_yon;",,,,,,,,,"psql"

还有一些Debug功能,适合修改源码调试,一般的的系统上并不需要,暂时比较关注的就这些。