Observing Oracle 11g Data Guard Redo Transport Through the Alert Log
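Oracle Data Guard keeps the data on the standby consistent with the primary by shipping redo logs and applying them. Redo Transport and Redo Apply are the two core operations in this process. Redo Transport ships the redo information to the Standby, where it waits to be applied. Redo Apply then applies that redo, changing the Standby's data so that the two sides stay consistent.
The experiment below uses the alert log to observe how a Primary and Standby pair ships redo during startup and during normal operation, and in this way illustrates how Oracle Data Guard works.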
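1. Environment
The experiment runs on Oracle 11g, version 11.2.0.4. Because of environment constraints, the Primary and the Physical Standby are on the same server. The Primary instance is named ora11g and the Standby instance is named ora11gsy.
The listener is stopped first, so that we can observe how the database behaves without it.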
[oracle@SimpleLinux ~]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 27-APR-2014 13:40:15
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SimpleLinux)(PORT=1521)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 111: Connection refused
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 111: Connection refused
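The same check can be scripted. The sketch below is only an illustration: listener_state is a hypothetical helper name, and it relies on nothing more than the TNS-12541 error code visible in the output above. Treating any output without that code as "up" is a simplifying assumption.

```shell
#!/bin/sh
# Hypothetical helper: reads `lsnrctl status` output on stdin and
# prints "down" if it contains TNS-12541 (no listener), else "up".
# Assumption: absence of TNS-12541 is treated as "up", which is a
# rough heuristic rather than a full health check.
listener_state() {
  if grep -q 'TNS-12541'; then
    echo down
  else
    echo up
  fi
}

# Demo with a line taken from the transcript above.
# On a real host this would be: lsnrctl status | listener_state
printf 'TNS-12541: TNS:no listener\n' | listener_state
```

Feeding it captured output rather than calling lsnrctl directly keeps the helper usable on saved transcripts as well.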
The alert log of the Primary is located as follows.
[root@SimpleLinux ~]# su - oracle
[oracle@SimpleLinux ~]$ cd /u01/app/diag/rdbms/ora11g/ora11g/trace/
[oracle@SimpleLinux trace]$ ls -l | grep alert
-rw-r-----. 1 oracle oinstall 176813 Apr 21 21:58 alert_ora11g.log
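The trace directory above follows a regular pattern, so the path can be assembled from its parts. This is a minimal sketch: alert_log_path and its argument names are hypothetical, and the layout (diag base, then rdbms, database directory, instance name, trace) is assumed from the single path shown above.

```shell
#!/bin/sh
# Hypothetical helper: builds the alert log path from the diagnostic
# base directory, the database directory name and the instance name.
# Assumption: the layout matches the path seen in this environment.
alert_log_path() {
  diag_base=$1
  db_dir=$2
  sid=$3
  printf '%s/rdbms/%s/%s/trace/alert_%s.log\n' \
    "$diag_base" "$db_dir" "$sid" "$sid"
}

# Values taken from the transcript above.
alert_log_path /u01/app/diag ora11g ora11g
```

A typical use would be following the log while the instance starts, for example: tail -f "$(alert_log_path /u01/app/diag ora11g ora11g)".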
2. Primary startup
First, start the instance to the NOMOUNT state. At this point the PMON background process is already running.
[oracle@SimpleLinux ~]$ env | grep ORACLE_SID
ORACLE_SID=ora11g
[oracle@SimpleLinux ~]$ sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Sun Apr 27 13:54:15 2014
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 372449280 bytes
Fixed Size 1364732 bytes
Variable Size 331353348 bytes
Database Buffers 33554432 bytes
Redo Buffers 6176768 bytes
At this stage the alert log contains nothing out of the ordinary, only the normal startup of the instance's background processes.
Sun Apr 27 13:54:58 2014
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
NUMA status: non-NUMA system
cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
Grp 0:
(output omitted for brevity ...)
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Sun Apr 27 13:55:07 2014
MMNL started with pid=16, OS id=1776
starting up 1 shared server(s) ...
ORACLE_BASE from environment = /u01/app
Switch to the MOUNT state.
SQL> alter database mount;
Database altered.
In the alert log, locate the entries for the mount.
Sun Apr 27 14:03:18 2014
alter database mount
Sun Apr 27 14:03:23 2014
Successful mount of redo thread 1, with mount id 4242195174
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount
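In the alert log, a timestamp sits on its own line ahead of the messages it covers, which makes plain grep output hard to place in time. The sketch below pairs each matching message with the most recent timestamp line. alert_events is a hypothetical helper, and the timestamp pattern is assumed from the format seen in this log.

```shell
#!/bin/sh
# Hypothetical helper: reads an alert log on stdin and prints every
# line matching the regex in $1, prefixed with the latest timestamp.
# Assumption: timestamps look like "Sun Apr 27 14:03:18 2014".
alert_events() {
  awk -v pat="$1" '
    /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) [A-Z][a-z][a-z] [ 0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9][0-9][0-9][0-9]$/ {
      ts = $0      # remember the timestamp, do not print it
      next
    }
    $0 ~ pat { print ts " | " $0 }
  '
}

# Demo on the mount entries shown above.
alert_events 'alter database mount' <<'EOF'
Sun Apr 27 14:03:18 2014
alter database mount
Sun Apr 27 14:03:23 2014
Successful mount of redo thread 1, with mount id 4242195174
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount
EOF
# prints:
# Sun Apr 27 14:03:18 2014 | alter database mount
# Sun Apr 27 14:03:23 2014 | Completed: alter database mount
```

On a real system the input would be the alert log file itself, and the pattern could be anything worth tracking, such as ARCH or a TNS error code.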
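Next, open the database. Before the MOUNT stage, the database neither generates nor applies redo. Going from MOUNT to OPEN requires instance recovery in this run, that is, rolling forward through the redo and then rolling back uncommitted changes; the log below records it as crash recovery, which implies the previous shutdown was not clean. At the MOUNT stage and earlier, no Redo Transport takes place.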
Sun Apr 27 14:24:56 2014
alter database open
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 78 KB redo, 26 data blocks need recovery
Started redo application at
Thread 1: logseq 32, block 47
Recovery of Online Redo Log: Thread 1 Group 1 Seq 32 Reading mem 0
Mem# 0: /u01/app/oradata/ORA11G/onlinelog/o1_mf_1_9mnjwtj9_.log
Mem# 1: /u01/app/fast_recovery_area/ORA11G/onlinelog/o1_mf_1_9mnjwvdm_.log
Completed redo application of 0.02MB
Completed crash recovery at
Thread 1: logseq 32, block 203, scn 815633
26 data blocks read, 26 data blocks written, 78 redo k-bytes read
Sun Apr 27 14:24:58 2014
LGWR: STARTING ARCH PROCESSES
Sun Apr 27 14:24:58 2014
Fatal NI connect error 12541, connecting to: