欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

Oracle基础知识:还原真实的cache recovery

程序员文章站 2024-04-04 12:53:17
...

我们在学习Oracle基础知识的时候会了解到实例恢复(Instance Recovery)或者说崩溃恢复(Crash recovery)的概念,有时候甚至于这2个

我们在学习Oracle基础知识的时候会了解到实例恢复(Instance Recovery)或者说崩溃恢复(Crash recovery)的概念,有时候甚至于这2个名词在我们日常的语言中表达同样的意思。实际上Instance Recovery与Crash Recovery是存在区别的:针对单实例(single instance)或者RAC中所有节点全部崩溃后的恢复,我们称之为Crash Recovery。而对于RAC中的某一个节点失败,存活节点(surviving instance)试图对失败节点线程上redo做应用的情况,我们称之为Instance Recovery。

不管是Instance Recovery还是Crash Recovery,都由2个部分组成:cache recovery继以transaction recovery。

根据官方文档的介绍,Cache Recovery也叫Rolling Forward,就是我们常说的前滚;而Transaction Recovery也叫Rolling Back,就是我们常说的回滚。前滚和回滚贯穿Oracle恢复的基本概念,是我们入门必要学习的知识,在次不多介绍。

有文事者,必济之以武略。理论学得再好,不实践也无用。所幸Crash Recovery是很容易做成的实验,我们不妨来看一看:

SQL> shutdown abort;
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1065353216 bytes
Fixed Size 2089336 bytes
Variable Size 486542984 bytes
Database Buffers 570425344 bytes
Redo Buffers 6295552 bytes
Database mounted.

SQL> alter database open;

Crash Recovery将从alter database open开始,我们来观察其日志

==================alert.log====================
alter database open
Tue Jun 14 18:19:53 2011
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Tue Jun 14 18:19:53 2011
Started redo scan
Tue Jun 14 18:19:53 2011
Completed redo scan
0 redo blocks read, 0 data blocks need recovery
Tue Jun 14 18:19:53 2011
Started redo application at
Thread 1: logseq 1004, block 1124, scn 17136185
Tue Jun 14 18:19:53 2011
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1004 Reading mem 0
Mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_2_6v34jokt_.log
Mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_2_6v34jotq_.log
Tue Jun 14 18:19:53 2011
Completed redo application
Tue Jun 14 18:19:53 2011
Completed crash recovery at
Thread 1: logseq 1004, block 1124, scn 17156186
0 data blocks read, 0 data blocks written, 0 redo blocks read
Tue Jun 14 18:19:53 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=16, OS id=7829
Tue Jun 14 18:19:53 2011
Thread 1 advanced to log sequence 1005 (thread open)
Thread 1 opened at log sequence 1005
Current log# 3 seq# 1005 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_3_6v34jpmp_.log
Current log# 3 seq# 1005 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_3_6v34jpyn_.log
Successful open of redo thread 1
Tue Jun 14 18:19:53 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
Tue Jun 14 18:19:53 2011
SMON: enabling cache recovery
Tue Jun 14 18:19:53 2011
db_recovery_file_dest_size of 204800 MB is 6.81% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Tue Jun 14 18:19:54 2011
Successfully onlined Undo Tablespace 1.
Tue Jun 14 18:19:54 2011
SMON: enabling tx recovery
Database Characterset is UTF8
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 2
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=17, OS id=7831
Tue Jun 14 18:19:55 2011
Completed: alter database open

注意上述单实例Crash Recovery到数据库打开的整个过程:

alter database open
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
Started redo application at
Completed redo application
Completed crash recovery at
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 1
SMON: enabling tx recovery
Completed: alter database open

从上述步骤中我们可以看到三种恢复名词,即:

crash recovery
cache recovery
tx recovery

这和官方文档所描述的Crash Recovery概念是不一致的,,我们现在来理清这几种recovery。

crash recovery包含对redo的scan和application,显然其完成的是Rolling Forward前滚的工作,告警日志中出现的crash recovery等同于官方文档中介绍的”cache recovery”,我们可以将” Completed crash recovery”看做前滚完成的标志。而tx recovery从字面就可以看出实际上是Transaction Recovery,tx recovery发生在Undo Tablespace online之后(回滚事务的前提是Undo可用),数据完成打开操作之前(“Completed: alter database open”)。注意tx recovery并不要求数据库打开前完成,仅仅是在数据库打开之前由smon启动(“SMON: enabling tx recovery”)。

剩下的唯一的问题是,这里的cache recovery是什么?显然它不是官方文档中所描述的”cache recovery”,几乎没有任何文档介绍存在这样一个recovery操作,这也是本文重点要介绍的。

我们来看另一个演示,这个演示用以说明cache recovery还存在于最普通的不包含Crash Recovery的数据库打开过程中:

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size 2089336 bytes
Variable Size 486542984 bytes
Database Buffers 570425344 bytes
Redo Buffers 6295552 bytes
Database mounted.

SQL> alter database open;
Database altered.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------

==================alert.log====================
alter database open
Tue Jun 14 18:43:52 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=14, OS id=8133
Tue Jun 14 18:43:52 2011
Thread 1 opened at log sequence 1005
Current log# 3 seq# 1005 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_3_6v34jpmp_.log
Current log# 3 seq# 1005 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_3_6v34jpyn_.log
Successful open of redo thread 1
Tue Jun 14 18:43:52 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
Tue Jun 14 18:43:52 2011
SMON: enabling cache recovery
Tue Jun 14 18:43:53 2011
Successfully onlined Undo Tablespace 1.
Tue Jun 14 18:43:53 2011
SMON: enabling tx recovery
Database Characterset is UTF8
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 2
replication_dependency_tracking turned off (no async multimaster replication found)
Tue Jun 14 18:43:53 2011
Incremental checkpoint up to RBA [0x3ed.624.0], current log tail at RBA [0x3ed.944.0]
Tue Jun 14 18:43:53 2011
Starting background process QMNC
QMNC started with pid=15, OS id=8135
Tue Jun 14 18:43:53 2011
Completed: alter database open

因为是clean shutdown,所以这里不存在crash recovery。但这里同样出现了”SMON: enabling cache recovery”,可见cache recovery是每次实例启动instance startup必要执行的一种恢复操作。但问题是,这个恢复操作到底针对何种对象?

Oracle基础知识:还原真实的cache recovery