欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

误删重做日志文件组导致启动数据库报错ORA-03113

程序员文章站 2022-05-07 22:17:51
...

听不止一个人说起,学技术就要不断地折腾、搞破坏,比如说备份恢复,你就可以尝试删文件,不管是日志文件、临时文件、数据文件还

听不止一个人说起,学技术就要不断地折腾、搞破坏,比如说备份恢复,你就可以尝试删文件,不管是日志文件、临时文件、数据文件还是system文件。删了之后,重启数据库肯定报错,有的甚至当时数据库就挂掉,这样你就可以学着恢复,一破一立之间,很多常规的备份恢复手段也就算是领教了。我今天就尝试着把虚拟机上的一个重做日志文件组删除。

1.环境准备

我们在Oracle11g中进行测试,数据库处于非归档状态。

SQL>

SQL> select * from v$version;

BANNER

--------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production

PL/SQL Release 11.2.0.3.0 - Production

CORE 11.2.0.3.0 Production

TNS for Linux: Version 11.2.0.3.0 - Production

NLSRTL Version 11.2.0.3.0 - Production

SQL>

SQL> archive log list;

Database log mode No Archive Mode

Automatic archival Disabled

Archive destination /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch

Oldest online log sequence 9

Current log sequence 11

SQL>

2.删除一个重做日志文件组,重启数据库报错

首先,通过查询v$log视图来获取数据库重做日志文件组的状态。

SQL> select GROUP#,MEMBERS,ARCHIVED,STATUS from v$log;

GROUP# MEMBERS ARC STATUS

---------- ---------- --- ----------------

1 1 NO INACTIVE

2 1 NO CURRENT

3 1 NO INACTIVE

SQL>
然后,通过ls命令查看数据文件,,删除第一个重做日志文件组(该文件组只有一个日志成员)。

[oracle@ hoegh HOEGH]$ ls

control01.ctl redo01.log sysaux01.dbf undotbs01.dbf

control02.ctl redo02.log system01.dbf users01.dbf

example01.dbf redo03.log temp01.dbf

[oracle@hoegh HOEGH]$

[oracle@hoegh HOEGH]$

[oracle@hoegh HOEGH]$ rm redo01.log

[oracle@hoegh HOEGH]$ ls

control01.ctl control02.ctl example01.dbf redo02.log redo03.log sysaux01.dbf system01.dbf temp01.dbf undotbs01.dbf users01.dbf
重启数据库,数据库报错。

SQL>

SQL> shu immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup

ORACLE instance started.

Total System Global Area 941600768 bytes

Fixed Size 1348860 bytes

Variable Size 515902212 bytes

Database Buffers 419430400 bytes

Redo Buffers 4919296 bytes

Database mounted.

ORA-03113: end-of-file on communication channel

Process ID: 5196

Session ID: 125 Serial number: 5

SQL>

SQL> select status from v$instance;

ERROR:

ORA-03114: not connected to ORACLE

SQL>

3.查看报警日志文件,定位问题

ORA-03113报错是一个非常经典的报错,报错原因多种多样,从报错信息中并看不出是什么原因导致的报错,我们可以到报警日志文件中查看有价值的线索。

[oracle@enmoedu1 trace]$ tail -40 alert_HOEGH.log

Wed Jul 08 21:59:30 2015

MMON started with pid=15, OS id=5443

Wed Jul 08 21:59:30 2015

MMNL started with pid=16, OS id=5445

starting up 1 dispatcher(s) for network address \'(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))\'...

starting up 1 shared server(s) ...

ORACLE_BASE from environment = /u01/app/oracle

Wed Jul 08 21:59:39 2015

alter database mount

Wed Jul 08 21:59:43 2015

Successful mount of redo thread 1, with mount id 2105928075

Database mounted in Exclusive Mode

Lost write protection disabled

Completed: alter database mount

Wed Jul 08 22:11:45 2015

Time drift detected. Please check VKTM trace file for more details.

Wed Jul 08 22:11:59 2015

alter database open

Wed Jul 08 22:11:59 2015

Errors in file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_lgwr_5435.trc:

ORA-00313: open failed for members of log group 1 of thread 1

ORA-00312: online log 1 thread 1: \'/u01/app/oracle/oradata/HOEGH/redo01.log\'

ORA-27037: unable to obtain file status

Linux Error: 2: No such file or directory

Additional information: 3

Errors in file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_lgwr_5435.trc:

ORA-00313: open failed for members of log group 1 of thread 1

ORA-00312: online log 1 thread 1: \'/u01/app/oracle/oradata/HOEGH/redo01.log\'

ORA-27037: unable to obtain file status

Linux Error: 2: No such file or directory

Additional information: 3

Errors in file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_ora_5451.trc:

ORA-00313: open failed for members of log group 1 of thread

ORA-00312: online log 1 thread 1: \'/u01/app/oracle/oradata/HOEGH/redo01.log\'

USER (ospid: 5451): terminating the instance due to error 313

Wed Jul 08 22:12:00 2015

System state dump requested by (instance=1, osid=5451), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_diag_5425.trc

Dumping diagnostic data in directory=[cdmp_20150708221200], requested by (instance=1, osid=5451), summary=[abnormal instance termination].

Instance terminated by USER, pid = 5451
其中,黄色标注部分为关键信息,我们知道“/u01/app/oracle/oradata/HOEGH/redo01.log”这个文件找不到了。

4.启动数据库到mount状态,重建重做日志文件组

从报警日志可以看出,第一组重做日志文件组丢了,我们可以通过sql语句“alter database clear logfile group 1;”重建日志文件组;确认日志文件创建成功后,将数据库切换到open状态。

SQL> startup nomount

ORACLE instance started.

Total System Global Area 941600768 bytes

Fixed Size 1348860 bytes

Variable Size 515902212 bytes

Database Buffers 419430400 bytes

Redo Buffers 4919296 bytes

SQL> alter database mount;

Database altered.

SQL> select group#,sequence#,archived,status from v$log;

GROUP# SEQUENCE# ARC STATUS

---------- ---------- --- ----------------

1 7 NO INACTIVE

3 6 NO INACTIVE

2 8 NO CURRENT

SQL>

SQL>

SQL>

SQL>

SQL> alter database clear logfile group 1;

Database altered.

SQL> select group#,sequence#,archived,status from v$log;

GROUP# SEQUENCE# ARC STATUS

---------- ---------- --- ----------------

1 0 NO UNUSED

3 6 NO INACTIVE

2 8 NO CURRENT

启动数据库到open状态

SQL>

SQL> alter database open;

Database altered.

SQL>

SQL> select group#,sequence#,archived,status from v$log;

GROUP# SEQUENCE# ARC STATUS

---------- ---------- --- ----------------

1 0 NO UNUSED

2 8 NO CURRENT

3 6 NO INACTIVE
此时我们再次查看文件列表,结果如下。

[oracle@hoegh HOEGH]$ ls

control01.ctl control02.ctl example01.dbf redo01.log redo02.log redo03.log sysaux01.dbf system01.dbf temp01.dbf undotbs01.dbf users01.dbf

[oracle@hoegh HOEGH]$

5.手动切换重做日志文件组
为了确保新建的日志文件组可用,我们可以手动切换日志文件组,改变新建日志文件组的状态(由UNUSED改为其他)。

SQL>

SQL> alter system switch logfile;

System altered.

SQL> select group#,sequence#,archived,status from v$log;

GROUP# SEQUENCE# ARC STATUS

---------- ---------- --- ----------------

1 9 NO CURRENT

2 8 NO ACTIVE

3 6 NO INACTIVE

SQL> alter system switch logfile;

System altered.

SQL> select group#,sequence#,archived,status from v$log;

GROUP# SEQUENCE# ARC STATUS

---------- ---------- --- ----------------

1 9 NO ACTIVE

2 8 NO ACTIVE

3 10 NO CURRENT

SQL> alter system switch logfile;

System altered.

SQL> select group#,sequence#,archived,status from v$log;

GROUP# SEQUENCE# ARC STATUS

---------- ---------- --- ----------------

1 9 NO INACTIVE

2 11 NO CURRENT

3 10 NO INACTIVE

SQL>
其中,
current:表示该日志组为当前日志组,oracle正在使用该日志组;

active:当current redo组发生日志切换时,状态会改变为active,在这个状态下,如果数据库为归档模式,archive进程会归档active日志组;如果发生数据库crash,该日志组也是实例恢复必需的日志组;

inactive:当active日志组归档完毕并且oracle判断不需要进行实例恢复时,会将其状态修改为inactive,等待下一轮的使用;所以当日志组为inactive的时候,如果数据库为归档模式.那么日志肯定是归档完成了。