
ORA-00600: internal error code, arguments: [15709]

程序员文章站 2024-02-14 14:27:22

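One instance of a customer's 10.2.0.4 database crashed suddenly, and the customer asked us to help analyze the cause. For a sudden crash like this, the first thing we check is the database alert log. It shows that before the crash the SMON process reported ORA-00600 [15709], immediately followed by the message "Fatal internal error happened while SMON was doing active transaction recovery." In other words, SMON hit an exception while performing active transaction recovery, and that ultimately brought down the instance. The log output is as follows: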

Fri Sep 26 10:53:35 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found;  product=RDBMS; facility=ORA
Fri Sep 26 10:53:55 2014
Fatal internal error happened while SMON was doing active transaction recovery.
Fri Sep 26 10:53:55 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_28997.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found;  product=RDBMS; facility=ORA
SMON: terminating instance due to error 474
Termination issued to instance processes. Waiting for the processes to exit
Fri Sep 26 10:54:05 2014
Instance termination failed to kill one or more processes
Instance terminated by SMON, pid = 28997
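
When an alert log is long, it can help to pull out the ORA-00600 lines and their arguments mechanically. The following is a minimal Python sketch of that idea, not an Oracle tool; the function name `find_ora600` is made up for illustration, and it assumes the log lines have the same shape as the excerpt above.

```python
import re

# Matches lines such as:
#   ORA-00600: internal error code, arguments: [15709], [29], [1], [], ...
ORA600_RE = re.compile(r"ORA-00600: internal error code, arguments: (.*)")


def find_ora600(lines):
    """Return (line_number, [non-empty arguments]) for each ORA-00600 line."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = ORA600_RE.search(line)
        if match:
            # Collect the bracketed arguments and drop the empty [] slots.
            args = [a for a in re.findall(r"\[([^\]]*)\]", match.group(1)) if a]
            hits.append((lineno, args))
    return hits


sample = [
    "Fri Sep 26 10:53:35 2014",
    "ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []",
    "ORA-30319: Message 30319 not found;  product=RDBMS; facility=ORA",
]
for lineno, args in find_ora600(sample):
    print(lineno, args)
```

The first argument (here 15709) is the value to search for on Oracle Support.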

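Next we look at the trace file wxyydb_smon_28997.trc. It shows that SMON kept attempting parallel transaction recovery. During that recovery it hit the ORA-00600 error, and the exception in the underlying code finally caused the instance to crash.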

*** 2014-09-26 10:10:36.236
Parallel Transaction recovery caught error 30319 
*** 2014-09-26 10:15:10.643
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:15:21.816
Parallel Transaction recovery caught error 30319 
*** 2014-09-26 10:19:51.707
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:53:35.830
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found;  product=RDBMS; facility=ORA
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+64          call     ksedst1()            000000000 ? 000000001 ?
ksedmp()+2176        call     ksedst()             000000000 ?
                                                   C000000000000C9F ?
                                                   4000000004057F40 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ksfdmp()+48          call     ksedmp()             000000003 ?
kgeriv()+336         call     ksfdmp()             C000000000000695 ?
                                                   000000003 ?
                                                   40000000095185E0 ?
                                                   00000EC33 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
kgeasi()+416         call     kgeriv()             6000000000031770 ?
                                                   6000000000032828 ?
                                                   4000000001A504E0 ?
                                                   000000002 ?
                                                   9FFFFFFFFFFFA138 ?
$cold_kxfpqsrls()+1  call     kgeasi()             6000000000031770 ?
168                                                9FFFFFFFFD3D2290 ?
                                                   000003D5D ? 000000002 ?
                                                   000000002 ? 0000003E7 ?
                                                   000003D5D ?
                                                   9FFFFFFFFD3D22A0 ?
kxfpqrsod()+1104     call     $cold_kxfpqsrls()    C0000004FDF7A838 ?
                                                   C0000004FDF74430 ?
                                                   000000004 ?
                                                   9FFFFFFFFFFFA200 ?
                                                   C0000000000011AB ?
                                                   4000000003AA1250 ?
                                                   00000EDF5 ? 000000001 ?
kxfpdelqrefs()+640   call     kxfpqrsod()          C0000004FDF74430 ?
                                                   000000001 ?
                                                   60000000000B6300 ?
                                                   C000000000000694 ?
                                                   4000000003DD14F0 ?
                                                   00000EE2D ?
                                                   60000000000C6708 ?
kxfpqsod_qc_sod()+2  call     kxfpdelqrefs()       00000003E ? 000000001 ?
016                                                60000000000B6300 ?
                                                   C000000000001028 ?
                                                   40000000025DE5A0 ?
                                                   4000000001B1A110 ?
                                                   60000000000C2D04 ?
                                                   60000000000C2E90 ?
kxfpqsod()+816       call     kxfpqsod_qc_sod()    000000010 ? 000000001 ?
                                                   9FFFFFFFFFFFA260 ?
                                                   60000000000B6300 ?
                                                   9FFFFFFFFFFFA7F0 ?
                                                   C000000000001028 ?
                                                   40000000025DF810 ?
                                                   00000EE65 ?
ktprdestroy()+208    call     kxfpqsod()           C0000004FDF7A838 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA810 ?
                                                   60000000000B6300 ?
                                                   9FFFFFFFFFFFAD90 ?
ktprbeg()+8272       call     ktprdestroy()        C000000000001026 ?
                                                   40000000025615B0 ?
                                                   000006E61 ? 000000000 ?
                                                   4000000001052E40 ?
                                                   000000000 ?
ktmmon()+10096       call     ktprbeg()            9FFFFFFFFFFFBE70 ?
                                                   9FFFFFFFFFFFADA0 ?
                                                   60000000000B6300 ?
                                                   40000000028B75A0 ?
                                                   00000EF21 ?
                                                   9FFFFFFFFFFFADD8 ?
                                                   9FFFFFFFFFFFADE0 ?
ktmSmonMain()+64     call     ktmmon()             9FFFFFFFFFFFD140 ?
ksbrdp()+2816        call     ktmSmonMain()        C000000100E1CA60 ?
                                                   C000000000000FA5 ?
                                                   000007361 ?
                                                   4000000003B5AE10 ?
                                                   C000000000000205 ?
                                                   400000000409DCD0 ?
opirip()+1136        call     ksbrdp()             9FFFFFFFFFFFD150 ?
                                                   60000000000B6300 ?
                                                   9FFFFFFFFFFFDC90 ?
                                                   4000000002863EF0 ?
                                                   000004861 ?
                                                   C000000000000B1D ?
                                                   60000000000318F0 ?
$cold_opidrv()+1408  call     opirip()             9FFFFFFFFFFFEA70 ?
                                                   000000004 ?
                                                   9FFFFFFFFFFFF090 ?
                                                   9FFFFFFFFFFFDCA0 ?
                                                   60000000000B6300 ?
                                                   C000000000000DA1 ?
sou2o()+336          call     $cold_opidrv()       000000032 ?
                                                   9FFFFFFFFFFFF090 ?
                                                   60000000000C2C78 ?
$cold_opimai_real()  call     sou2o()              9FFFFFFFFFFFF0B0 ?
+640                                               000000032 ? 000000004 ?
                                                   9FFFFFFFFFFFF090 ?
main()+368           call     $cold_opimai_real()  000000003 ? 000000000 ?
main_opd_entry()+80  call     main()               000000003 ?
                                                   9FFFFFFFFFFFF598 ?
                                                   60000000000B6300 ?
                                                   C000000000000004 ?
 

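Searching Oracle Support for ORA-00600 [15709], we found the note "SMON may fail with ORA-00600 [15709] Errors Crashing the Instance" (Doc ID 736348.1), whose error signature closely matches ours. The note lists the call stack at the point of failure, which includes kxfpqsrls, and identifies the problem as bug 6954722. If you have already installed that patch and still see a similar problem, you have probably hit another, similar bug, 9233544. Oracle certainly has its share of bugs.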
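
Checking whether a trace contains a given function is easy to do by eye here, but the comparison can also be sketched in Python. This is only an illustration: `stack_functions` and `contains_function` are hypothetical helper names, the parsing assumes the "Call Stack Trace" layout shown above (function name at the start of the line, followed by `()`), and treating `$cold_<name>` as the same function as `<name>` is an assumption about how the platform labels split code sections.

```python
import re

# A frame line starts with the calling function, e.g. "kgeasi()+416".
# Continuation lines start with whitespace, digits or "+", so they do not match.
FRAME_RE = re.compile(r"^([$\w]+)\(\)")


def stack_functions(trace_lines):
    """Return the calling-function names in the order they appear."""
    names = []
    for line in trace_lines:
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def contains_function(trace_lines, name):
    """True if `name` (or its assumed "$cold_" variant) is on the stack."""
    wanted = {name, "$cold_" + name}
    return any(func in wanted for func in stack_functions(trace_lines))


trace = [
    "kgeasi()+416         call     kgeriv()             6000000000031770 ?",
    "                                                   6000000000032828 ?",
    "$cold_kxfpqsrls()+1  call     kgeasi()             6000000000031770 ?",
    "168                                                9FFFFFFFFD3D2290 ?",
    "kxfpqrsod()+1104     call     $cold_kxfpqsrls()    C0000004FDF7A838 ?",
]
print(stack_functions(trace))
print(contains_function(trace, "kxfpqsrls"))
```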

Bug 6954722 affects 9.2.0.8 and 10.2.0.4, and is fixed in 10.2.0.4.2, 10.2.0.5, 11.1.0.7, and 11.2.0.1. The ways to resolve bug 6954722 are:

1.Use the following workaround

Set fast_start_parallel_rollback=false and recovery_parallelism=0

OR

2.Apply the one-off patch for this bug, if available for your platform/version.

OR

3.Upgrade to fixed release 10.2.0.5, 11.1.0.7 or 11.2.0.1.
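
As a rough illustration of the first workaround, the Python sketch below only builds the `ALTER SYSTEM` statements for a DBA to review and run by hand; it does not connect to a database. The helper name `workaround_statements` is hypothetical, and `SCOPE=SPFILE` is an assumption: it presumes the instance uses an spfile and that the change is meant to take effect at the next restart.

```python
# Parameter values taken from the workaround text above.
WORKAROUND = {
    "fast_start_parallel_rollback": "FALSE",
    "recovery_parallelism": "0",
}


def workaround_statements(params=WORKAROUND):
    """Build one ALTER SYSTEM statement per parameter (assumes an spfile)."""
    return [
        f"ALTER SYSTEM SET {name} = {value} SCOPE=SPFILE;"
        for name, value in params.items()
    ]


for statement in workaround_statements():
    print(statement)
```

Disabling parallel rollback means SMON recovers dead transactions serially, so recovery of large transactions may take longer while the workaround is in place.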

Bug 9233544 affects 10.2.0.4, 11.1.0.7, and 11.2.0.1, and is fixed in 11.2.0.3 and 12.1. The ways to resolve bug 9233544 are:

1.Apply patchset 11.2.0.3, in which Bug: 9233544 is fixed.

OR

2.Check if one-off Patch:9233544 is available for your release and platform here.

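We checked the system's patches carefully and found that patch 6954722 was already installed, which points to bug 9233544 as the cause. The options are to upgrade to 11.2.0.3 or to install the one-off patch 9233544. Upgrading to 11.2.0.3 is too big a change, so we discussed it with the customer and proposed installing the small patch to fix the problem.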
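
The elimination reasoning used here can be written down as a small Python sketch. The affected versions are the ones quoted in this article; the `BUGS` table and the `candidate_bugs` helper are illustrative names only, and the version match is a plain string comparison, so patch-set updates such as 10.2.0.4.2 are not handled.

```python
# Affected versions as quoted in this article.
BUGS = {
    "6954722": {"affected": {"9.2.0.8", "10.2.0.4"}},
    "9233544": {"affected": {"10.2.0.4", "11.1.0.7", "11.2.0.1"}},
}


def candidate_bugs(version, installed_patches):
    """Bugs that affect `version` and whose one-off patch is not installed."""
    return [
        bug
        for bug, info in BUGS.items()
        if version in info["affected"] and bug not in installed_patches
    ]


# This case: 10.2.0.4 with patch 6954722 already applied.
print(candidate_bugs("10.2.0.4", {"6954722"}))
```

With patch 6954722 already installed on 10.2.0.4, the only remaining candidate is 9233544, which matches the conclusion above.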