Rebuilding the Operating System on One Node of an Oracle RAC

程序员文章站 2024-04-05 09:21:54

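In a two-node Oracle RAC 10.2.0.4 environment running on Linux, all local disks of one node's server suddenly failed and the node stopped running. The remaining node kept working normally and continued to provide database service to clients.

The question is clear: once the operating system has been reinstalled on the server whose disks failed, how do you add it back into the RAC cluster?

I searched Google and METALINK and found exactly the same problem, but not the answer I was looking for.

One post on the Oracle official discussion forum describes a situation essentially identical to mine.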

Hello all. I have a two-node 10g R2 RAC with ASM patched to 10.2.0.4 running on RedHat AS4 x86_64. We recently had an accidental release of an Inergen fire suppression system at or collocation facility. This release caused many of our disks to fail causing issues for some of our systems. For the most part, we were very lucky having built-in redundancy across LUNs; however, we lost all 4 disks of local storage on Node1 of our two-node RAC.

...............

I appreciate any help, and I'm greatful for your time.

(One nice thing about foreigners: they always close with a word of thanks.)

Someone offered this solution:

[root@webrac1 crs_1]# more root.sh

#!/bin/sh

/u01/app/oracle/product/10.2.0/crs_1/install/rootinstall

/u01/app/oracle/product/10.2.0/crs_1/install/rootconfig

[root@webrac1 crs_1]# ./root.sh

WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root

WARNING: directory '/u01/app/oracle/product' is not owned by root

WARNING: directory '/u01/app/oracle' is not owned by root

Checking to see if Oracle CRS stack is already configured

/etc/oracle does not exist. Creating it now.

Setting the permissions on OCR backup directory

Setting up NS directories

Oracle Cluster Registry configuration upgraded successfully

WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root

WARNING: directory '/u01/app/oracle/product' is not owned by root

WARNING: directory '/u01/app/oracle' is not owned by root

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node <nodenumber>: <nodename> <private interconnect name> <hostname>

node 1: webrac1 webrac1-priv webrac1

node 2: webrac2 webrac2-priv webrac2

clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.

-force is destructive and will destroy any previous cluster

configuration.

Oracle Cluster Registry for cluster has already been initialized

Startup will be queued to init within 30 seconds.

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

CSS is active on these nodes.

webrac1

webrac2

CSS is active on all nodes.

Waiting for the Oracle CRSD and EVMD to start

Oracle CRS stack installed and running under init(1M)

Running vipca(silent) for configuring nodeapps

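Creating VIP application resource on (0) nodes.

Creating GSD application resource on (0) nodes.

Creating ONS application resource on (0) nodes.

Starting VIP application resource on (2) nodes...

Starting GSD application resource on (2) nodes...

Starting ONS application resource on (2) nodes...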

Done.

Step 6: modify the configuration file /etc/oratab

Copy this file over from the surviving node, then adjust its ownership and contents.

[root@webrac2 archivelog]# scp /etc/oratab webrac1:/etc/

root@webrac1's password:

oratab                                        100%  766     0.8KB/s   00:00

[root@webrac2 archivelog]#

[root@webrac1 etc]# chown -R oracle:root oratab

[root@webrac1 etc]# ls -ltr oratab

-rw-r--r-- 1 oracle root 766 05-08 17:12 oratab

[root@webrac1 etc]# vi oratab

#

+ASM1:/u01/app/oracle/product/10.2.0/db_1:N

webdb:/u01/app/oracle/product/10.2.0/db_1:N

~
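The manual edit in vi can also be scripted. The following is a minimal sketch, assuming the oratab copied from the surviving node names the ASM instance +ASM2 and that only this SID has to change; the function name retarget_oratab and its arguments are hypothetical and do not come from the original post.

```shell
#!/bin/sh
# retarget_oratab FILE OLD_SID NEW_SID   (hypothetical helper)
# Rewrites the SID field (first colon-separated field) of matching
# oratab entries. Comment lines and all other entries are left untouched.
retarget_oratab() {
    file=$1 old=$2 new=$3
    tmp=$(mktemp) || return 1
    # Write the result to a temp file, then copy it back with cat so the
    # original file keeps its owner and permissions.
    awk -F: -v OFS=: -v old="$old" -v new="$new" \
        '!/^#/ && $1 == old { $1 = new } { print }' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

On the rebuilt node this would be called as `retarget_oratab /etc/oratab +ASM2 +ASM1`. The webdb entry is keyed by database name rather than instance name, so it is not changed.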

Step 7: run root.sh under the RDBMS home

[root@webrac1 db_1]# ./root.sh

Running Oracle10 root.sh script...

The following environment variables are set as:

ORACLE_OWNER= oracle

ORACLE_HOME= /u01/app/oracle/product/10.2.0/db_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:

Copying dbhome to /usr/local/bin ...

Copying oraenv to /usr/local/bin ...

Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root.sh script.

Now product-specific root actions will be performed.

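Step 8: modify the configuration file $ORACLE_HOME/network/admin/listener.ora

The existing listener file was configured for node 2, so change it here to match node 1. The change is straightforward.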

LISTENER_WEBRAC1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = webrac1-vip)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.10.42)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
      )
    )
  )
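The node 2 to node 1 substitution can be sketched as a small filter. This assumes the node 2 file differs only in the listener name suffix, the VIP host name, and the public IP address. The function name retarget_listener is hypothetical, and node 2's IP address is not given in the original post, so it has to be supplied by the caller.

```shell
#!/bin/sh
# retarget_listener OLD_NODE NEW_NODE OLD_IP NEW_IP   (hypothetical helper)
# Reads a listener.ora on stdin and writes a copy on stdout with the
# node-specific listener name, VIP host name, and public IP replaced.
retarget_listener() {
    old_node=$1 new_node=$2 old_ip=$3 new_ip=$4
    # The listener name carries the node name in upper case.
    old_upper=$(printf '%s' "$old_node" | tr '[:lower:]' '[:upper:]')
    new_upper=$(printf '%s' "$new_node" | tr '[:lower:]' '[:upper:]')
    # Escape the dots so the old IP is matched literally by sed.
    old_ip_re=$(printf '%s' "$old_ip" | sed 's/\./\\./g')
    sed -e "s/LISTENER_${old_upper}/LISTENER_${new_upper}/g" \
        -e "s/${old_node}-vip/${new_node}-vip/g" \
        -e "s/HOST = ${old_ip_re})/HOST = ${new_ip})/g"
}
```

A call would look like `retarget_listener webrac2 webrac1 OLD_IP 192.168.10.42 < listener.ora.node2 > listener.ora`, where OLD_IP stands for node 2's public address and listener.ora.node2 is a placeholder file name. The result should still be reviewed by eye before the listener is started.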

Step 9: set up the spfile and password files under $ORACLE_HOME/dbs, for both the ASM instance and the database instance.

Only the file names need to change, for example:

cp orapw+ASM2 orapw+ASM1

cp spfile+ASM2.ora spfile+ASM1.ora
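The same copies can be wrapped in a helper that refuses to overwrite files already present. This is a sketch under assumptions: the function name clone_instance_files is hypothetical, and the use of cp -p (to keep the mode and timestamps of the source files) is a choice made here, since the original uses plain cp.

```shell
#!/bin/sh
# clone_instance_files DIR OLD_SID NEW_SID   (hypothetical helper)
# Copies orapw<OLD_SID> and spfile<OLD_SID>.ora in DIR to the matching
# <NEW_SID> names. Existing targets are never overwritten.
clone_instance_files() {
    dir=$1 old=$2 new=$3
    for pair in "orapw${old}:orapw${new}" \
                "spfile${old}.ora:spfile${new}.ora"; do
        src=$dir/${pair%%:*}   # part before the colon
        dst=$dir/${pair#*:}    # part after the colon
        [ -f "$src" ] || continue
        if [ -e "$dst" ]; then
            echo "skip: $dst already exists" >&2
            continue
        fi
        cp -p "$src" "$dst" && echo "copied: $src -> $dst"
    done
}
```

For the ASM instance the call would be `clone_instance_files "$ORACLE_HOME/dbs" +ASM2 +ASM1`. The database instance would be handled the same way with its own instance names, which the original post does not spell out.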

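Step 10: start all resources with crs_start -all.

After these ten steps, the rebuilt system is quickly added back into the RAC. The database service does not need to be stopped at any point in the process.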

[oracle@webrac1 ~]$ crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora.webdb.db   application    ONLINE    ONLINE    webrac2
ora....ebdb.cs application    ONLINE    ONLINE    webrac2
ora....db1.srv application    ONLINE    ONLINE    webrac2
ora....b1.inst application    ONLINE    ONLINE    webrac1
ora....b2.inst application    ONLINE    ONLINE    webrac2
ora....SM1.asm application    ONLINE    ONLINE    webrac1
ora....C1.lsnr application    ONLINE    ONLINE    webrac1
ora....ac1.gsd application    ONLINE    ONLINE    webrac1
ora....ac1.ons application    ONLINE    ONLINE    webrac1
ora....ac1.vip application    ONLINE    ONLINE    webrac1
ora....SM2.asm application    ONLINE    ONLINE    webrac2
ora....C2.lsnr application    ONLINE    ONLINE    webrac2
ora....ac2.gsd application    ONLINE    ONLINE    webrac2
ora....ac2.ons application    ONLINE    ONLINE    webrac2
ora....ac2.vip application    ONLINE    ONLINE    webrac2

[oracle@webrac1 ~]$
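With this many resources it is easy to overlook one that did not come up. The filter below is a small sketch that reads crs_stat -t output like the listing above and reports every resource whose state column is not ONLINE. It assumes the five-column layout shown here (name, type, target, state, host); the function name offline_resources is hypothetical.

```shell
#!/bin/sh
# offline_resources   (hypothetical helper)
# Reads `crs_stat -t` output on stdin. Prints "<name> <state>" for every
# resource line (those starting with "ora.") whose fourth column is not
# ONLINE, and returns 1 if at least one such resource was found.
offline_resources() {
    awk '/^ora\./ && $4 != "ONLINE" { print $1, $4; bad = 1 }
         END { exit bad + 0 }'
}
```

Typical use would be `crs_stat -t | offline_resources && echo "all resources ONLINE"`. Because it matches only on the ora. prefix and the ONLINE keyword, it does not depend on the language of the header line.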

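2. Summary
The key to this problem is the root.sh file in the $CRS directory, which creates a number of files under /etc and other directories. If you know exactly what those files are, you can also create them by hand.

The RAC environment as a whole is healthy, and the OCR configuration on shared storage is accessible as normal, so the problem essentially comes down to configuring the access links.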
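To see whether root.sh has left its files in place on the rebuilt node, a simple existence check is enough. The helper below is a sketch; the function name report_missing is hypothetical, and the example paths in the comment are an assumption about a typical 10.2 installation on Linux, since the original post only mentions /etc/oracle and the inittab entries.

```shell
#!/bin/sh
# report_missing PATH...   (hypothetical helper)
# Prints "missing: <path>" for every argument that does not exist and
# returns 1 if any path was missing, 0 otherwise.
report_missing() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            missing=1
        fi
    done
    return $missing
}

# Example call (the path list is an assumption, adjust it to your system):
#   report_missing /etc/oracle/ocr.loc /etc/init.d/init.crs \
#       /etc/init.d/init.crsd /etc/init.d/init.cssd /etc/init.d/init.evmd
```

Comparing the result against the same check on the surviving node gives a quick list of anything that still has to be created.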
