11.2.0.1 GI on Linux 6.1: ohasd failed to start at /u0
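A reader hit a string of errors while installing Oracle 11.2.0.1 GI on Red Hat 6.1, so I logged in remotely to help. Once the earlier ASM and SSH authentication problems were sorted out, the asmdba, asmadmin and asmoper groups did not show up in the system group step of the GI installer. The cause was that the groups for the grid user had not been created before the installation. Even after I created them they still could not be selected, and the problem only went away after the machine was rebooted.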
Running root.sh, however, kept failing with the following error:
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
ohasd failed to start: Inappropriate ioctl for device
ohasd failed to start at /u01/11.2.0/grid/crs/install/rootcrs.pl line 443.
On MOS I found the article 11gR2 ohasd Fails to Start (Doc ID 1157905.1):
There are many reasons why ohasd can fail to start. This document provides some details to find out what is going wrong. It is applicable to 11.2.0.1 CRS under Linux X64, though some sections described here can apply to other platforms and are marked as 'generic'.
In each case you will need to make the requested changes and rerun the root.sh script. If root.sh prompts to say that
| CRS is already configured on this node for crshome=0 Cannot configure
| two CRS instances on the same cluster.
| Please deconfigure before proceeding with the configuration of new home.
then you must run the deconfigure setup before rerunning root.sh.
The deconfigure process is described in Note 942166.1
1. "ohasd failed to start: Inappropriate ioctl for device":
Please reference Note 1069182.1 for troubleshooting "OHASD Failed to Start".
Actually the "ioctl for device" part of the message looks relevant but is in fact a red herring.
This is unpublished bug 9648820, fixed in 11.2.0.2.0 (unpublished bug 10122468)
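Item 1 matches this installation exactly. Oracle describes it as an unpublished bug that is fixed in 11.2.0.2, and it happens to show up on Linux 6.1. It does not occur on Linux 5; whether other Linux 6 releases are affected, I do not know.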
2. Many known causes are listed in note 1050908.1, refer to Section "Case 1: OHASD.BIN does not start" of the note for details.
3. CRS-4124, CRS-4000 could be due to have configured IPv6 instead of IPv4. Problem described on Bug 9065141 (Closed, Not a bug).
IPv6 is not supported with 11GR2 release of RAC. Reference: http://www.oracle.com/technetwork/database/enterprise-edition/oracledatabaseipv6sod-2141330.pdf
Configure IPv4 as indicated on "Oracle Clusterware Installation Guide" and restart a new fresh installation
Notice that IPv4 and IPv6 can coexist on modern systems, so you don't need to disable IPv6, just do not use it for RAC configurations
This item says that having IPv6 configured can also prevent ohasd from starting.
4. Check if init.ohasd is running (generic)
init.ohasd is used to control ohasd (which runs as a binary 'ohasd.bin').
If init.ohasd is not running ohasd won't be able to start.
# ps -ef | grep init.ohasd
root 14324 1 0 Jul16 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run
init.ohasd is spawned by an entry in /etc/inittab. This is picked up when the machine boots. The scripts run by root.sh will create an entry in /etc/inittab and then call init (s_crsconfig_lib.pm) to start init.ohasd.
If you have no init.ohasd running then
check /etc/inittab contains
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
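A quick way to check for that entry is a small grep wrapper. This is only a sketch: the function name `has_ohasd_entry` is hypothetical, and the pattern assumes the entry has the same shape as the line above (any id and runlevels, the `respawn` action, and the `/etc/init.d/init.ohasd run` command).

```shell
# Hypothetical helper: return 0 if the given inittab-style file contains an
# uncommented respawn entry for init.ohasd, and 1 otherwise.
# With no argument it checks /etc/inittab.
has_ohasd_entry() {
    file="${1:-/etc/inittab}"
    # An unreadable or missing file counts as "no entry".
    [ -r "$file" ] || return 1
    # id:runlevels:respawn:/etc/init.d/init.ohasd run ...
    grep -Eq '^[^#:]+:[^:]*:respawn:/etc/init\.d/init\.ohasd run' "$file"
}
```

Called as `has_ohasd_entry && echo "entry present"`, it only reports whether the line exists; it does not tell you whether init has actually spawned the script, which is what the ps check above is for.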
5. Check if ohasd.bin is actually running (generic)
It's possible that the error reported is just because of a slow startup of ohasd. So check if it really started:
[root@gbr10094]# ps -ef | grep ohasd.bin
root 23763 1 0 Jul16 ? 00:00:04 /u01/app/oracle/product/11.2.0/crs/bin/ohasd.bin reboot
If you selected a Standalone install (SIHA) then ohasd.bin should run as the crs owner. If configured for a cluster it will run as root.
If ohasd is running (it's probably worth checking 5 minutes after the root.sh error) then everything may actually be okay
Check status of OHAS and CRS stack. From the grid_home/bin directory run
# crsctl check crs
Output must display all stack ‘online’.
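Here I only cover the fix for the first case. I could not find a fix for this bug on MOS, but several workarounds circulate online. The most common one is to run the following command over and over in a second terminal while root.sh is executing: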
dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
The file does not exist at first; the dd command only needs to be run once root.sh reaches "Adding daemon to inittab":
peer user cert
pauser cert
Adding daemon to inittab
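The recommendation is to keep running dd in the other terminal. If you only run dd after root.sh has already finished with the error, it no longer fixes the problem.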
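The waiting and retrying can be scripted rather than retyped by hand. The sketch below is an illustration under stated assumptions, not a supported Oracle procedure: the function name `drain_pipe`, the retry count, and the polling interval are made up for this example, and it treats one successful read as enough, whereas the advice above is simply to keep rerunning dd.

```shell
# Hypothetical helper: poll until the named pipe exists, then read from it
# with the same dd command as above.
#   $1  path of the pipe (default: /var/tmp/.oracle/npohasd)
#   $2  maximum number of attempts (assumed default: 600)
#   $3  seconds to sleep between attempts (assumed default: 1)
drain_pipe() {
    pipe="${1:-/var/tmp/.oracle/npohasd}"
    max_tries="${2:-600}"
    interval="${3:-1}"
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # -p is true only once the path exists and is a named pipe.
        if [ -p "$pipe" ]; then
            # dd blocks here until something opens the pipe for writing.
            if dd if="$pipe" of=/dev/null bs=1024 count=1 2>/dev/null; then
                echo "read from $pipe"
                return 0
            fi
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "gave up waiting for $pipe" >&2
    return 1
}
```

In the second terminal, as root, `drain_pipe /var/tmp/.oracle/npohasd` could be started shortly before root.sh reaches "Adding daemon to inittab". If one read turns out not to be enough, the call can simply be repeated.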
Because I did not know this beforehand, I first had to remove the configuration left behind by root.sh before rerunning it:
/u01/11.2.0/grid/crs/install/roothas.pl -deconfig -force -verbose
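Keep the location of this script in mind: $ORACLE_HOME/crs/install/roothas.pl removes the configuration created by running root.sh. There is also a script for deinstalling RAC, located at $ORACLE_HOME/deinstall/deinstall.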
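With dd running repeatedly in another terminal while root.sh executed, as described above, the script finally completed. Some people online mention, though, that ohasd may again fail to start after a reboot. In that case the same trick is needed: run crsctl start crs manually and keep running dd in another terminal. Clearly 11.2.0.1 GI should not be run on Linux 6.1, so I advised this reader to install 11.2.0.2 or later.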