Need help: two-node cluster, RHEL 6 High Availability Add-On, with Oracle over NFS
We have set up a two-node cluster, with the Oracle datafiles on an NFS-mounted /data.
Failover works for a database crash or a power failure.
However, a loss of connectivity on eth0 causes the following problems:
1. The /data mount is never detected as failed or hung. The netfs.sh agent we reference in cluster.conf does not detect this and try to unmount it.
2. The cluster doesn't know eth0 is dead.
clustat reports everything as normal throughout, so nothing happens. Additionally, because /data is essentially hung, manual failover via clusvcadm also fails.
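For what it's worth, we think the status check never returns because the mount uses the hard option: any I/O against an unreachable NFS server (including a simple stat) blocks indefinitely. A rough way to make the hang observable from the shell (the 5-second timeout and the check_mount name are just our sketch, not anything from the cluster tooling):

```shell
#!/bin/sh
# Sketch: probe a possibly hung NFS mount without blocking forever.
# A hard-mounted NFS filesystem makes stat() hang while the server is
# unreachable; wrapping the probe in timeout(1) turns the hang into a
# detectable failure. The 5s value is arbitrary.

check_mount() {
    # Return 0 if the path answers within 5 seconds, non-zero otherwise.
    timeout 5 stat -t "$1" >/dev/null 2>&1
}

if check_mount /data; then
    echo "/data is responsive"
else
    echo "/data appears hung or missing (stat timed out or failed)"
fi
```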
Here is our cluster.conf file. Can anyone help?
<?xml version="1.0"?>
<cluster config_version="35" name="cluster1">
  <fence_daemon post_fail_delay="0"/>
  <clusternodes>
    <clusternode name="test1.private" nodeid="1">
      <fence>
        <method name="manual">
          <device name="manual"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="test2.private" nodeid="2">
      <fence>
        <method name="manual">
          <device name="manual"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="manual"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <service autostart="1" name="oracle" recovery="relocate">
      <netfs ref="data_mount"/>
      <script ref="oracle_resource"/>
      <ip address="192.168.1.86" monitor_link="eth0"/>
    </service>
    <resources>
      <script file="/usr/local/bin/test.sh" name="oracle_resource"/>
      <netfs export="/data/dbcluster" force_unmount="1" fstype="nfs" host="test.main.example.com" mountpoint="/data" name="data_mount" options="rw,bg,hard,nointr,tcp,nfsvers=3,timeo=600,rsize=32768,wsize=32768,actimeo=0"/>
    </resources>
  </rm>
</cluster>
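One workaround we are considering, in case it helps the discussion: an extra script resource in the service whose status check fails when eth0 loses carrier, so rgmanager itself would trigger recovery. This is an untested sketch; the interface name, the sysfs carrier path, and the script name are our assumptions, not anything rgmanager ships:

```shell
#!/bin/sh
# Sketch of a rgmanager <script> resource, e.g. /usr/local/bin/check_eth0.sh,
# that only exists to monitor link state. rgmanager calls it with
# start/stop/status; a non-zero status return makes the service recover.

IFACE=eth0   # assumption: the interface carrying the service IP

link_up() {
    # /sys/class/net/<iface>/carrier reads "1" while the link is up
    [ "$(cat "/sys/class/net/$1/carrier" 2>/dev/null)" = "1" ]
}

agent() {
    case "$1" in
        start|stop) return 0 ;;           # nothing to start or stop
        status)     link_up "$IFACE" ;;   # non-zero here => recovery
        *)          echo "usage: $0 {start|stop|status}" ;;
    esac
}

agent "$@"
```

It would then be referenced alongside the existing resources, e.g. <script file="/usr/local/bin/check_eth0.sh" name="link_check"/> in the resources block and <script ref="link_check"/> inside the service (names are placeholders).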