LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 05-21-2008, 01:13 PM   #1
Maaras
LQ Newbie
 
Registered: May 2008
Location: San Diego, CA
Distribution: OpenSUSE, SLES
Posts: 3

Rep: Reputation: 0
Need help using Heartbeat + softdog to reboot my SLES system


I am trying to set up Heartbeat + softdog on a SLES 10 installation so that if the machine gets seriously hung for some unlikely reason, it will automatically reboot. I cannot seem to find much documentation on using softdog, and the mentions of it on www.linux-ha.org are sparse (in my opinion).

I'm working on a test machine to try to get this set up. I did a "baseline" install of SLES 10 (basically just accepting all defaults during installation), then installed the Heartbeat application after booting the machine from the hard drive for the first time. I am not setting up a cluster, but I was told that Heartbeat could easily trigger the watchdog to reboot my machine, so my "cluster" is a cluster of one machine. My ha.cf is pretty simple:

Code:
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
autojoin any
crm true
bcast eth0
watchdog /dev/watchdog
node testserver
respawn root /sbin/evmsd
apiauth evms uid=hacluster,root
I'm not trying to set up any failover of services or anything, so I don't have any resources set up. So with this in place, I reboot my machine, and it seems to come up just fine.
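
One way to rule out simple mistakes in the file before looking at the watchdog itself is to parse it and compare the timing directives against each other. The sketch below is only that, a sketch: the helper names (parse_ha_cf, to_seconds, check_timings) are made up for illustration, the accepted time syntax (plain seconds or an "ms" suffix) is an assumption, and the ordering rules it checks (keepalive < warntime < deadtime, initdead at least twice deadtime) are the commonly quoted guidelines, not something Heartbeat is confirmed to enforce.

```python
import re

# The ha.cf from the post above, embedded so the sketch is self-contained.
HA_CF = """\
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
autojoin any
crm true
bcast eth0
watchdog /dev/watchdog
node testserver
respawn root /sbin/evmsd
apiauth evms uid=hacluster,root
"""

# Directives in that file that carry a time value.
TIME_KEYS = ("keepalive", "warntime", "deadtime", "initdead")


def parse_ha_cf(text):
    """Return {directive: [argument strings]}, skipping blanks and # comments."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        cfg.setdefault(parts[0], []).append(parts[1] if len(parts) > 1 else "")
    return cfg


def to_seconds(value):
    """Convert '30' or '500ms' to seconds (assumed syntax, check the docs)."""
    match = re.fullmatch(r"(\d+)(ms)?", value.strip())
    if not match:
        raise ValueError("unrecognised time value: %r" % value)
    number = int(match.group(1))
    return number / 1000.0 if match.group(2) else float(number)


def check_timings(cfg):
    """Return a list of warnings; an empty list means nothing obvious is off."""
    t = {k: to_seconds(cfg[k][-1]) for k in TIME_KEYS if k in cfg}
    warnings = []
    if "keepalive" in t and "warntime" in t and t["keepalive"] >= t["warntime"]:
        warnings.append("keepalive should be smaller than warntime")
    if "warntime" in t and "deadtime" in t and t["warntime"] >= t["deadtime"]:
        warnings.append("warntime should be smaller than deadtime")
    if "initdead" in t and "deadtime" in t and t["initdead"] < 2 * t["deadtime"]:
        warnings.append("initdead is usually at least twice deadtime")
    if "watchdog" not in cfg:
        warnings.append("no watchdog directive found")
    return warnings
```

Calling check_timings(parse_ha_cf(HA_CF)) on the file above returns an empty list, so by these rules the timing values themselves look consistent and the problem is more likely on the watchdog device side.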

It is at this point that I have questions. How can I test my setup to be sure that it is configured properly and works the way I want it to? I looked in the log file and the only mention I see of the watchdog is a line like the following:

Code:
heartbeat[3549]: 2008/05/19_16:40:52 ERROR: WDIOC_SETTIMEOUT: Failed to set watchdog timer to 31 seconds.: Invalid argument
This makes me think that I have something configured wrong, but as I mentioned above, I have not been very successful in finding much detailed documentation about what I'm trying to do here.
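
To narrow down the error, the same request can be issued by hand. WDIOC_SETTIMEOUT is the standard Linux watchdog ioctl for changing the timeout, and "Invalid argument" is EINVAL coming back from the driver. The 31 seconds in the message looks like deadtime (30) plus one, but that is an inference from the log, not something confirmed here. The following is a minimal sketch, assuming Python 3 is available, that Heartbeat is stopped first (the device typically allows only one opener), and that the ioctl numbers follow the generic layout used on x86. The function name probe_timeout is hypothetical.

```python
import errno
import fcntl
import os
import struct

# ioctl number layout from asm-generic/ioctl.h (x86, ARM and others).
# Some architectures encode the direction bits differently, so treat
# these values as an assumption and compare with <linux/watchdog.h>.
_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, magic, nr, size):
    """Build an ioctl request number: dir | size | type | nr."""
    return (direction << 30) | (size << 16) | (ord(magic) << 8) | nr


INT_SIZE = struct.calcsize("i")
WDIOC_KEEPALIVE = _ioc(_IOC_READ, "W", 5, INT_SIZE)
WDIOC_SETTIMEOUT = _ioc(_IOC_READ | _IOC_WRITE, "W", 6, INT_SIZE)
WDIOC_GETTIMEOUT = _ioc(_IOC_READ, "W", 7, INT_SIZE)


def probe_timeout(seconds, device="/dev/watchdog"):
    """Ask the driver to accept a timeout; return (ok, message).

    WARNING: opening the device arms the watchdog. If the driver was
    built or loaded with "nowayout", the magic close below will NOT
    disarm it and the machine will reboot when the timer expires.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        try:
            reply = fcntl.ioctl(fd, WDIOC_SETTIMEOUT, struct.pack("i", seconds))
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return False, "WDIOC_SETTIMEOUT(%d) failed: %s" % (seconds, name)
        # The driver writes back the timeout it actually applied.
        actual = struct.unpack("i", reply)[0]
        return True, "driver accepted the request, timeout is now %d s" % actual
    finally:
        os.write(fd, b"V")  # magic close: ask the driver to disarm on close
        os.close(fd)
```

Run as root on the test machine only, for example print(probe_timeout(31)). If this also fails with EINVAL for 31 but succeeds for other values, the driver behind /dev/watchdog is rejecting that particular timeout; if it succeeds, the problem is more likely in how Heartbeat opens or configures the device.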

Am I even barking up the right tree? Is there an easier/better way to monitor general system health and reboot if there is an issue?
 
  


All times are GMT -5.
