Power Failure / RAID Failure / HUD Failure / Now What?
TB 2.4.0 / Asterisk 1.4.16 / CentOS 5, No YUMs.
I have a system that seems to be in a bad power area. A new UPS and new batteries in the other UPS units helped some, but power was down just too long and the system failed hard. What tool are people using to watch APC UPS units and shut down when they go low?
Now I have what appears to be a failed RAID set.
http://www.trixbox.org/forums/trixbox-forums/open-discussion/sata...
The IRCD, and therefore HUDLite, did not come back up; I had to restart them by hand. Has anyone written a cron job to watch these two tasks and restart them if one or the other fails?
/var/log/messages is filling up with the error below, and I have no idea what CentOS wants done. Has anyone seen it before and know what the corrective action should be? Is it related to the RAID failure?
TIA....
----------------------------------------------------------------------------------------------------
Jan 30 18:20:47 trixbox1 smartd[2730]: Device: /dev/sda, not capable of SMART self-check
Jan 30 18:20:47 trixbox1 smartd[2730]: Device: /dev/sda, failed to read SMART Attribute Data
Jan 30 18:50:47 trixbox1 smartd[2730]: Device: /dev/sda, not capable of SMART self-check
Jan 30 18:50:47 trixbox1 smartd[2730]: Device: /dev/sda, failed to read SMART Attribute Data
Jan 30 19:20:47 trixbox1 smartd[2730]: Device: /dev/sda, not capable of SMART self-check
Jan 30 19:20:47 trixbox1 smartd[2730]: Device: /dev/sda, failed to read SMART Attribute Data
Jan 30 19:50:47 trixbox1 smartd[2730]: Device: /dev/sda, not capable of SMART self-check
Jan 30 19:50:47 trixbox1 smartd[2730]: Device: /dev/sda, failed to read SMART Attribute Data
Hi, I can't really help you with your immediate problem, but no, it appears the problem is not related to the RAID itself but rather to the SMART monitoring: if you google for "Device: /dev/sda, not capable of SMART self-check" you will come across several people who report a lot of SMART messages like yours. Their cure has been to disable smartd altogether.
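Before turning smartd off, it may be worth checking what the drive itself reports. This is only a sketch: it assumes smartmontools is installed and that /dev/sda is a SATA disk that needs `-d ata` (an assumption about this box, not something the log proves). The helper name `smart_enabled` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -i` output on stdin and
# succeeds only if the drive reports SMART as enabled.
smart_enabled() {
    grep -q '^SMART support is:[[:space:]]*Enabled'
}

# Only touch the system when called with --apply (run as root).
if [ "${1:-}" = "--apply" ]; then
    # "-d ata" is an assumption for a SATA disk; drop it if it does not apply.
    if smartctl -d ata -i /dev/sda | smart_enabled; then
        echo "SMART is enabled on /dev/sda; smartd may only need '-d ata' in /etc/smartd.conf"
    else
        # The cure others reported: stop smartd and keep it off at boot.
        service smartd stop
        chkconfig smartd off
    fi
fi
```

Run without arguments it does nothing; with `--apply` it either prints a hint or disables smartd.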
Besides IRCD and HUDLite, is there anything else going wrong? Can you access all of your disks? Perhaps posting a dmesg would be appropriate.
Regarding your other question, look up www.networkupstools.org or, since you are using APC, www.apcupsd.org.
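On the cron question for IRCD and HUDLite, here is a rough watchdog sketch. It assumes both daemons have init scripts that answer `status` with a non-zero exit code when the daemon is down; the script names `/etc/init.d/ircd` and `/etc/init.d/hudlite-server` are guesses, so check /etc/init.d on your box first.

```shell
#!/bin/sh
# Watchdog sketch for cron. Assumes each init script answers
# "status" with a non-zero exit code when the daemon is down.

ensure_running() {
    svc="$1"
    if "$svc" status >/dev/null 2>&1; then
        return 0                      # already up, nothing to do
    fi
    # Log the restart if logger is available; ignore logging errors.
    if command -v logger >/dev/null 2>&1; then
        logger -t svc-watchdog "$svc is down, starting it" || true
    fi
    "$svc" start
}

# Init script names below are assumptions; adjust to your system.
# IRCD goes first because HUDLite depends on it.
for svc in /etc/init.d/ircd /etc/init.d/hudlite-server; do
    if [ -x "$svc" ]; then
        ensure_running "$svc"
    fi
done
```

Saved as, say, /usr/local/sbin/svc-watchdog.sh (a hypothetical path), it could be run from root's crontab with a line like `*/5 * * * * /usr/local/sbin/svc-watchdog.sh`.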
Well,
I would think it should be, but I am going to research the two links referenced above and hopefully have something working soon. The RAID and smartd errors are more troublesome right now.
Other than my nervousness about the RAID, because power at this facility/office is so unstable, the system has actually been real stable.
------------------------
Hoping someone with more CentOS experience than I have can explain these messages from dmesg regarding md2 & sda3 after the first boot following the power failure....
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:0:0: Attached scsi generic sg1 type 0
hda: ATAPI 48X CD-ROM drive, 96kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
FDC 0 is a National Semiconductor PC87306
lp: driver loaded but no devices found
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sda3 ...
md: adding sda3 ...
md: md2 already running, cannot run sda3
md: export_rdev(sda3)
md: ... autorun DONE.
tia -------------------
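One hedged reading of the md2/sda3 lines above is that md2 was already started without sda3, so the array may be running degraded. A quick way to check is to look for an underscore in the status field of /proc/mdstat (for example `[_U]` instead of `[UU]`). The function name below is made up, and the mdadm lines in the comments are things to try, not a confirmed fix.

```shell
#!/bin/sh
# Print the name of every md array that has a missing member,
# i.e. an underscore in the [UU] status field of /proc/mdstat.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name }' "${1:-/proc/mdstat}"
}

if [ -r /proc/mdstat ]; then
    degraded_arrays
fi

# If md2 shows up here, these show which partition dropped out:
#   mdadm --detail /dev/md2
#   mdadm --examine /dev/sda3
# Only once the disk is known to be good, this would re-add it (assumed fix):
#   mdadm /dev/md2 --add /dev/sda3
```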
To install apcupsd from source:
-------------------------------------------------------------------------------
yum install gcc-c++
cd /usr/src
wget http://downloads.sourceforge.net/apcupsd/apcupsd-3.14.3.tar.gz
tar xvzf apcupsd-3.14.3.tar.gz
cd apcupsd-3.14.3
CFLAGS="-g -O2" LDFLAGS="-g" ./configure --enable-usb --with-upstype=usb --with-upscable=usb --prefix=/usr --sbindir=/sbin --with-cgi-bin=/var/www/cgi-bin --enable-cgi --with-css-dir=/var/www/docs/css --with-log-dir=/etc/apcupsd --enable-pthreads --enable-powerflute
make
make install
-----------------------------------------------------------------------------
To check the location of apcupsd:
whereis apcupsd
If compiled correctly, one of the locations will be /sbin/apcupsd
Then edit the UPS device type and connection type:
nano /etc/apcupsd/apcupsd.conf
I use an older model APC Smart UPS 620 with a serial interface, so I use:
UPSTYPE apcsmart
UPSCABLE smart
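Since the original question was about shutting down when the battery runs low: the same file also has BATTERYLEVEL, MINUTES and TIMEOUT directives that decide when apcupsd starts the shutdown, and a serial UPS needs DEVICE pointed at the right port. A small sketch to list the active lines; the function name is made up, and the path is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: list the connection and shutdown-threshold
# directives that are active (not commented out) in apcupsd.conf.
ups_settings() {
    grep -E '^[[:space:]]*(UPSTYPE|UPSCABLE|DEVICE|BATTERYLEVEL|MINUTES|TIMEOUT)([[:space:]]|$)' \
        "${1:-/etc/apcupsd/apcupsd.conf}"
}

if [ -r /etc/apcupsd/apcupsd.conf ]; then
    ups_settings || true
fi
```

If BATTERYLEVEL or MINUTES turn out lower than you are comfortable with for a site with long outages, that is the place to raise them.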
To start apcupsd monitoring, type:
/etc/init.d/apcupsd start
To check status (including battery runtime etc):
/etc/init.d/apcupsd status
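The `apcaccess status` command that ships with apcupsd prints the same kind of report as "NAME : value" lines, which is handy if you want just one number in a script. A minimal sketch, assuming apcaccess is on the PATH and the daemon is running; `ups_field` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: pull one field (e.g. BCHARGE, TIMELEFT, STATUS)
# out of `apcaccess status` output read from stdin.
ups_field() {
    awk -v key="$1" '{
        i = index($0, ":")            # split on the first colon only
        if (i == 0) next
        k = substr($0, 1, i - 1); sub(/ +$/, "", k)
        if (k == key) {
            v = substr($0, i + 1)
            sub(/^ +/, "", v); sub(/ +$/, "", v)
            print v
        }
    }'
}

if command -v apcaccess >/dev/null 2>&1; then
    apcaccess status | ups_field TIMELEFT
fi
```

Splitting on the first colon only keeps values such as the DATE line, which contains colons of its own, in one piece.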
A Dell SC-440 can run on a cheap SmartUPS 620 (you can get one for $70) for ~50-60 mins.
Thank you,
Vadim

