HA Cluster
So far I have been able to configure DRBD and HA, and all of the services (including amportal) start when there is a failure. What I have not been able to find is a full listing of the directory structure of the trixbox installation. I need this in order to configure drbdlinks. This is what I have so far, gleaned from this article:
http://www.danielaliaman.com/blog/files/phonecube/cluster/Asteris...
If anyone can point me in the right direction, that would be awesome!
PS: I can help anyone else get this far as well, and if you have already got it figured out, let me know!
/var/account
/var/ftp
/var/nwebmail
/var/spool/asterisk
/var/spool/clientmqueue
/var/spool/mqueue
/var/spool/vbox
/var/trixbox_load
/var/www
/var/lib/asterisk
/var/lib/ircd
/var/lib/mysql
/var/lib/php
/etc/asterisk
/etc/httpd
/etc/ircd
/etc/mail
/etc/php.d
/etc/vsftpd
/usr/lib/httpd
/etc/vsftpd.ftpusers
/etc/vsftpd.user_list
/etc/aliases
/etc/aliases.db
/etc/dhcpd.conf
/etc/my.cnf
/etc/php.ini
/etc/xinetd.conf
/etc/xinetd.d


I will share everything I know. Please help improve it.
2 identical boxes
Trixbox1.local eth0 publicip eth1 10.0.10.2 2x160gb hwRAID1
Trixbox2.local eth0 publicip eth1 10.0.10.3 2x160gb hwRAID1
eth1 via crossover gigabit nic
Both are set up as
/boot 101mb
/ 75000mb
/drbd0 75000mb
/swap 2048mb
OS: CentOS 5 Final 2.6.18-8.1.14.el5 SMP
Drbd 8.0.4-1.el5
Kmod-Drbd 8.0.4-1.2.6.18_8.1.14.el5
/etc/drbd.conf on both
global {
dialog-refresh 5; # 5 seconds
usage-count yes;
}
common {
syncer { rate 120M; }
}
resource drbd0 {
protocol C;
handlers {
pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
}
startup {
wfc-timeout 60;
degr-wfc-timeout 120; # 2 minutes.
}
disk {
on-io-error detach;
fencing resource-and-stonith;
}
net {
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict call-pri-lost;
}
syncer {
rate 120M;
}
on trixbox1.local {
device /dev/drbd0;
disk /dev/mapper/hpt45x_dejfaehcfr4;
address 10.0.10.2:7788;
flexible-meta-disk internal;
}
on trixbox2.local {
device /dev/drbd0;
disk /dev/mapper/hpt45x_dejfaehcfp4;
address 10.0.10.3:7788;
meta-disk internal;
}
}
-------------------------------------------------------------
Reboot
On trixbox1.local: drbdadm -- --overwrite-data-of-peer primary all
Then, still on trixbox1.local: mkfs.ext3 /dev/drbd0
On trixbox 1 & 2, add this line to /etc/fstab: /dev/drbd0 /share ext3 noauto 0 0
On trixbox 1 & 2: mkdir /share
RESULTS:
[trixbox1.local ~]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by buildsvn[at]c5-i386-build, 2007-09-29 06:28:57
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:75189868 nr:0 dw:1340508 dr:73849428 al:1138 bm:4665 lo:0 pe:0
ua:0 ap:0
resync: used:0/31 hits:4736102 misses:4940 starving:0 dirty:0
changed:4940
act_log: used:0/127 hits:333989 misses:2974 starving:2
dirty:1834 changed:1138
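Before moving on to heartbeat it is worth confirming that the two nodes really are connected and in sync, as in the output above. A minimal sketch, assuming the 8.0-style /proc/drbd format and a single resource; the function name drbd_in_sync is made up for illustration, and it takes the status file as an optional argument so it can be tried against a saved copy:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the DRBD status file reports a
# connected resource whose local and peer disks are both up to date.
# Assumes the DRBD 8.0 /proc/drbd layout (cs:... st:... ds:...).
drbd_in_sync() {
    status_file=${1:-/proc/drbd}
    grep -q 'cs:Connected' "$status_file" &&
        grep -q 'ds:UpToDate/UpToDate' "$status_file"
}

# Example use on a node:
#   drbd_in_sync && echo "drbd0 is connected and up to date"
```

With more than one resource this check is too coarse, since it does not look at each resource line separately.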
On both: yum install heartbeat -y
Installed: heartbeat.i386 0:2.1.2-3.el5.centos
Dependency Installed: heartbeat-pils.i386 0:2.1.2-3.el5.centos
heartbeat-stonith.i386 0:2.1.2-3.el5.centos
openhpi.i386 0:2.4.1-6.el5.1
On both, /etc/ha.d/authkeys:
auth 1
1 crc
On both: chmod 600 /etc/ha.d/authkeys
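A note on the crc method: it only guards against corrupted packets, not forged ones, which is reasonable on a dedicated crossover cable like the one here. If the heartbeat link ever shares a network, heartbeat also accepts sha1 (or md5) with a shared key. A sketch of that variant, assuming od and /dev/urandom are available; write_authkeys is a hypothetical helper name, and the resulting file has to be copied to the second node, not generated twice:

```shell
#!/bin/sh
# Hypothetical helper: write a heartbeat authkeys file that uses sha1
# with a random 32-hex-digit key, readable only by its owner.
write_authkeys() {
    target=$1
    # 16 random bytes printed as hex, with spaces and newlines removed
    key=$(od -An -tx1 -N16 /dev/urandom | tr -d ' \n')
    (
        umask 077
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$target"
    )
    chmod 600 "$target"
}

# Example use:
#   write_authkeys /etc/ha.d/authkeys
```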
On both /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 200ms
deadtime 2
warntime 1
initdead 120
udpport 694
bcast eth1
auto_failback on
node trixbox1.local
node trixbox2.local
on both /etc/ha.d/haresources
trixbox1.local xxx.xxx.xxx.xxx/27/eth0 drbddisk::drbd0 Filesystem::/dev/drbd0::/share::ext3 mysqld sendmail asterisk httpd ircd xinetd
On both, do the following:
chkconfig --levels 345 mysqld off
chkconfig --levels 345 sendmail off
chkconfig --levels 345 asterisk off
chkconfig --levels 345 httpd off
chkconfig --levels 345 ircd off
chkconfig --levels 345 xinetd off
chkconfig --levels 345 heartbeat on
service mysqld stop
service sendmail stop
service asterisk stop
service httpd stop
service ircd stop
service xinetd stop
service heartbeat start
On both, edit /etc/rc.local and remove the amportal entry.
Then create the following script:
vi /etc/init.d/amportal
#!/bin/sh
#
# Init wrapper so that heartbeat can start and stop amportal.
#
# Source function library.
. /etc/rc.d/init.d/functions
RETVAL=0
PROCNAME=portal
# See how we were called.
case "$1" in
    start)
        /usr/sbin/amportal start
        # Always reports success, whatever amportal itself returned.
        RETVAL=0
        echo
        ;;
    stop)
        /usr/sbin/amportal stop
        # Always reports success, whatever amportal itself returned.
        RETVAL=0
        echo
        ;;
    status)
        status $PROCNAME
        RETVAL=$?
        ;;
    restart|reload)
        $0 stop
        $0 start
        RETVAL=$?
        ;;
    *)
        echo "Usage: amportal {start|stop|status|restart}"
        exit 1
        ;;
esac
exit $RETVAL
Then make the amportal script executable:
chmod 755 /etc/init.d/amportal
wget ftp://anonymous:fireftp@example.com@ftp.tummy.com/pub/tummy/drbdl...
Make sure that all services are halted.
Next, tar up the directories we need and move all of their data to /share.
Below is the drbdlinks.conf file. Use it as a template to see which files and directories need to be copied.
drbdlinks.conf:
usebindmount(1)
mountpoint('/share')
link('/var/account')
link('/var/ftp')
link('/var/spool/asterisk')
link('/var/spool/clientmqueue')
link('/var/spool/mqueue')
link('/var/spool/vbox')
link('/var/trixbox_load')
link('/var/www')
link('/var/lib/asterisk')
link('/var/lib/dav')
link('/var/lib/ircd')
link('/var/lib/mysql')
link('/var/lib/php')
link('/etc/asterisk')
link('/etc/httpd')
link('/etc/ircd')
link('/etc/mail')
link('/etc/php.d')
link('/tftpboot')
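The copy step described above could be sketched roughly like this. It assumes /share is mounted on the current primary, all services are stopped, and that each path keeps its full name under /share (so /var/lib/mysql ends up as /share/var/lib/mysql, matching the my.cnf change further down). copy_to_share is a hypothetical helper name, and it uses cp -a instead of tar for brevity:

```shell
#!/bin/sh
# Hypothetical helper: copy one path into the shared mount, keeping its
# full path underneath it (/var/www -> /share/var/www).
# The second argument defaults to /share.
copy_to_share() {
    src=$1
    share=${2:-/share}
    if [ ! -e "$src" ]; then
        echo "skipping $src (not present)"
        return 0
    fi
    parent=$(dirname "$src")
    mkdir -p "$share$parent"
    # -a preserves ownership, permissions and timestamps
    cp -a "$src" "$share$parent/"
}

# Example use, with a few of the paths from drbdlinks.conf:
#   for p in /var/www /var/lib/mysql /etc/asterisk; do
#       copy_to_share "$p"
#   done
```

The originals are left in place, so nothing is lost if a copy has to be redone.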
Now install the drbdlinks rpm.
rpm -ivh drbdlinks-1.09-1.noarch.rpm
Add drbdlinks to /etc/ha.d/haresources so that it starts when the nodes swap roles. In the line below it goes right after the Filesystem resource, and amportal is added before xinetd:
trixbox1.local xxx.xxx.xxx.xxx/24/eth0 drbddisk::drbd0 Filesystem::/dev/drbd0::/share::ext3 drbdlinks mysqld sendmail asterisk httpd ircd amportal xinetd
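heartbeat looks up each resource name in haresources as a script in /etc/ha.d/resource.d or /etc/init.d, so a typo or a missing init script (amportal, for example) only shows up when heartbeat tries to start the resources. A rough pre-flight check, as a sketch: check_resources is a hypothetical helper, and treating any token that starts with a digit (or with the xxx placeholder used here) as the IP address entry is a simplifying assumption.

```shell
#!/bin/sh
# Hypothetical helper: check that every resource named on a haresources
# line has an executable script in one of the given directories.
# Usage: check_resources "LINE" DIR [DIR...]
check_resources() {
    line=$1
    shift
    missing=0
    first=1
    for tok in $line; do
        # The first token is the preferred node name, not a resource.
        if [ "$first" -eq 1 ]; then
            first=0
            continue
        fi
        case $tok in
            # Assumption: these tokens are the service IP address entry.
            [0-9]*|xxx.*) continue ;;
        esac
        name=${tok%%::*}    # strip ::arguments, e.g. drbddisk::drbd0
        found=0
        for dir in "$@"; do
            [ -x "$dir/$name" ] && found=1
        done
        if [ "$found" -eq 0 ]; then
            echo "missing resource script: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example use:
#   check_resources "$(cat /etc/ha.d/haresources)" /etc/ha.d/resource.d /etc/init.d
```

The example call assumes haresources holds a single uncommented line, as it does in this setup.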
Also, edit /etc/my.cnf
vi /etc/my.cnf
datadir=/share/var/lib/mysql
socket=/share/var/lib/mysql/mysql.sock
There will be a permissions problem if you try to use the endpoint manager. Just chmod the directory that is shown on the endpoint manager page (module_11????) to 755 as well; that is how I fixed it.
Well, I think that is all.