Elastix 2.5 + heartbeat + drbd

As some of you know, I've got so much luck that even at my new job they want a new phone system 🙂 But a massive one, with failover and under 20 sec of downtime.

So what we’ve got:
Elastix 2.5 stable
Heartbeat for failover
DRBD: network-mirrored RAID, which holds all the Elastix and Asterisk confs etc.

Our goals:
Stability, Reliability, High-availability

Let's fucking do this shit!

I tested it with 2 virtual Elastix servers, each with an 8 GB HDD attached, which I used for the network mirroring.
I followed THE TUTORIAL, the one tutorial for Elastix HA + DRBD.

/dev/sda1 - /
/dev/sda2 - swap
/dev/sdb1 - /replica

!!!Remember, these partitions MUST be identical on both PCs, especially /dev/sdb1, where our Elastix lives.
(By identical I mean completely identical: the start block and end block must be the same on both.)
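One way to check this without eyeballing fdisk output, as a rough sketch: dump the partition table with `sfdisk -d` on each machine and compare the start and size of /dev/sdb1. The helper name `part_geometry` is made up for this example, and it assumes the usual `start=..., size=...` layout of an sfdisk dump.

```bash
# part_geometry PARTITION
# Reads an `sfdisk -d` dump on stdin and prints "START SIZE" (in sectors)
# for the given partition. Hypothetical helper, not part of any package.
part_geometry() {
    grep "^$1 " |
        sed -e 's/.*start= *\([0-9][0-9]*\).*size= *\([0-9][0-9]*\).*/\1 \2/'
}

# Usage (run from voipserver.drbd; both commands should print the same pair):
#   sfdisk -d /dev/sdb | part_geometry /dev/sdb1
#   ssh root@voipbackup.drbd sfdisk -d /dev/sdb | part_geometry /dev/sdb1
```

If the two pairs differ, fix the partitioning before going any further.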

1. When everything is installed, we need to create the partition on /dev/sdb

[root@voipSERVER.drbd /]# fdisk /dev/sdb
p
n
----
t - 83
w

2. Format :

[root@voipSERVER.drbd /]# mke2fs -j /dev/sdb1

3. Just in case, we are going to overwrite the start of the partition with zeros:

[root@voipSERVER.drbd /]# dd if=/dev/zero bs=1M count=500 of=/dev/sdb1; sync

4. Installing drbd and heartbeat:

yum install heartbeat drbd83 kmod-drbd83

Note: If by any chance you experience problems with drbd83, use the drbd82 version (64-bit versions).

5. Now we need to edit /etc/hosts to be sure that the IP name resolution will be ok

192.168.0.242 voipserver.drbd
192.168.0.243 voipbackup.drbd

6. Edit /etc/drbd.conf on the Primary one:

global { usage-count no; }
resource r0 {
    protocol C;
    startup { wfc-timeout 10; degr-wfc-timeout 30; }
    disk { on-io-error detach; }
    net {
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
        cram-hmac-alg "sha1";
        shared-secret "SECRET PASSWD";
    }
    syncer { rate 5M; }
    on voipserver.drbd {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.0.242:7788;
        meta-disk internal;
    }
    on voipbackup.drbd {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.0.243:7788;
        meta-disk internal;
    }
}
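One thing that bites people here: the names after `on` in drbd.conf have to match what `uname -n` prints on each node, otherwise drbdadm cannot find a section for the local host. A small check, as a sketch only; `conf_hosts` and `host_in_conf` are names invented for this example.

```bash
# conf_hosts FILE
# Print every host name that follows the keyword "on" at the start of a line.
conf_hosts() {
    awk '$1 == "on" { print $2 }' "$1"
}

# host_in_conf FILE [NAME]
# Succeed if NAME (default: the output of `uname -n`) has an "on" section.
host_in_conf() {
    local name="${2:-$(uname -n)}"
    conf_hosts "$1" | grep -qxF -- "$name"
}

# Usage, on each node:
#   host_in_conf /etc/drbd.conf || echo "no 'on $(uname -n)' section in drbd.conf"
```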

Note:
The after-sb-* lines in the net section help the servers recover from split brain. Split brain is when both servers have been in primary mode at the same time, so DRBD needs to know which one should keep the primary role and whose changes get discarded.
Reference:

  • http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html

    7. Replicate this config file to the second server

    [root@voipSERVER.drbd /]# scp /etc/drbd.conf root@voipbackup.drbd:/etc/

    8. Initialize the meta-data area on disk before starting DRBD (on both servers!)

    drbdadm create-md r0

    * Start drbd on both nodes (service drbd start)

    service drbd start

    * Verify that both servers are secondary

    cat /proc/drbd
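If you want to check this from a script rather than by eye, the fields can be pulled out of /proc/drbd. A sketch, assuming the DRBD 8.3 layout where the status line looks like ` 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----` (as far as I know, 8.2 prints `st:` instead of `ro:`); `drbd_field` is a made-up helper name.

```bash
# drbd_field NAME
# Read /proc/drbd-style text on stdin and print the value of the first
# whitespace-separated field that starts with "NAME:".
drbd_field() {
    awk -v key="$1:" '{
        for (i = 1; i <= NF; i++)
            if (index($i, key) == 1) { print substr($i, length(key) + 1); exit }
    }'
}

# Usage:
#   drbd_field ro < /proc/drbd    # roles, local/peer
#   drbd_field cs < /proc/drbd    # connection state
```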

    9. As you can see, both nodes are secondary, which is normal. We need to decide
    which node will act as primary now (voipserver.drbd). Running the following on that node
    initiates the first ‘full sync’ between the two nodes:

    drbdadm -- --overwrite-data-of-peer primary r0

    10. Launch this command and wait until it has finished synchronizing

    watch -n 1 cat /proc/drbd
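If you would rather have a script block until the initial sync is done, something along these lines should work. It is only a sketch: `is_synced` and `wait_for_sync` are hypothetical names, and it assumes the status line shows `ds:UpToDate/UpToDate` once both disks are in sync.

```bash
# is_synced
# Read /proc/drbd-style text on stdin; succeed if both disks are UpToDate.
is_synced() {
    grep -q 'ds:UpToDate/UpToDate'
}

# wait_for_sync [STATUS_FILE] [INTERVAL_SECONDS]
# Poll the status file (default /proc/drbd) until is_synced succeeds.
wait_for_sync() {
    local status_file="${1:-/proc/drbd}" interval="${2:-5}"
    until is_synced < "$status_file"; do
        sleep "$interval"
    done
}

# Usage:
#   wait_for_sync && echo "initial sync finished"
```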

    11. We can now format /dev/drbd0 and mount it on voipserver.drbd:

    [root@voipSERVER.drbd /]# mkfs.ext3 /dev/drbd0
    [root@voipSERVER.drbd /]# mkdir /replica

    [root@voipSERVER.drbd /]# mount /dev/drbd0 /replica

    12. We can determine the role of a server by executing the following:
    drbdadm role r0
    The primary server should return:

    Primary/Secondary

    13. Now we will copy all of the directories we want synchronized between the two
    servers to our new partition, remove the original directories and then create
    symbolic links to replace them on voipserver.drbd.
    Note: If you use the 64-bit version of Elastix, the line tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/ should look like tar -zcvf usr-lib-asterisk.tgz /usr/lib64/asterisk/ (and the matching rm and ln lines should presumably use /usr/lib64/asterisk too).

    cd /replica

    amportal chown

    tar -zcvf etc-asterisk.tgz /etc/asterisk
    tar -zxvf etc-asterisk.tgz
    tar -zcvf var-lib-asterisk.tgz /var/lib/asterisk
    tar -zxvf var-lib-asterisk.tgz
    tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/
    tar -zxvf usr-lib-asterisk.tgz
    tar -zcvf var-spool-asterisk.tgz /var/spool/asterisk/
    tar -zxvf var-spool-asterisk.tgz
    tar -zcvf var-lib-mysql.tgz /var/lib/mysql/
    tar -zxvf var-lib-mysql.tgz
    tar -zcvf var-log-asterisk.tgz /var/log/asterisk/
    tar -zxvf var-log-asterisk.tgz
    tar -zcvf var-www.tgz /var/www/
    tar -zxvf var-www.tgz
    rm -rf /etc/asterisk
    rm -rf /var/lib/asterisk
    rm -rf /usr/lib/asterisk/
    rm -rf /var/spool/asterisk
    rm -rf /var/www

    rm -rf /var/lib/mysql/
    rm -rf /var/log/asterisk/
    ln -s /replica/etc/asterisk/ /etc/asterisk
    ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
    ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
    ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
    ln -s /replica/var/lib/mysql/ /var/lib/mysql
    ln -s /replica/var/log/asterisk/ /var/log/asterisk
    ln -s /replica/var/www /var/www
    cd /
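The copy / remove / link sequence above is the same three steps repeated for every directory, so it can be wrapped in a function. This is a sketch, not what the tutorial does: it uses `cp -a` instead of the tar round trip, and `relocate` is a name I made up. It refuses to overwrite anything that already exists under the replica.

```bash
# relocate SRC REPLICA
# Copy directory SRC (absolute path) to REPLICA/SRC, remove the original and
# leave a symlink in its place. cp -a keeps owners, modes and timestamps.
relocate() {
    local src="${1%/}" replica="${2%/}"
    local dest="$replica$src"
    [ -L "$src" ] && return 0                      # already a symlink, nothing to do
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ -e "$dest" ] && { echo "already exists: $dest" >&2; return 1; }
    mkdir -p "$(dirname "$dest")" &&
        cp -a "$src" "$dest" &&
        rm -rf "$src" &&
        ln -s "$dest" "$src"
}

# Usage (with the services stopped and /replica mounted):
#   for d in /etc/asterisk /var/lib/asterisk /usr/lib/asterisk \
#            /var/spool/asterisk /var/lib/mysql /var/log/asterisk /var/www; do
#       relocate "$d" /replica || break
#   done
```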

    Stop the mysqld, asterisk, httpd and Elastix services on voipserver.drbd:

    service mysqld restart
    service mysqld stop
    service asterisk stop
    service httpd stop
    service elastix-updaterd stop
    service elastix-portknock stop

    14. Verify services are down and proceed to switch manually to the second server:

    [root@voipSERVER.drbd /]# umount /replica ; drbdadm secondary r0

    15. Now switch to the VOIPBACKUP server

    [root@voipBACKUP.drbd /]# mkdir /replica ; drbdadm primary r0 ; mount /dev/drbd0 /replica
    [root@voipBACKUP.drbd /]# ls /replica/

    Note: This is used to check that the information is being replicated to both servers. You should
    see all the data on the secondary server, just like on the primary.
    * DO NOT perform this action while logged in on the physical terminal. Use SSH. Otherwise, it will fail to
    unmount the /replica folder for some reason! Also make sure you are not IN the replica folder. Type “cd /”.
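To avoid the "I was still sitting in /replica" mistake, the umount can be wrapped in a small guard. A sketch only: `is_inside` and `safe_umount` are invented names. If umount still complains that the device is busy, `fuser -vm /replica` (from psmisc) should list whatever is holding it.

```bash
# is_inside DIR PARENT
# Succeed if DIR is PARENT itself or lies somewhere below it.
is_inside() {
    case "${1%/}/" in
        "${2%/}"/*) return 0 ;;
        *)          return 1 ;;
    esac
}

# safe_umount MOUNTPOINT
# Refuse to unmount while the current shell is sitting inside the mount point.
safe_umount() {
    if is_inside "$PWD" "$1"; then
        echo "cd out of $1 first" >&2
        return 1
    fi
    umount "$1"
}

# Usage:
#   safe_umount /replica && drbdadm secondary r0
```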

    16. Verify voipserver.drbd status (Primary/Secondary)

    drbdadm role r0

    Note: Executing this same command in voipbackup.drbd while in secondary mode should
    not display the /dev/drbd0 partition unless it’s assuming primary mode.

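    17. Now we will remove the original directories and create the links on voipbackup.drbd. (Careful: the tar -zxvf lines below extract voipbackup's local copies over the files already in /replica. That is harmless with two fresh, identical installs, but if voipserver.drbd already holds real configuration you probably want to run only the rm and ln lines here.)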

    cd /replica

    amportal chown

    tar -zcvf etc-asterisk.tgz /etc/asterisk
    tar -zxvf etc-asterisk.tgz
    tar -zcvf var-lib-asterisk.tgz /var/lib/asterisk
    tar -zxvf var-lib-asterisk.tgz
    tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/
    tar -zxvf usr-lib-asterisk.tgz
    tar -zcvf var-spool-asterisk.tgz /var/spool/asterisk/
    tar -zxvf var-spool-asterisk.tgz
    tar -zcvf var-lib-mysql.tgz /var/lib/mysql/
    tar -zxvf var-lib-mysql.tgz
    tar -zcvf var-log-asterisk.tgz /var/log/asterisk/
    tar -zxvf var-log-asterisk.tgz
    tar -zcvf var-www.tgz /var/www/
    tar -zxvf var-www.tgz
    rm -rf /etc/asterisk
    rm -rf /var/lib/asterisk
    rm -rf /usr/lib/asterisk/
    rm -rf /var/spool/asterisk
    rm -rf /var/www

    rm -rf /var/lib/mysql/
    rm -rf /var/log/asterisk/
    ln -s /replica/etc/asterisk/ /etc/asterisk
    ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
    ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
    ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
    ln -s /replica/var/lib/mysql/ /var/lib/mysql
    ln -s /replica/var/log/asterisk/ /var/log/asterisk
    ln -s /replica/var/www /var/www
    cd /

    18. Stop the mysqld, asterisk, httpd and Elastix services on voipbackup.drbd

    service mysqld restart
    service mysqld stop
    service asterisk stop
    service httpd stop
    service elastix-updaterd stop
    service elastix-portknock stop

    19. Now switch back to the first server:
    [root@voipBACKUP.drbd /]# umount /replica/ ; drbdadm secondary r0

    20. Now switch to the VOIPSERVER server

    [root@voipSERVER.drbd /]# drbdadm primary r0 ; mount /dev/drbd0 /replica

    DRBD is working … let’s make sure that it always starts at boot:
    chkconfig drbd on

    21. Remember to disable at boot, on both servers, any services that should be controlled by heartbeat. Heartbeat will start them on whichever server is in control.

    chkconfig asterisk off
    chkconfig mysqld off
    chkconfig httpd off
    chkconfig elastix-updaterd off
    chkconfig elastix-portknock off
    service mysqld stop
    service asterisk stop
    service httpd stop
    service elastix-portknock stop
    service elastix-updaterd stop
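To double-check that none of these services is still set to start at boot, the output of `chkconfig --list` can be filtered. A sketch, with `still_enabled` being a hypothetical helper; it assumes the usual `name 0:off 1:off 2:on ...` line format.

```bash
# still_enabled SERVICE...
# Read `chkconfig --list` output on stdin and print each named service that
# is still "on" in at least one runlevel.
still_enabled() {
    awk -v want="$*" '
        BEGIN { n = split(want, names, " "); for (i = 1; i <= n; i++) wanted[names[i]] = 1 }
        ($1 in wanted) && /:on/ { print $1 }
    '
}

# Usage, on both servers (no output means everything is off):
#   chkconfig --list | still_enabled asterisk mysqld httpd elastix-updaterd elastix-portknock
```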

    22. Let’s configure a simple /etc/ha.d/ha.cf file on voipserver.drbd :

    debugfile /var/log/ha-debug
    logfile /var/log/ha-log

    logfacility local0
    keepalive 2
    deadtime 30
    warntime 10
    initdead 120
    udpport 694
    bcast eth0
    auto_failback on
    node voipserver.drbd
    node voipbackup.drbd
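The timing values are worth a sanity check. As I understand the heartbeat settings, `keepalive` is the interval between heartbeats, `warntime` is when a late heartbeat gets logged, and `deadtime` is how long a silent node is tolerated before it is declared dead, so they should increase in that order. It also means that with `deadtime 30` a failover cannot start before 30 seconds have passed, which matters if you have a downtime target. The helpers below are a sketch with invented names (`ha_value`, `ha_timing_ok`) and only handle values in whole seconds.

```bash
# ha_value FILE KEY
# Print the value of the last "KEY value" line in an ha.cf-style file.
ha_value() {
    awk -v key="$2" '$1 == key { v = $2 } END { if (v != "") print v }' "$1"
}

# ha_timing_ok FILE
# Succeed if keepalive < warntime < deadtime (whole seconds only).
ha_timing_ok() {
    local k w d
    k=$(ha_value "$1" keepalive)
    w=$(ha_value "$1" warntime)
    d=$(ha_value "$1" deadtime)
    [ -n "$k" ] && [ -n "$w" ] && [ -n "$d" ] &&
        [ "$k" -lt "$w" ] && [ "$w" -lt "$d" ]
}

# Usage:
#   ha_timing_ok /etc/ha.d/ha.cf || echo "check keepalive/warntime/deadtime"
```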

    23. Also create /etc/ha.d/authkeys on voipserver.drbd:

    auth 1
    1 sha1 MySecret

    24. Change permissions on the /etc/ha.d/authkeys file on voipserver.drbd:
    chmod 600 /etc/ha.d/authkeys
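Instead of typing a secret like MySecret by hand, the file can be generated with a random key and the right permissions in one go. A sketch, assuming `sha1sum` from coreutils is installed; `make_authkeys` is a made-up name.

```bash
# make_authkeys FILE
# Write a heartbeat authkeys file with a random sha1 key, readable by the owner only.
make_authkeys() {
    local file="$1" secret
    secret=$(head -c 32 /dev/urandom | sha1sum | cut -d' ' -f1) || return 1
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$secret" > "$file" ) &&
        chmod 600 "$file"
}

# Usage (the same file must then be copied to voipbackup.drbd, see step 27):
#   make_authkeys /etc/ha.d/authkeys
```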

    25. Edit /etc/ha.d/haresources on voipserver.drbd. (It is two lines, plus an optional third one for a second NIC!!! Formatting is
    important: each entry must stay on a single line.) Replace the email address on the second line with your own.

    voipserver.drbd drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 IPaddr::192.168.0.244/24/eth0/192.168.0.255 mysqld asterisk httpd elastix-updaterd elastix-portknock fop_start
    voipserver.drbd MailTo::hristo@computerassistance.uk.com::DRBD/HA-ALERT
    voipserver.drbd IPaddr::192.168.0.245/24/eth1/192.168.0.255

    Note: If you have a second NIC and you want it to fail over too, just add it here, like I did in the last line. The IP you set up there will then float between both servers.
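Every resource named in haresources needs a script that heartbeat can run; as far as I know it looks in /etc/ha.d/resource.d and /etc/init.d. Something like fop_start is not a stock script, so it is worth checking before starting heartbeat. The function below is a sketch (`missing_scripts` is an invented name) that lists resources without an executable script in the directories you pass in.

```bash
# missing_scripts HARESOURCES DIR...
# Print each resource name from the haresources file that has no executable
# script in any of the given directories.
missing_scripts() {
    local file="$1"; shift
    local node res name dir found
    # Skip comments and blank lines; the first word on a line is the node name.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$file" |
    while read -r node res; do
        for name in $res; do
            name="${name%%::*}"                    # strip the ::arguments
            found=no
            for dir in "$@"; do
                [ -x "$dir/$name" ] && found=yes && break
            done
            [ "$found" = yes ] || echo "$name"
        done
    done | sort -u
}

# Usage:
#   missing_scripts /etc/ha.d/haresources /etc/ha.d/resource.d /etc/init.d
```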

    26. Start the heartbeat service on voipserver.drbd :
    service heartbeat start

    27. Now replicate ha.cf, authkeys and haresources to voipbackup.drbd and start heartbeat there:

    [root@voipserver.drbd ha.d]# scp /etc/ha.d/ha.cf /etc/ha.d/authkeys /etc/ha.d/haresources root@voipbackup.drbd:/etc/ha.d/
    [root@voipbackup.drbd ha.d]# service heartbeat start

    28. Configure heartbeat to start at boot on both servers

    chkconfig --add heartbeat
    chkconfig heartbeat on

    29. Verify voipserver.drbd status (Primary/Secondary)

    drbdadm role r0

    30. Execute ‘df -h’ on the primary to confirm that our /dev/drbd0 partition is
    mounted and in use.

    Filesystem Size Used Avail Use% Mounted on
    /dev/sda1 5.7G 1.9G 3.5G 36% /
    tmpfs 249M 0 249M 0% /dev/shm
    /dev/drbd0 7.9G 394M 7.1G 6% /replica
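The same check can be scripted against /proc/mounts instead of reading df output. A minimal sketch; `mounted_from` is a hypothetical helper name.

```bash
# mounted_from DEVICE MOUNTPOINT
# Read /proc/mounts-style text on stdin; succeed if DEVICE is mounted on MOUNTPOINT.
mounted_from() {
    awk -v dev="$1" -v mnt="$2" '
        $1 == dev && $2 == mnt { found = 1 }
        END { exit !found }
    '
}

# Usage, on the node that should be primary:
#   mounted_from /dev/drbd0 /replica < /proc/mounts || echo "/replica is not on DRBD"
```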

    31. Test your work by creating a SIP extension or anything else inside the Elastix web
    interface, then shut down your primary server while running a continuous ping to
    192.168.0.245 (floating IP address) to verify it doesn’t lose connectivity. Make
    another change on the secondary server, turn your primary back on, and all
    changes should be kept intact.

    Special Note: Any changes to Asterisk files should be made via the web interface
    ONLY. Do not attempt to upgrade the Elastix version once the cluster is finished, or else it will
    write its own files again, discarding the links to the /replica directory.
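For the failover test in step 31 it helps to put a number on the outage instead of watching ping scroll by. A sketch: probe the floating IP once per second, print ok or fail for each probe, and let a small filter report the longest run of failures. `longest_outage` is a made-up name, and the result is only a rough figure in seconds since each failed probe also waits for its timeout.

```bash
# longest_outage
# Read lines of "ok" / "fail" on stdin and print the longest run of
# consecutive "fail" lines.
longest_outage() {
    awk '
        $1 == "fail" { run++; if (run > max) max = run; next }
        { run = 0 }
        END { print max + 0 }
    '
}

# Usage (about two minutes of probing; shut the primary down while it runs):
#   for i in $(seq 1 120); do
#       if ping -c 1 -W 1 192.168.0.245 > /dev/null 2>&1; then echo ok; else echo fail; fi
#       sleep 1
#   done | longest_outage
```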
    Troubleshooting:

    tcpdump -i eth0:0 -s 1500 -w captura.pcap #capture traffic
    mv captura.pcap /var/www/html #move file to web for download

    http://wiki.centos.org/HowTos/Ha-Drbd
    http://support.red-fone.com/downloads/elastix/Elastix HA Cluster.pdf
    http://danielaliaman.com/blog/files/phonecube/cluster/AsteriskCluster.pdf
    http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html

    Note: Here is the original tutorial; there are a few other things you can do, such as migrating fop and tftpboot: Elastix HA cluster