WARNING: Before you embark on this, please read this disclaimer:
Although this technically works, GlusterFS needs serious read-speed tuning before the setup is usable. Otherwise the mailbox service “thinks” it failed to start, because startup takes more than 60s and effectively times out. That makes the init.d script return a failed status, which Heartbeat sees, and Heartbeat then hands the resources over to the failover node. Problems abound. If you can get Gluster to perform fast enough that the mailbox service start does not return a failure, please let me know. Until then, I’m going to work on a Round 2 where I only put the redo logs and the ldap folder on GlusterFS. That should accomplish the same thing while keeping the impact of Gluster’s slow reads to a minimum.
Credits go to:
Gaurav Kohli’s Blog Post on setting up GlusterFS with Heartbeat
Philip Lawlor’s Post on setting up Zimbra for High Availability
Overview of Setup
zm1a.hlmn.co – 192.168.2.23
zm1b.hlmn.co – 192.168.2.24
zm1.hlmn.co – 192.168.2.50
Edit Hosts Files
On zm1a:
127.0.0.1 localhost.hlmn.co localhost
127.0.1.1 zm1.hlmn.co zm1a
192.168.2.23 zm1a zm1.hlmn.co
192.168.2.24 zm1b
192.168.2.50 zm1.hlmn.co
On zm1b:
127.0.0.1 zm1.hlmn.co localhost.hlmn.co localhost
192.168.2.23 zm1a
192.168.2.24 zm1b zm1.hlmn.co
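A quick way to confirm that a hosts file contains every name the cluster relies on is a small check like the one below. This is only a sketch: check_hosts is a hypothetical helper name, not part of Heartbeat or Zimbra, and it only looks for the names in the file without verifying the addresses.

```shell
# check_hosts FILE NAME...
# Hypothetical helper: prints "missing: NAME" for every NAME that has no
# entry in FILE and returns 1 if at least one name is missing.
check_hosts() {
    file=$1
    shift
    missing=0
    for name in "$@"; do
        # Strip comments, then look for NAME in any column after the address.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

On either server it could be run as: check_hosts /etc/hosts zm1a zm1b zm1.hlmn.co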
Update the hostname on both servers (zm1a is shown here; use zm1b on the second server):
nano /etc/hostname
zm1a
Setup Heartbeat
- Install heartbeat:
apt-get install heartbeat
- On both servers, add this config:
nano /etc/heartbeat/ha.cf
logfacility local0
logfile /var/log/ha-log
keepalive 2
deadtime 20 # timeout before the other server takes over
bcast eth0
node zm1a
node zm1b
auto_failback on # very important or auto failover won't happen
- edit /etc/heartbeat/haresources for Server1:
zm1a IPaddr::192.168.2.50/24 zimbra
- edit /etc/heartbeat/haresources for Server2:
zm1a IPaddr::192.168.2.50/24 zimbra
- Notice that both point to zm1a. That sets zm1a as the primary. Failure to do that will result in the two nodes trying to take over from each other, which just becomes a huge mess.
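Since both haresources files must name the same primary, it can help to print the node each file actually points to. The following is a minimal sketch; preferred_node is a hypothetical helper name, and it assumes the first non-comment line of the file is the resource line shown above.

```shell
# preferred_node FILE
# Hypothetical helper: prints the first field (the preferred node) of the
# first line in FILE that is neither blank nor a comment.
preferred_node() {
    awk '!/^[ \t]*(#|$)/ { print $1; exit }' "$1"
}
```

Running preferred_node /etc/heartbeat/haresources on each server should print zm1a both times.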
- Create /etc/heartbeat/authkeys on both servers
auth 3
3 md5 yourrandommd5string
Protect the permissions of authkeys file on both servers:
chmod 600 /etc/heartbeat/authkeys
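One possible way to produce the random string and the file in one step is sketched below. write_authkeys is a hypothetical helper name, and the sketch assumes dd, md5sum and /dev/urandom are available, as they normally are on a Debian or Ubuntu system.

```shell
# write_authkeys FILE
# Hypothetical helper: writes an authkeys file in the format shown above,
# using a random 32-character hex string as the md5 key.
write_authkeys() {
    # Hash 512 random bytes and keep the 32 hex characters of the digest.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -c1-32)
    # Create the file with a restrictive umask so it is never world-readable.
    ( umask 077; printf 'auth 3\n3 md5 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}
```

Run write_authkeys /etc/heartbeat/authkeys on one server only, then copy the resulting file to the other server, because both nodes need the identical key.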
Disable Automatic Startup of Zimbra Services
On both machines, run the command below to remove the Zimbra startup links, since Heartbeat will be starting and stopping the service:
update-rc.d -f zimbra remove
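To double-check that no startup links are left behind, the runlevel directories can be listed. The snippet below is a sketch under the assumption that the init script is named zimbra and the links live in the usual /etc/rc?.d directories; rc_links is a hypothetical helper name, and its optional second argument is a root prefix for checking a chroot or a test tree.

```shell
# rc_links NAME [ROOT]
# Hypothetical helper: prints every S??NAME or K??NAME link found under
# ROOT/etc/rc?.d. No output means no startup links remain.
rc_links() {
    root=${2:-}
    for link in "$root"/etc/rc?.d/[SK][0-9][0-9]"$1"; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$link" ] || [ -L "$link" ] || continue
        echo "$link"
    done
}
```

After the update-rc.d command, rc_links zimbra should print nothing on either machine.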
Final Comments:
Again, Heartbeat thinks Zimbra failed to start because the service takes so long to read from GlusterFS. If you can figure out a way to improve that, the above proof of concept should work well.
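When experimenting with Gluster tuning, it helps to measure how long the start actually takes against the 60s limit mentioned in the disclaimer. The wrapper below is a rough sketch: timed_run is a hypothetical helper name, and it measures whole seconds only.

```shell
# timed_run LIMIT COMMAND...
# Hypothetical helper: runs COMMAND, prints how many seconds it took, and
# succeeds only if COMMAND exited with 0 within LIMIT seconds.
timed_run() {
    limit=$1
    shift
    start=$(date +%s)
    status=0
    "$@" || status=$?
    elapsed=$(( $(date +%s) - start ))
    echo "elapsed: ${elapsed}s (exit status $status)"
    [ "$status" -eq 0 ] && [ "$elapsed" -le "$limit" ]
}
```

With Heartbeat stopped, a test run could look like: timed_run 60 /etc/init.d/zimbra start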