Zimbra High Availability Setup with GlusterFS

WARNING: Before you embark on this, please read this disclaimer:

Although this technically works, GlusterFS needs serious read-speed tuning first. Otherwise the mailbox service takes more than 60s to start, effectively times out, and “thinks” it failed. That makes the init.d script return a failed status, which Heartbeat sees and responds to by handing the resources over to the failover node. Problems abound. If you can get Gluster to perform fast enough that the mailbox service start doesn’t return a failure, please let me know. Until then, I’m going to work on a Round 2 where I only put the redo logs and ldap folder on Gluster. This should effectively accomplish the same thing while keeping the impact of Gluster’s slow read performance to a minimum.
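
To see how close the mailbox start comes to that 60s limit, one option is to time the init script by hand before Heartbeat gets involved. This is only a minimal sketch and not part of the original setup: the helper name within_limit is made up for illustration, and the 60-second figure is simply the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper: run a command and report whether it exited cleanly
# within a time limit given in whole seconds.
within_limit() {
    limit=$1; shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    rc=$?
    elapsed=$(( $(date +%s) - start ))
    echo "exit=$rc elapsed=${elapsed}s limit=${limit}s"
    # Succeed only if the command succeeded AND stayed under the limit.
    [ "$rc" -eq 0 ] && [ "$elapsed" -le "$limit" ]
}

# Example (run on a Zimbra node, as root):
#   within_limit 60 /etc/init.d/zimbra start
```

If the helper reports an elapsed time near or above the limit, that is the condition that would make Heartbeat treat the start as failed.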

Credits go to:

Gaurav Kohli’s Blog Post on setting up GlusterFS with Heartbeat

Philip Lawlor’s Post on setting up Zimbra for High Availability

Overview of Setup

zm1a.hlmn.co –

zm1b.hlmn.co –

zm1.hlmn.co –

Edit Hosts Files

On zm1a: localhost.hlmn.co localhost zm1.hlmn.co zm1a zm1a zm1.hlmn.co zm1b zm1.hlmn.co

On zm1b:       zm1.hlmn.co localhost.hlmn.co localhost    zm1a    zm1b zm1.hlmn.co

Update Hostname of both:

nano /etc/hostname



Setup Heartbeat

  1. Install heartbeat:
    apt-get install heartbeat
  2. On both servers, add this config:
    nano /etc/heartbeat/ha.cf
    logfacility local0
    logfile /var/log/ha-log
    keepalive 2
    deadtime 20 # timeout before the other server takes over
    bcast eth0
    node zm1a
    node zm1b 
    auto_failback on # very important or auto failover won't happen
  3. Edit /etc/heartbeat/haresources for Server1:
    zm1a IPaddr:: zimbra
  4. Edit /etc/heartbeat/haresources for Server2:
    zm1a IPaddr:: zimbra
  5. Notice that both point to zm1a. That sets zm1a as the primary. If the two files differ, the nodes will try to take over from each other, which just becomes a huge mess.
  6. Create /etc/heartbeat/authkeys on both servers
    auth 3
    3 md5 yourrandommd5string

    Protect the permissions of authkeys file on both servers:

    chmod 600 /etc/heartbeat/authkeys
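
For the authkeys step, the key string can be generated rather than typed by hand. The following is a rough sketch under a few assumptions: the helper name write_authkeys is hypothetical, and the target path is an argument so you can try it on a scratch file before touching /etc/heartbeat. Both nodes need the same key, so generate it once and copy the file over.

```shell
#!/bin/sh
# Hypothetical helper: write a Heartbeat authkeys file with a random key.
write_authkeys() {
    target=$1
    # 16 random bytes rendered as 32 hex characters
    key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    (
        umask 077   # a newly created file gets no group/other access
        printf 'auth 3\n3 md5 %s\n' "$key" > "$target"
    )
    # Also tighten permissions in case the file already existed.
    chmod 600 "$target"
}

# Example (as root on zm1a, then copy the result to zm1b):
#   write_authkeys /etc/heartbeat/authkeys
```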

Disable Upstart for Zimbra Services

On both machines, run the command below to remove Zimbra from the boot-time startup scripts, since Heartbeat will be starting and stopping it:

# update-rc.d -f zimbra remove

Final Comments:

Again, Heartbeat thinks Zimbra failed to start because the service takes so long to read from GlusterFS. If you can figure out a way to improve that, the above proof of concept should work well.




Notes on Installing GlusterFS on Ubuntu

Overview of Setup

Primary Gluster Server

Hostname: gf1.hlmn.co

IP Address:

OS: Ubuntu 14.04

Memory: 1GB

Secondary Gluster Server

Hostname: gf2.hlmn.co

IP Address:

OS: Ubuntu 14.04

Memory: 1GB


Prepare the Virtual Machines

  1. Create a new clean, base Ubuntu 14.04 install
  2. Name it gf1 and set up the hosts file and hostname file to match, along with the domain information.
  3. Add a raw VirtIO disk to be used by Gluster as the brick. We’ll call this gf1_brick1.img
  4. Repeat for the second machine, naming it gf2.
  5. Once they’re set up, make sure they’re both updated:
    sudo apt-get update && sudo apt-get upgrade

Install Gluster on Both Nodes

  1. Install python-software-properties:
    $ sudo apt-get install python-software-properties
  2. Add the PPA:
    $ sudo add-apt-repository ppa:semiosis/ubuntu-glusterfs-3.5
    $ sudo apt-get update
  3. Then install Gluster packages:
    $ sudo apt-get install glusterfs-server
  4. Add both hosts to your DNS server (or to each node’s /etc/hosts) so that they can see each other by hostname
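
Before moving on, it can be worth confirming that each node really does resolve the other by name. A small sketch, assuming getent is available (it is on a stock Ubuntu install); the helper name check_resolves is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: report whether each given hostname resolves on this machine.
check_resolves() {
    rc=0
    for host in "$@"; do
        if getent hosts "$host" >/dev/null; then
            echo "ok: $host"
        else
            echo "FAIL: $host does not resolve"
            rc=1
        fi
    done
    # Non-zero if any hostname failed to resolve.
    return "$rc"
}

# Example (run on both nodes):
#   check_resolves gf1.hlmn.co gf2.hlmn.co
```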

Configure GlusterFS

We’ll set up gf1 as the primary server. Many Gluster commands only need to be run on one node and take effect on every server in the pool.

  1. Drop into root user
  2. Configure the Trusted Pool on gf1:
    gluster peer probe gf2.hlmn.co
  3. Check to make sure it works by typing this on gf2 as root user:
    # gluster peer status

    The output should be:

    Number of Peers: 1
    Uuid: 8aadbadf-8498-4674-8b42-a561d63b2e3d
    State: Peer in Cluster (Connected)
  4. It’s time to set up the disks to be used as bricks. If you’re using KVM and you set up the second disk as a raw VirtIO device, it should be listed as /dev/vd[a-z]. Mine is vdb.
  5. We can double check to make sure it’s the right disk by issuing:
    # fdisk -l /dev/vdb

    And we should get something like this:

    Disk /dev/vdb: 21.0 GB, 20971520000 bytes
    16 heads, 63 sectors/track, 40634 cylinders, total 40960000 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    Disk /dev/vdb doesn't contain a valid partition table
  6. Once we ID the disk, issue:
    # fdisk /dev/vdb
    Command (m for help): n
    Partition type:
       p   primary (0 primary, 0 extended, 4 free)
       e   extended
    Select (default p): p
    Partition number (1-4, default 1): 1
    First sector (2048-40959999, default 2048): 
    Using default value 2048
    Last sector, +sectors or +size{K,M,G} (2048-40959999, default 40959999): 
    Using default value 40959999
    Command (m for help): w
    The partition table has been altered!
    Calling ioctl() to re-read partition table.
    Syncing disks
  7. Install xfs:
    apt-get install xfsprogs
  8. Format the partition:
     mkfs.xfs -i size=512 /dev/vdb1
  9. Mount the partition as a Gluster Brick:
    mkdir -p /export/vdb1 && mount /dev/vdb1 /export/vdb1 && mkdir -p /export/vdb1/brick
  10. Add entry into fstab:
     echo "/dev/vdb1 /export/vdb1 xfs defaults 0 0"  >> /etc/fstab
  11. Repeat Steps 4-10 on gf2.
  12. Now it’s time to set up a replicated volume. On gf1:
    gluster volume create gv0 replica 2 gf1.hlmn.co:/export/vdb1/brick gf2.hlmn.co:/export/vdb1/brick

    An explanation of the above, from Gluster documentation:

    Breaking this down into pieces, the first part says to create a gluster volume named gv0 (the name is arbitrary, gv0 was chosen simply because it’s less typing than gluster_volume_0). Next, we tell it to make the volume a replica volume, and to keep a copy of the data on at least 2 bricks at any given time. Since we only have two bricks total, this means each server will house a copy of the data. Lastly, we specify which nodes to use, and which bricks on those nodes. The order here is important when you have more bricks…it is possible (as of the most current release as of this writing, Gluster 3.3) to specify the bricks in a such a way that you would make both copies of the data reside on a single node. This would make for an embarrassing explanation to your boss when your bulletproof, completely redundant, always on super cluster comes to a grinding halt when a single point of failure occurs.
  13. The above should output:
    volume create: gv0: success: please start the volume to access data
  14. Now, to make sure everything is set up correctly, run this on both gf1 and gf2; the output should be the same on both servers:
    gluster volume info

    Expected Output:

    Volume Name: gv0
    Type: Replicate
    Volume ID: 064499be-56db-4e66-84c7-2b6712b10fa6
    Status: Created
    Number of Bricks: 1 x 2 = 2
    Transport-type: tcp
    Brick1: gf1.hlmn.co:/export/vdb1/brick
    Brick2: gf2.hlmn.co:/export/vdb1/brick
  15. The Status above shows “Created”, which means the volume hasn’t been started yet. Trying to mount the volume at this point would fail, so we have to start it first by issuing this on gf1:
    gluster volume start gv0

    You should see this:

    volume start: gv0: success
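
If you want to script the check from steps 14 and 15 instead of reading the output by eye, the Status line can be pulled out of the volume info text. This is just a sketch that assumes the output format shown above; the helper name volume_status is hypothetical. After a successful start the field should read Started rather than Created.

```shell
#!/bin/sh
# Hypothetical helper: print the Status field from "gluster volume info"
# output supplied on stdin.
volume_status() {
    awk -F': ' '$1 == "Status" { print $2; exit }'
}

# Example (on gf1 or gf2, as root):
#   gluster volume info gv0 | volume_status
```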

Mount Your Gluster Volume on the Host Machine

Now that you have your Gluster volume set up, you can access it using the glusterfs-client on another host.

Source: GlusterHacker

  1. Install the GlusterFS client on a remote host:
    apt-get install glusterfs-client
  2. Create a config location for gluster:
    mkdir /etc/glusterfs
  3. Create a volume config file:
    nano /etc/glusterfs/gfvolume1.vol
  4. Fill in the following:
    volume gv0-client-0
     type protocol/client
     option transport-type tcp
     option remote-subvolume /export/vdb1/brick
     option remote-host gf1.hlmn.co
    end-volume
    volume gv0-client-1
     type protocol/client
     option transport-type tcp
     option remote-subvolume /export/vdb1/brick
     option remote-host gf2.hlmn.co
    end-volume
    volume gv0-replicate
     type cluster/replicate
     subvolumes gv0-client-0 gv0-client-1
    end-volume
    volume writebehind
     type performance/write-behind
     option window-size 1MB
     subvolumes gv0-replicate
    end-volume
    volume cache
     type performance/io-cache
     option cache-size 512MB
     subvolumes writebehind
    end-volume

    Gluster reads the above starting at the bottom of the file and working its way up. So it first creates the cache volume, then adds a layer for writebehind and replication and finally the remote volumes.

  5. Mount it through fstab by adding the following line to /etc/fstab (nano /etc/fstab):
    /etc/glusterfs/gfvolume1.vol /mnt/gfvolume1 glusterfs rw,allow_other,default_permissions,_netdev 0 0

    Because the volume file lists both bricks, the client can keep using the other one if one goes down.
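
Editing fstab by hand (or with echo >>) makes it easy to end up with the same entry twice. A minimal sketch of an append that checks first; the helper name add_fstab_entry is hypothetical, and it takes the file path as an argument so it can be tested against a copy before being pointed at the real /etc/fstab:

```shell
#!/bin/sh
# Hypothetical helper: append a line to an fstab-style file only if its
# mount point (second field) is not already listed on a non-comment line.
add_fstab_entry() {
    file=$1; entry=$2
    mountpoint=$(printf '%s\n' "$entry" | awk '{ print $2 }')
    if [ -f "$file" ] && awk -v mp="$mountpoint" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$file"; then
        echo "already present: $mountpoint"
    else
        printf '%s\n' "$entry" >> "$file"
        echo "added: $mountpoint"
    fi
}

# Example (as root on the client):
#   mkdir -p /mnt/gfvolume1
#   add_fstab_entry /etc/fstab "/etc/glusterfs/gfvolume1.vol /mnt/gfvolume1 glusterfs rw,allow_other,default_permissions,_netdev 0 0"
#   mount /mnt/gfvolume1
```

Note that the mount point directory has to exist before the mount will succeed, which is why the example creates it first.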

That’s pretty much all it takes to at least get it working.

Performance, on the other hand, needs a lot more investigation: I’m getting 50 MB/s writes on Gluster where the host can do 250 MB/s. Small-file performance is also abysmal.
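
For anyone who wants to compare numbers on their own setup, here is one rough way to measure both cases. It is only a sketch, not the method behind the figures above: the helper names are hypothetical, the 4 KiB file size is an arbitrary choice, and writing zeros can flatter storage that compresses data. Run each helper once against the Gluster mount and once against a local directory on the host.

```shell
#!/bin/sh
# Hypothetical benchmark helpers; the target directory is passed by the caller.

# Sequential write: write <mb> MiB of zeros and flush to disk before dd exits.
# Prints dd's final summary line, which includes the throughput.
seq_write_test() {
    dir=$1; mb=$2
    dd if=/dev/zero of="$dir/ddtest.bin" bs=1M count="$mb" conv=fsync 2>&1 | tail -n 1
    rm -f "$dir/ddtest.bin"
}

# Small files: create <n> files of 4 KiB each and report elapsed whole seconds.
small_file_test() {
    dir=$1; n=$2
    mkdir -p "$dir/smallfiles"
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        head -c 4096 /dev/zero > "$dir/smallfiles/f$i"
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$n files in $((end - start))s"
    rm -rf "$dir/smallfiles"
}

# Example:
#   seq_write_test /mnt/gfvolume1 1024
#   small_file_test /mnt/gfvolume1 5000
```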