Pacemaker and Corosync HA

In this guide we will set up an HA failover solution using Corosync and Pacemaker in an Active/Passive configuration.

Installation and Setup

Prerequisites

  • All nodes must be able to resolve each other, via /etc/hosts or DNS
  • NTP must be installed and configured on all nodes (see the sketch after the hosts file below)
cat /etc/hosts
10.0.1.10   ha1 server01
10.0.1.11   ha2 server02
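
For the NTP prerequisite, a minimal sketch on Debian/Ubuntu; the ntp package name is an assumption here, chrony or another NTP daemon works just as well:

apt-get install ntp      # install the NTP daemon on every node
ntpq -p                  # verify the daemon is synchronising against its peers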

Installation
We will install pacemaker; it should pull in corosync as a dependency. If it does not, install corosync as well.

apt-get install pacemaker

Edit corosync.conf. Note that bindnetaddr is the network address, NOT the node's IP address. The default mcastaddr is fine.

cat /etc/corosync/corosync.conf
interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        bindnetaddr: 10.0.1.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
}
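
If you are unsure of the network address, it can be derived from the interface address; a quick sketch, where ipcalc is assumed to be installed (apt-get install ipcalc):

ip -4 addr show eth0    # shows the node address, e.g. 10.0.1.11/24
ipcalc 10.0.1.11/24     # the "Network:" line is the value to use for bindnetaddr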

We also want corosync to start pacemaker automatically. ver: 0 tells corosync to start pacemaker itself; setting it to 1 means pacemaker has to be started manually.

cat /etc/corosync/corosync.conf
service {
    # Load the Pacemaker Cluster Resource Manager
    ver:       0
    name:      pacemaker
}

Copy/paste the content of corosync.conf, or scp the file to the second node.

scp /etc/corosync/corosync.conf 10.0.1.11:/etc/corosync/corosync.conf

Make sure corosync starts at boot time.

cat /etc/default/corosync
# start corosync at boot [yes|no]
START=yes

Start corosync

/etc/init.d/corosync start

Check the status of the cluster with crm status

Last updated: Fri Jun  9 11:02:55 2017          Last change: Wed Jun  7 14:26:06 2017 by root via cibadmin on server01
Stack: corosync
Current DC: server01 (version 1.1.14-70404b0) - partition with quorum
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ server01 ]


Now on the second node, try to start corosync

/etc/init.d/corosync start

Check the status again. We should now see the second node joining. If this fails, check the firewall settings and the hosts file (the nodes must be able to resolve each other); see the sketch below.
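
A minimal sketch of both checks, assuming iptables is in use and the default corosync ports (mcastport 5405 and mcastport-1):

getent hosts server01 server02                         # both names must resolve on both nodes
iptables -I INPUT -p udp --dport 5404:5405 -j ACCEPT   # allow corosync cluster traffic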

crm_verify -L will report errors about STONITH not being configured. For this simple two-node test setup we disable STONITH and ignore loss of quorum:

crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore
crm_verify -L

Now add a virtual IP to the cluster.

crm configure primitive VIP ocf:heartbeat:IPaddr2 params ip=10.0.1.100 nic=eth0 op monitor interval=10s

We have now added a VIP/floating IP. We can test it with a simple ping (see below); the VIP should keep responding as long as at least one node is online.
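
A quick way to verify which node currently holds the VIP; a sketch assuming the address and interface configured above:

ping -c 3 10.0.1.100                  # the floating IP should answer
crm status                            # shows which node currently runs the VIP resource
ip addr show eth0 | grep 10.0.1.100   # run on each node; only the active node carries the VIP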

Adding Resources: Services

Now we are ready to add a service to our cluster. In this example we use the postfix service (SMTP) that we want to fail over. Postfix must be installed on both nodes.

crm configure primitive HA-postfix lsb:postfix op monitor interval=15s

Check the status.

crm status

As we have not linked the IP to the service yet, postfix could be running on server02 while the IP is on server01. We need to put them both in one resource group.

crm configure group HA-Group VIP HA-postfix

If we check the status again, we can see that the two resources are now running on the same server.

Online: [ server01 server02 ]
 Resource Group: HA-Group
     VIP    (ocf::heartbeat:IPaddr2):   Started server01
     HA-postfix (lsb:postfix):  Started server01

Looks good!
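
At this point it is worth testing the failover itself; a sketch using crm's standby mode, with the node names configured above:

crm node standby server01   # take the active node out of service
crm status                  # HA-Group should now be running on server02
crm node online server01    # bring the node back into the cluster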

If a resource fails for some reason (for example, postfix crashes and cannot be started again), we want it to migrate to the other server.
By default migration-threshold is not defined, i.e. set to INFINITY, which means the resource will never be migrated because of failures.

The configuration below migrates the resource to the other node after 3 failures and expires the fail count after 60 seconds, which allows the resource to move back to this node automatically.

primitive HA-postfix lsb:postfix \
        op monitor interval="15s" \
        meta target-role="Started" migration-threshold="3" failure-timeout="60s"
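
One way to apply these meta attributes without editing the CIB by hand is crm_resource; a sketch, flag names may differ slightly between pacemaker versions (alternatively, crm configure edit HA-postfix opens the definition in an editor):

# allow 3 failures before forcing the resource off this node
crm_resource --resource HA-postfix --meta --set-parameter migration-threshold --parameter-value 3
# forget the failures after 60 seconds so the resource may return
crm_resource --resource HA-postfix --meta --set-parameter failure-timeout --parameter-value 60s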

Now we are DONE!

Some extra commands that might be useful when managing the cluster:

Deleting a resource

crm resource stop HA-XXXX
crm configure delete HA-XXXX

Where HA-XXXX is the name of the resource.

Migrate / Move Resource

crm_resource --resource HA-Group --move --node server02

View configuration

crm configure show

View status and fail counts

crm_mon -1 --fail
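
Related, and often needed after inspecting fail counts: clearing the fail count once the underlying problem is fixed. A sketch, using the resource name from the example above:

crm resource cleanup HA-postfix   # reset the fail count and re-probe the resource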

Configure FC Multipath on Debian (HP EVA)

This is a detailed how-to guide for high availability and performance on Debian/Ubuntu with dual FC HBAs (Brocade) and shared storage on an HP EVA6300. Tested on Debian Linux 5.x and 6.x (64-bit) running on HP ProLiant DL360 and DL380 models, with 8Gb FC Host Bus Adapters from Brocade.

Configure the software we need

# apt-get install multipath-tools-boot multipath-tools firmware-qlogic sysfsutils lsscsi
# reboot

Verifying that the correct Linux kernel module was loaded

root@debian:~# cat /var/log/dmesg | grep Brocade
[ 11.584057] Brocade BFA FC/FCOE SCSI driver - version: 3.0.2.2
[ 11.654052] scsi1 : Brocade FC/FCOE Adapter, hwpath: 0000:0a:00.0 driver: 3.0.2.2
[ 12.011790] scsi4 : Brocade FC/FCOE Adapter, hwpath: 0000:0a:00.1 driver: 3.0.2.2
root@debian:~# cat /var/log/dmesg | grep scsi
[ 11.550599] scsi0 : hpsa
[ 11.558223] scsi 0:0:0:0: RAID HP P420i 3.54 PQ: 0 ANSI: 5
root@debian:~# modinfo bfa
filename: /lib/modules/3.2.0-4-amd64/kernel/drivers/scsi/bfa/bfa.ko
version: 3.0.2.2
author: Brocade Communications Systems, Inc.
description: Brocade Fibre Channel HBA Driver fcpim

Create the /etc/multipath.conf for the HP EVA storage

First we need to find out the correct wwid.
As multipath is not yet correctly configured, the command below may show "undef" for some paths, as in the example below. What we need to identify is the wwid shown in parentheses.

root@debian:~# multipath -ll
fc_storage (3600143801259ba3a0000b00001650000) dm-1 HP,HSV340
size=2.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 1:0:0:1 sdb 8:16 active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `- 1:0:1:1 sdc 8:32 active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `- 4:0:0:1 sdd 8:48 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 4:0:1:1 sde 8:64 active ready running

Mind the wwid (3600…..)

###############################################################################
# Multipath.conf file for HP EVA system
#
# Version 1.02
# Storage node: HP EVA
# Connection: Dual 8GB FC
#
###############################################################################
 
defaults {
    polling_interval    30
    failback            immediate
    no_path_retry       5
    rr_min_io           100
    path_checker        tur
    user_friendly_names yes
}
 
devices {
 
# These are the default settings for P6300 (HP EVA)
 
    device {
        vendor                   "HP"
        product                  "HSV340"
        path_grouping_policy     group_by_prio
    }
}
 
multipaths {
        multipath {
                wwid                    3600143801259ba3a0000b00001650000
                alias                   fc_storage
                path_grouping_policy    failover
                path_selector           "round-robin 0"
        }
 
}
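
After writing the configuration, reload multipathd and check that the alias is picked up; a sketch, assuming the init script shipped with the multipath-tools package installed earlier:

# /etc/init.d/multipath-tools restart
# multipath -ll

multipath -ll should now list the LUN under its fc_storage alias, with the paths grouped as configured.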

The internet is broken?

Yesterday, the 12th of August 2014, the internet grew past 512,000 BGP routes. This was not something new; Cisco warned about this in May 2014:

It wasn’t that long ago (2008) that the table reached 256k routes, triggering action by network administrators to ensure the continued growth of the Internet. Now that the table has passed 500,000 routes, it’s time to start preparing for another significant milestone – the 512k mark.

A nice graph can be found on he.net, also showing that the number of ASNs grew past 48K.

If you accept full internet routes on your network, this might be the time to verify the maximum routing table size on those components. Some equipment might need to be rebooted in order for this change to become active.

More information can be found here (read it!)