14.2. Using Linux HA Heartbeat

The Heartbeat program provides a basis for verifying the availability of resources on one or more systems within a cluster. In this context, a resource includes MySQL, the filesystems on which the MySQL data is stored and, if you are using DRBD, the DRBD device used for the filesystem. Heartbeat also manages a virtual IP address, which should be used for all communication with the MySQL instance.

A cluster within the context of Heartbeat is defined as two computers notionally providing the same service. By definition, each computer in the cluster is physically capable of providing the same services as all the others in the cluster. However, because the cluster is designed for high availability, only one of the servers is actively providing the service at any one time. Each additional server within the cluster is a 'hot spare' that can be brought into service in the event of a failure of the master, its network connectivity, or the connectivity of the network in general.

The basics of Heartbeat are very simple. Within the Heartbeat cluster (see Figure 14.3, “DRBD Architecture”), each machine sends a 'heartbeat' signal to the other hosts in the cluster. The other cluster nodes monitor this heartbeat. The heartbeat can be transmitted over many different systems, including shared network devices, dedicated network interfaces and serial connections. Failure to get a heartbeat from a node is treated as failure of the node. Although we don't know the reason for the failure (it could be an OS failure, a hardware failure in the server, or a failure in the network switch), it is safe to assume that if no heartbeat is produced there is a fault.

Figure 14.3. DRBD Architecture

Heartbeat Architecture

In addition to checking the heartbeat from the server, the system can also check the connectivity (using ping) to another host on the network, such as the network router. This allows Heartbeat to detect a failure of communication between a server and the router (and therefore failure of the server, since it is no longer capable of providing the necessary service), even if the heartbeat between the servers in the cluster is working fine.

In the event of a failure, the resources on the failed host are disabled, and the resources on one of the replacement hosts are enabled instead. In addition, the virtual IP address for the cluster is redirected to the new host in place of the failed host.

When used with MySQL and DRBD, the MySQL data is replicated from the master to the slave using the DRBD device, but MySQL runs only on the master. When the master fails, the slave switches the DRBD devices to primary, mounts the filesystems on those devices, and starts MySQL. The original master (if still available) has its resources disabled, which means shutting down MySQL, unmounting the filesystems, and switching the DRBD device to secondary.

14.2.1. Heartbeat Configuration

Heartbeat configuration requires three files located in /etc/ha.d. The ha.cf file contains the main Heartbeat configuration, including the list of nodes and the timings used to identify failures. The haresources file contains the list of resources to be managed within the cluster. The authkeys file contains the security information for the cluster.

The contents of these files should be identical on each host within the Heartbeat cluster. It's important that you keep these files in sync across all the hosts. Any change made on one host should be copied to all the others.
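One way to confirm that the three files match is to compute a single checksum over them on each host and compare the results. The following is a minimal sketch and is not part of Heartbeat: the function name ha_config_digest is hypothetical, and it assumes the standard cksum utility is available.

```shell
# ha_config_digest [DIR]: print one checksum covering the three Heartbeat
# configuration files in DIR (default /etc/ha.d).
# Hypothetical helper; it is not shipped with Heartbeat.
ha_config_digest() {
    dir=${1:-/etc/ha.d}
    # Refuse to report a digest if any of the three files is unreadable.
    for f in ha.cf haresources authkeys; do
        if [ ! -r "$dir/$f" ]; then
            echo "cannot read $dir/$f" >&2
            return 1
        fi
    done
    # Concatenate in a fixed order so the result is comparable across hosts.
    cat "$dir/ha.cf" "$dir/haresources" "$dir/authkeys" | cksum
}
```

Run ha_config_digest on each node (reading authkeys normally requires root) and compare the printed values. Any difference means the files have drifted apart and should be copied again.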

An example of the ha.cf file is shown below:

logfacility local0
keepalive 500ms
deadtime 10
warntime 5
initdead 30
mcast bond0 225.0.0.1 694 2 0
mcast bond1 225.0.0.2 694 1 0
auto_failback off
node drbd1
node drbd2

The individual lines in the file can be identified as follows:

  • logfacility — sets the logging facility. In this case, logging is sent to syslog.

  • keepalive — defines how frequently the heartbeat signal is sent to the other hosts.

  • deadtime — the delay in seconds before other hosts in the cluster are considered 'dead' (failed).

  • warntime — the delay in seconds before a warning is written to the log that a node cannot be contacted.

  • initdead — the period in seconds to wait during system startup before the other host is considered to be down.

  • mcast — defines a method for sending a heartbeat signal. In the above example, a multicast network address is being used over a bonded network device. If you have multiple clusters then the multicast address for each cluster should be unique on your network. Other choices for the heartbeat exchange exist, including a serial connection.

    If you are using multiple network interfaces (for example, one interface for your server connectivity and a secondary and/or bonded interface for your DRBD data exchange) then you should use both interfaces for your heartbeat connection. This decreases the chance of a transient failure causing an invalid failure event.

  • auto_failback — sets whether the original (preferred) server should be enabled again if it becomes available. Switching this to on may cause problems if the preferred server went offline and then comes back online. If the DRBD device has not been synced properly, or if the problem with the original server recurs, you may end up with two different datasets on the two servers, or with a continually changing environment where the two servers flip-flop as the preferred server reboots and then starts again.

  • node — sets the nodes within the Heartbeat cluster group. There should be one node for each server.

An optional additional set of directives configures a ping test that checks connectivity to another host. Use this to ensure that you have connectivity on the public interface of your servers; the ping target should therefore be a reliable host such as a router or switch. The additional lines specify the destination machine for the ping (given as an IP address rather than a hostname), the command to run when a failure occurs, the authority for the failure, and the timeout before a non-response triggers a failure. A sample configuration is shown below:

ping 10.0.0.1
respawn hacluster /usr/lib64/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
deadping 5

In the above example, the ipfail command, which is part of the Heartbeat solution, is called on a failure and 'fakes' a fault on the currently active server. You need to configure the user and group ID under which the command should be executed (using the apiauth directive). The failure will be triggered after 5 seconds.

Note

The deadping value must be less than the deadtime value.
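This constraint can be checked mechanically before Heartbeat is restarted. The following is a minimal sketch: the function name check_deadping is hypothetical, and it assumes that both values are written as whole seconds, as in the examples above (a value such as 500ms is rejected, not converted).

```shell
# check_deadping FILE: succeed only if deadping is less than deadtime.
# Hypothetical helper. Assumes both values are whole seconds.
check_deadping() {
    conf=$1
    # Take the last occurrence of each directive in the file.
    deadtime=$(awk '$1 == "deadtime" { v = $2 } END { print v }' "$conf")
    deadping=$(awk '$1 == "deadping" { v = $2 } END { print v }' "$conf")
    for value in "$deadtime" "$deadping"; do
        case $value in
            ''|*[!0-9]*)
                echo "deadtime and deadping must both be set to whole seconds" >&2
                return 2
                ;;
        esac
    done
    if [ "$deadping" -lt "$deadtime" ]; then
        echo "ok: deadping ($deadping) < deadtime ($deadtime)"
    else
        echo "error: deadping ($deadping) must be less than deadtime ($deadtime)" >&2
        return 1
    fi
}
```

With the sample values used in this section (deadtime 10 and deadping 5), check_deadping /etc/ha.d/ha.cf reports that the setting is acceptable.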

The authkeys file holds the authorization information for the Heartbeat cluster. The authorization relies on a single unique 'key' that is used to verify the two machines in the Heartbeat cluster. The key is used only to confirm that the two machines belong to the same cluster.
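As an illustration, an authkeys file commonly consists of an auth line that selects a key number, followed by a line giving that key number, a signing method such as sha1, and the shared secret. Heartbeat also expects the file to be readable only by its owner (mode 600). The sketch below generates such a file. The function name write_authkeys is hypothetical, and the format is stated from general Heartbeat usage, not from the text above, so check it against the documentation for your Heartbeat version.

```shell
# write_authkeys FILE: create an authkeys file with a random sha1 key.
# Hypothetical helper; the "auth N" / "N method key" layout is assumed
# from general Heartbeat usage.
write_authkeys() {
    out=$1
    # 20 random bytes rendered as 40 hexadecimal characters.
    key=$(dd if=/dev/urandom bs=20 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    (
        # Restrictive umask so the secret is never world-readable.
        umask 077
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$out"
    )
    chmod 600 "$out"
}
```

Generate the file once, for example with write_authkeys /etc/ha.d/authkeys on the primary, and then copy that same file to the other node so that both hosts share the key.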

14.2.2. Using Heartbeat with MySQL and DRBD

To use Heartbeat in combination with MySQL you should be using DRBD (see Section 14.1, “Using MySQL with DRBD for High Availability”) or another solution that allows for sharing of the MySQL database files in the event of a system failure. In these examples, DRBD is used as the data sharing solution.

Heartbeat uses a resource configuration to manage the switch between two servers in the event of a failure. The resource configuration defines the individual services that should be brought up (or taken down) when a failure occurs.

The haresources file within /etc/ha.d defines the resources that should be managed, and each resource named in this file corresponds to a script located within /etc/ha.d/resource.d. The resource definition is written on a single line:

drbd1 drbddisk Filesystem::/dev/drbd0::/drbd::ext3 mysql 10.0.0.100

The line is split by whitespace. The first entry (drbd1) is the name of the preferred host, that is, the server that is normally responsible for handling the service. The last field is the virtual IP address or name that should be used to share the service. This is the IP address that should be used to connect to the MySQL server. It is automatically allocated to the server that is active when Heartbeat starts.

The remaining fields between these two fields define the resources that should be managed. Each field should contain the name of the resource (and each name should refer to a script within /etc/ha.d/resource.d). In the event of a failure, these resources are started on the backup server by calling the corresponding script (with a single argument, start), in order from left to right. If there are additional arguments to the script, you can use a double colon to separate each additional argument.

In the above example, we manage the following resources:

  • drbddisk — the DRBD resource script. This switches the DRBD disk on the secondary host into primary mode, making the device read/write.

  • Filesystem — manages the Filesystem resource. In this case we have supplied additional arguments to specify the DRBD device, mount point and filesystem type. When executed this should mount the specified filesystem.

  • mysql — manages the MySQL instances and starts the MySQL server. You should copy the mysql.resource file from the support-files directory of any MySQL release into the /etc/ha.d/resource.d directory.
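To make the structure of a haresources line concrete, the sketch below splits a line into its fields and prints the script invocations that the description above implies. It is only an illustration: the function name describe_haresources is hypothetical, and it assumes that double-colon arguments are passed to the script ahead of the start action.

```shell
# describe_haresources "LINE": explain one haresources line.
# Hypothetical helper for illustration; it does not start anything.
describe_haresources() {
    # Split the line on whitespace into positional parameters.
    set -- $1
    if [ "$#" -lt 2 ]; then
        echo "expected at least a node name and a virtual IP" >&2
        return 1
    fi
    echo "preferred node: $1"
    shift
    # Every field except the last names a script in /etc/ha.d/resource.d.
    while [ "$#" -gt 1 ]; do
        name=${1%%::*}
        case $1 in
            *::*) args=$(printf '%s' "${1#*::}" | sed 's/::/ /g') ;;
            *)    args= ;;
        esac
        echo "start: /etc/ha.d/resource.d/$name ${args:+$args }start"
        shift
    done
    echo "virtual IP: $1"
}
```

Passing the example line from this section to describe_haresources lists drbddisk, Filesystem (with its device, mount point and filesystem type) and mysql in that order, followed by the virtual IP address 10.0.0.100.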

If you want to be notified of the failure by email, you can add another line to the haresources file with the address for warnings and the warning text:

MailTo::youremail@address.com::DRBDFailure

With the Heartbeat configuration in place, copy the haresources, authkeys and ha.cf files from your primary server to your secondary server to make sure that the configuration is identical. Then start the Heartbeat service, either by calling /etc/init.d/heartbeat start or by rebooting both primary and secondary servers.

You can test the configuration by running a manual failover. Connect to the primary node and run:

# /usr/lib64/heartbeat/hb_standby

This will cause the current node to relinquish its resources cleanly to the other node.
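After the failover, MySQL should answer again on the virtual IP address, this time from the other node. A small polling helper can show how long the switch took. This is a sketch under stated assumptions: the function name wait_for_service is hypothetical, and the probe command is supplied by the caller.

```shell
# wait_for_service TIMEOUT COMMAND [ARGS...]: retry COMMAND once per second
# until it succeeds or TIMEOUT seconds have passed.
# Hypothetical helper; the probe command is whatever the caller supplies.
wait_for_service() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "no answer after ${elapsed}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "service answered after ${elapsed}s"
}
```

For example, from a client machine, wait_for_service 60 mysqladmin -h 10.0.0.100 ping polls the virtual IP address used in the haresources example until the MySQL server responds.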

14.2.3. Using Heartbeat with DRBD and dopd

As a further extension to using DRBD and Heartbeat together, you can enable dopd. The dopd daemon handles the situation where a DRBD node is out of date compared to the master and prevents the slave from being promoted to master in the event of a failure. This avoids a situation where two machines have both been masters and end up with different data on the underlying device.

For example, imagine that you have a two server DRBD setup, master and slave. If the DRBD connectivity between master and slave fails then the slave would be out of sync with the master. If Heartbeat identifies a connectivity issue for master and then switches over to the slave, the slave DRBD device will be promoted to the primary device, even though the data on the slave and the master is not in synchronization.

In this situation, with dopd enabled, the connectivity failure between the master and slave would be identified and the metadata on the slave would be set to Outdated. Heartbeat will then refuse to switch over to the slave even if the master failed. In a dual-host solution this would effectively render the cluster out of action, as there is no additional failover server. In an HA cluster with three or more servers, control would be passed to the slave that has an up-to-date version of the DRBD device data.

To enable dopd, you need to modify the Heartbeat configuration and specify dopd as part of the commands executed during the monitoring process. Add the following lines to your ha.cf file:

respawn hacluster /usr/lib/heartbeat/dopd  
apiauth dopd gid=haclient uid=hacluster

Make sure you make the same modification on both your primary and secondary nodes.

You will need to reload the Heartbeat configuration:

# /etc/init.d/heartbeat reload

You will also need to modify your DRBD configuration by configuring the outdate-peer option. Add the configuration line to the common section of /etc/drbd.conf on both hosts. An example of the full block is shown below:

common {  
  handlers {  
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";  
  }  
}

Finally, set the fencing option on your DRBD configured resources:

resource my-resource {  
  disk {  
    fencing    resource-only;  
  }
}

Now reload your DRBD configuration:

# drbdadm adjust all

You can test the system by unplugging your DRBD link and monitoring the output from /proc/drbd.
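When watching /proc/drbd during this test, the fields of most interest are the connection state and the disk states. The sketch below extracts them for each device. It assumes the cs: and ds: field names printed by DRBD 8.x; the function name drbd_states is hypothetical, and other DRBD versions may format the file differently.

```shell
# drbd_states [FILE]: print "device: connection-state disk-states" for each
# DRBD device listed in FILE (default /proc/drbd).
# Hypothetical helper; assumes the cs: and ds: fields of DRBD 8.x.
drbd_states() {
    awk '/^ *[0-9]+:/ {
        cs = ""; ds = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^cs:/) cs = substr($i, 4)
            if ($i ~ /^ds:/) ds = substr($i, 4)
        }
        print $1, cs, ds
    }' "${1:-/proc/drbd}"
}
```

With dopd working, you would expect the connection state to leave Connected once the link is unplugged and the disk state of the disconnected peer to be reported as Outdated.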

14.2.4. Dealing with System Level Errors

Because a kernel panic or oops may indicate a potential problem with your server, you should configure your server to remove itself from the cluster in the event of a problem. Typically on a kernel panic your system will automatically trigger a hard reboot. For a kernel oops a reboot may not happen automatically, but the issue that caused the oops may still lead to potential problems.

You can force a reboot by setting the kernel.panic and kernel.panic_on_oops parameters in the kernel control file /etc/sysctl.conf. For example:

 kernel.panic_on_oops = 1
 kernel.panic = 1
  

You can also set these parameters during runtime by using the sysctl command. You can either specify the parameters on the command line:

$ sysctl -w kernel.panic=1
  

Or you can edit your sysctl.conf file and then reload the configuration information:

$ sysctl -p
 

Setting kernel.panic_on_oops to 1 makes a kernel oops trigger a panic, and setting kernel.panic to a positive value (the number of seconds to wait before rebooting) makes the system reboot after a panic. Your second Heartbeat node should then detect that the server is down and take over as the failover host.
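Because both settings are needed on every node, it can be useful to verify the file before relying on it. The following is a minimal sketch: the function name check_panic_settings is hypothetical, and it inspects only the configuration file, not the values currently active in the running kernel.

```shell
# check_panic_settings [FILE]: succeed only if FILE (default
# /etc/sysctl.conf) sets kernel.panic and kernel.panic_on_oops to
# positive integers. Hypothetical helper; it reads the file only.
check_panic_settings() {
    conf=${1:-/etc/sysctl.conf}
    for key in kernel.panic kernel.panic_on_oops; do
        # Strip blanks around the name and value; keep the last match.
        value=$(awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k { v = $2; gsub(/[ \t]/, "", v) }
            END { print v }' "$conf")
        case $value in
            ''|*[!0-9]*)
                echo "$key is not set to a number in $conf" >&2
                return 1
                ;;
        esac
        if [ "$value" -le 0 ]; then
            echo "$key must be greater than 0 in $conf" >&2
            return 1
        fi
    done
    echo "ok"
}
```

Run check_panic_settings on each node after editing sysctl.conf, then apply the file with sysctl -p as described above.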