High Availability Concepts and Theory

High-availability clusters (also known as HA clusters or failover clusters) are groups of computers that support server applications which can be reliably utilised with a minimum of downtime.

Availability theory basics:

Availability = Uptime / (Uptime + Downtime)
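As a quick worked example, the Python sketch below computes availability from measured uptime and downtime; the figures are purely illustrative:

```python
# Availability = Uptime / (Uptime + Downtime)
# Illustrative figures: roughly one year of operation with 8 hours of downtime.
uptime_hours = 8752.0
downtime_hours = 8.0

availability = uptime_hours / (uptime_hours + downtime_hours)
print(f"Availability: {availability:.4%}")  # Availability: 99.9087%
```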

Clustering

Generally, cluster computing consists of three distinct branches:

Compute clustering uses multiple machines to provide greater computing power for computationally intensive tasks. This type of clustering is not addressed by Red Hat Enterprise Linux.

High-availability (HA) clustering uses multiple machines to add an extra level of reliability for a service or group of services.

Load-balance clustering uses specialised routing techniques to dispatch traffic to a pool of servers.

This article aims to address the latter two types of clustering technology.

Active/Passive Cluster vs Active/Active Cluster

An active-active cluster is typically made up of at least two nodes, both actively running the same kind of service simultaneously. The main purpose of an active-active cluster is to achieve load balancing. Load balancing distributes workloads across all nodes in order to prevent any single node from getting overloaded.

An active-passive cluster also consists of at least two nodes. However, as the name active-passive implies, not all nodes are going to be active. In the case of two nodes, if the first node is already active, the second node must be passive or on standby.

The passive (or failover) server serves as a backup that’s ready to take over as soon as the active (or primary) server gets disconnected or is unable to serve.
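A minimal sketch of that failover logic, assuming the standby node checks the active node with a simple TCP health check; promote() is a hypothetical stand-in for the real takeover steps (claiming a virtual IP, starting services, and so on):

```python
import socket
import time

CHECK_INTERVAL = 5  # seconds between health checks

def is_healthy(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Consider the node healthy if it accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def promote(node: str) -> None:
    """Hypothetical takeover action: claim the virtual IP, start services."""
    print(f"{node}: promoting to active")

def monitor(active: str, passive: str) -> None:
    """The passive node polls the active one and takes over if it fails."""
    while True:
        if not is_healthy(active):
            promote(passive)
            active, passive = passive, active
        time.sleep(CHECK_INTERVAL)
```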

Failover Cluster vs Load Balanced Cluster

A failover cluster is a group of servers that work together to maintain high availability of applications and services. If one of the servers, or nodes, fails, another node in the cluster can take over its workload with little or no downtime (this process is known as failover).

In computing, load balancing distributes workloads across multiple computing resources, such as computers, a computer cluster, network links, central processing units or disk drives. Load balancing aims to optimise resource use, maximise throughput, minimise response time, and avoid overloading any single resource.
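As an illustration, a simple round-robin dispatcher spreads incoming requests evenly across a pool; the server addresses here are hypothetical:

```python
import itertools

# Round robin: each request goes to the next server in the pool,
# so no single server receives a disproportionate share of the load.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # illustrative pool
pool = itertools.cycle(servers)

def dispatch(request_id: int) -> str:
    server = next(pool)
    print(f"request {request_id} -> {server}")
    return server

for i in range(6):
    dispatch(i)
# Requests 0..5 are sent to .1, .2, .3, .1, .2, .3 in turn.
```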

Shared-Nothing Cluster vs Shared-Disk Cluster

In a shared-nothing cluster environment, each system (node) has its own private (not shared) memory and one or more disks.

Shared-nothing clustering offers excellent scalability. In theory, a shared-nothing multiprocessor can scale up to thousands of processors because no resources are shared, so the processors do not interfere with one another.
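A minimal sketch of the data partitioning that makes this possible, using simple hash partitioning (real systems typically use consistent hashing so that adding or removing a node moves less data):

```python
import hashlib

# Hash partitioning: each key maps deterministically to exactly one
# node, so nodes never share data and can work independently.
NODES = ["node-a", "node-b", "node-c"]  # illustrative cluster

def owner(key: str) -> str:
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

for key in ("customer:42", "customer:43", "order:7"):
    print(key, "->", owner(key))
```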

In a shared-disk cluster environment, all of the connected systems (nodes) share the same disk devices. Each processor still has its own private memory, but all the processors can directly address all the disks.

With appropriate optimisation techniques, shared-disk clustering is also well suited to large-scale processing.

Shared-Disk                               | Shared-Nothing
------------------------------------------|----------------------------------------------------
Quick adaptability to changing workloads  | Can exploit simpler, cheaper hardware
High availability                         | Almost unlimited scalability
Performs best in a heavy read environment | Works well in a high-volume, read/write environment
Data need not be partitioned              | Data is partitioned across the cluster

Cluster Services and Resources

A cluster service is a service that the cluster keeps highly available, such as databases, messaging, and file and print services.

A resource is anything the cluster manages in order to make a service highly available, such as an IP address, a file system, or an application process.

Quorum

As per Wikipedia, a quorum is the minimum number of members of a deliberative assembly necessary to conduct the business of that group. In other words, quorum is the minimum number of votes required to form a majority.

The quorum configuration in a failover cluster determines the number of node failures that the cluster can sustain while remaining online.

Quorum is designed to handle the split-brain scenario (described below).
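A minimal sketch of majority-based quorum voting, assuming one vote per node:

```python
# Majority quorum: the cluster stays online only while more than half
# of the total votes are present. Two network partitions can never
# both hold a majority, which is what prevents split brain.
TOTAL_VOTES = 5  # e.g. five nodes, one vote each

def has_quorum(votes_present: int) -> bool:
    return votes_present > TOTAL_VOTES // 2

for alive in range(TOTAL_VOTES + 1):
    print(f"{alive}/{TOTAL_VOTES} votes -> "
          f"{'online' if has_quorum(alive) else 'offline'}")
# A 5-vote cluster survives the loss of 2 nodes; with only 2 votes
# left it must stop, because the other 3 could form a rival majority.
```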

Fencing

Fencing is the process of isolating a node of a computer cluster, or protecting shared resources, when a node appears to be malfunctioning.

One common fencing approach is the STONITH method, which stands for “Shoot The Other Node In The Head”: the suspected node is disabled or powered off.
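A minimal sketch of that idea; power_off() here is a hypothetical stand-in for a real fence agent (for example, an IPMI or PDU command):

```python
# STONITH sketch: before taking over resources from a node that has
# stopped responding, the cluster forcibly powers it off so it cannot
# keep writing to shared storage.
def power_off(node: str) -> None:
    """Hypothetical fence agent call via the node's management interface."""
    print(f"fencing: powering off {node}")

def fence_and_take_over(suspect: str, survivor: str) -> None:
    power_off(suspect)  # guarantee the suspect can no longer touch shared resources
    print(f"{survivor}: safe to mount shared storage and start services")

fence_and_take_over("node-b", "node-a")
```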

Split brain

HA clusters usually use a private heartbeat network connection to monitor the health and status of each node in the cluster. Split-brain syndrome may occur when all of the private links go down simultaneously while the cluster nodes are still running: each node then believes it is the only one left and continues to serve clients, applying its own updates to its copy of the data without any coordination with the other nodes, so the data sets diverge.

Heartbeat

In computer clusters, a heartbeat network is a private network shared only by the cluster nodes and not accessible from outside the cluster. The cluster nodes use it to monitor each node’s status and to communicate with each other.

The heartbeat mechanism relies on the FIFO (first-in, first-out) nature of the signals sent across the network. By making sure that all messages have been received in order, the system ensures that events can be properly ordered.
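A minimal sketch of heartbeat tracking, assuming each node periodically calls record_heartbeat() over the private network and a timeout marks silent nodes as suspect:

```python
import time

HEARTBEAT_TIMEOUT = 10.0  # seconds of silence before a node is suspect

last_seen: dict[str, float] = {}  # node -> timestamp of last heartbeat

def record_heartbeat(node: str) -> None:
    last_seen[node] = time.monotonic()

def suspect_nodes() -> list[str]:
    """Nodes whose last heartbeat is older than the timeout."""
    now = time.monotonic()
    return [n for n, t in last_seen.items() if now - t > HEARTBEAT_TIMEOUT]

record_heartbeat("node-a")
record_heartbeat("node-b")
print(suspect_nodes())  # [] while heartbeats are fresh
```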

Redundancy

Redundant nodes in a failover cluster are designed to take over operations for one another in the event that one of the nodes fails.

Redundant solutions are usually less expensive and easier to implement than clustering solutions. However, during a failure, a redundant system might provide poorer availability than a clustering solution. In an environment where the load is shared between two redundant server components, the failure of one server component might put an excessive load on the other server.

The main advantage of a clustered solution is automatic recovery from failure, that is, recovery without user intervention. Disadvantages of clustering are complexity and inability to recover from database corruption.

Replication

Replication is one common way of providing redundancy.

As per Wikipedia, replication in computing involves sharing information so as to ensure consistency between redundant resources. Commonly replicated resources include disk storage and databases.
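A minimal sketch of synchronous replication, where a write is applied to every copy before it is acknowledged; in a real system each replica write would be a network call:

```python
# Synchronous replication: the write is applied to the primary copy
# and to every replica before it is considered complete, so all
# copies stay consistent with one another.
primary: dict[str, str] = {}
replicas: list[dict[str, str]] = [{}, {}]  # two redundant copies

def write(key: str, value: str) -> None:
    primary[key] = value
    for replica in replicas:
        replica[key] = value  # network call in a real system

write("config", "v2")
assert all(r["config"] == "v2" for r in replicas)
```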

Mean Time Between Failures (MTBF)

MTBF is a metric used by hardware manufacturers to indicate an average time between component failures. MTBF is typically measured in thousands of hours.

Mean Time To Repair (MTTR)

MTTR, as the name implies, is the average time required to repair a failed component and restore the service to availability. MTTR is typically measured in hours or minutes.

HA clustering tries to drive MTTR as close to zero as possible by automatically switching in redundant components for failed ones as quickly as it can.

In theory, if you keep MTTR close to zero, you get almost 100% availability:

Availability = MTBF / (MTBF + MTTR)
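Plugging in illustrative figures shows how shrinking MTTR raises availability even when MTBF is unchanged:

```python
# Availability = MTBF / (MTBF + MTTR); figures are illustrative.
mtbf_hours = 10_000.0  # mean time between failures
mttr_hours = 1.0       # mean time to repair

availability = mtbf_hours / (mtbf_hours + mttr_hours)
print(f"{availability:.5%}")  # 99.99000%

# Automatic failover that cuts MTTR to 3 minutes (0.05 h) pushes
# availability toward 100% with the same hardware MTBF.
print(f"{mtbf_hours / (mtbf_hours + 0.05):.5%}")  # 99.99950%
```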

Service Level Agreement (SLA)

An SLA is the minimum level of service that a provider will deliver to a client per their agreement.

It is not a guarantee or an assurance that a client will get that service. It normally means that when the service dips below that level, a client can open a support ticket.

Disaster Recovery

Disaster recovery is a strategy that involves a set of policies and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster.

References

https://en.wikipedia.org/wiki/Split-brain_(computing)
https://en.wikipedia.org/wiki/Heartbeat_network
https://en.wikipedia.org/wiki/Fencing_(computing)
https://en.wikipedia.org/wiki/Load_balancing_(computing)
https://en.wikipedia.org/wiki/Shared_nothing_architecture
https://en.wikipedia.org/wiki/Replication_(computing)
https://en.wikipedia.org/wiki/Mean_time_between_failures
https://en.wikipedia.org/wiki/Disaster_recovery
http://www.jscape.com/blog/active-active-vs-active-passive-high-availability-cluster
http://www.mullinsconsulting.com/db2arch-sd-sn.html
http://www.practicalsqldba.com/2012/07/windows-cluster-understanding-quorum.html
https://docs.oracle.com/cd/E19693-01/819-0992/gcbtw/index.html
http://techthoughts.typepad.com/managing_computers/2007/11/availability-mt.html