IBM PowerHA SystemMirror for AIX Best Practices
Redpaper Shawn Bodily
Daniel J. Martin-Corben
Ashraf Ali Thajudeen
William Nespoli Zanatta
Introduction
IBM® PowerHA® SystemMirror® for AIX® (formerly IBM HACMP™) was first available in
1991 and is now in its 24th release, with over 20,000 PowerHA clusters in production worldwide.
worldwide. IBM PowerHA SystemMirror is recognized as a robust, mature high availability solution. PowerHA supports a wide variety of configurations, and offers a great deal of flexibility to the cluster administrator. With this flexibility comes the responsibility to make wise choices because many cluster configurations are available that work regarding the cluster passing verification and being brought online, but those configurations are not ideal in terms of providing availability.
This IBM Redpaper™ publication1 describes choices that the cluster designer can make, and suggests the alternatives that can achieve the highest level of availability.
This paper discusses the following topics:
- Designing high availability
- Cluster components
- Testing
- Maintenance
- Monitoring
- PowerHA in a virtualized world
- Summary

1 This document applies to PowerHA 7.1.3 SP1 running under AIX 7.1.3 TL1.
ibm.com/redbooks © Copyright IBM Corp. 2014. All rights reserved.
Designing high availability

A fundamental goal of a successful cluster design is the elimination of single points of failure (SPOF).
A high availability solution helps ensure that the failure of any component of the solution, whether it is hardware, software, or system management, does not cause the application and its data to be inaccessible to the user community. This solution is achieved through the elimination or masking of both planned and unplanned downtime. High availability solutions help eliminate single points of failure through appropriate design, planning, selection of hardware, configuration of software, and carefully controlled change management discipline.
To be highly available, a cluster must have no single point of failure. Although the principle of no single point of failure is accepted, it is sometimes inadvertently or deliberately violated. It is inadvertently violated when the cluster designer does not appreciate the consequences of the failure of a specific component. It is deliberately violated when the cluster designer chooses not to put redundant hardware in the cluster. The most common instance is when the cluster nodes that are chosen do not have enough I/O slots to support redundant adapters.
This choice is often made to reduce the price of a cluster, and is generally a false economy;
the resulting cluster is still more expensive than a single node, but has no better availability.
Plan a cluster carefully so that every cluster element has a backup (some say two of everything). A preferred practice is to use either paper or online planning worksheets to do this planning, and save them as part of the on-going documentation of the system. Table 1 lists typical SPOFs within a cluster.
Base cluster design decisions on whether they contribute to availability (that is, eliminate a SPOF) or detract from it (for example, by being gratuitously complex).
Risk analysis

Sometimes in reality, eliminating all SPOFs within a cluster is not feasible. Examples might include the network and the site:
- If the network as a SPOF must be eliminated, then the cluster requires at least two networks. Unfortunately, this eliminates only the network directly connected to the cluster as a SPOF. It is not unusual for users to be located some number of hops away from the cluster. Each of these hops involves routers, switches, and cabling, and each typically represents another SPOF. Truly eliminating the network as a SPOF can become a massive undertaking.
- Eliminating the site as a SPOF depends on distance and the corporate disaster recovery strategy. Generally, this involves using PowerHA SystemMirror Enterprise Edition. However, if the sites can be covered by a common storage area network, for example buildings within a 2 km radius, then the cross-site Logical Volume Manager (LVM) mirroring function, as described in the PowerHA Administration Guide, is most appropriate, providing the best performance at no additional expense. If the sites are within the range of Peer-to-Peer Remote Copy (PPRC) (roughly 100 km) and compatible IBM ESS, DS, or SVC storage systems are used, then one of the PowerHA SystemMirror Enterprise Edition PPRC technologies is appropriate. Otherwise, consider PowerHA SystemMirror Geographic Logical Volume Manager (GLVM). For more information, see IBM PowerHA SystemMirror for AIX Cookbook, SG24-7739.
Cluster components

The following section describes preferred practices for important cluster components.
Nodes

PowerHA v7.1 supports clusters of up to 16 nodes, with any combination of active and standby nodes. Although a possibility is to have all nodes in the cluster running applications (a configuration referred to as mutual takeover), the most reliable and available clusters have at least one standby node: one node that is normally not running any applications, but is available to take them over if a failure occurs on an active node.
Also, be sure to attend to environmental considerations. Nodes should not have a common power supply, which can happen if they are placed in a single rack. Similarly, building a cluster of nodes that are actually logical partitions (LPARs) housed in a single physical server is useful for testing, but should not be relied upon for the availability of production applications.

Choose nodes that have sufficient I/O slots to install redundant network and disk adapters: twice as many slots as are required for single-node operation. This naturally suggests avoiding processors with small numbers of slots. For high availability best practices, do not consider or plan to use a node unless it has redundant adapters; blades are an outstanding example of this limitation. And, just as every cluster resource should have a backup, the root volume group in each node should be mirrored, or be on a RAID device. Furthermore, PowerHA v7.1 added the rootvg system event, which monitors rootvg and can help invoke a fallover in the event of rootvg loss.
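The rootvg mirroring mentioned above is typically done with standard AIX commands. The following is a sketch only; the disk names (hdisk0, hdisk1) are examples and hdisk1 is assumed to be a free disk of adequate size:

```shell
# Add a second disk to rootvg (disk names are examples only).
extendvg rootvg hdisk1

# Mirror all logical volumes in rootvg; -S synchronizes in the background.
mirrorvg -S rootvg hdisk1

# Rebuild the boot image on the new disk and update the boot list so the
# system can boot from either copy of rootvg.
bosboot -ad /dev/hdisk1
bootlist -m normal hdisk0 hdisk1
```

Verify afterward with lsvg rootvg that each logical volume shows two copies and that stale partitions have synchronized.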
Also, choose nodes so that, when the production applications are run at peak load, sufficient CPU cycles and I/O bandwidth still exist to allow PowerHA to operate. The production application should be carefully benchmarked (preferable) or modeled (if benchmarking is not feasible) and nodes chosen so that they do not exceed 85% busy, even under the heaviest expected load.
Note: Size the takeover node to accommodate all possible workloads: if a single standby is backing up multiple primaries, it must be capable of servicing multiple workloads.
On hardware that supports dynamic LPAR operations, PowerHA can be configured to allocate processors and memory to a takeover node before applications are started. However, these resources must actually be available, or acquirable through Capacity Upgrade on Demand (CUoD). Understand and plan for the worst case situation where, for example, all the applications are on a single node.
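Although PowerHA drives these dynamic LPAR operations itself when configured to do so, the underlying capacity additions can be sketched with HMC commands such as the following; the managed system and partition names are examples only:

```shell
# Add one dedicated processor to the takeover partition
# (managed system and LPAR names are placeholders).
chhwres -m POWER8-Server -r proc -o a -p takeover_lpar --procs 1

# Add 4 GB of memory to the same partition (quantity is in MB).
chhwres -m POWER8-Server -r mem -o a -p takeover_lpar -q 4096
```

Running such additions manually beforehand is a useful way to confirm that the resources PowerHA will request during a fallover are actually available.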
Networks

PowerHA is a network-centric application. PowerHA networks not only provide client access to the applications but are used to detect and diagnose node, network, and adapter failures.
To do this, PowerHA uses the following methods, which send heartbeats over all defined networks:

- Before PowerHA v7: Reliable Scalable Cluster Technology (RSCT)
- PowerHA v7 and later: Cluster Aware AIX (CAA)

By gathering heartbeat information on multiple nodes, PowerHA can determine what type of failure occurred and initiate the appropriate recovery action. Being able to distinguish between certain failures, for example the failure of a network and the failure of a node, requires a second network. Although this additional network can be “IP based,” it is possible for the entire IP subsystem to fail within a given node. Therefore, there should in addition be at least one, and ideally two, non-IP networks. Failure to implement a non-IP network can potentially lead to a partitioned cluster, sometimes referred to as the split brain syndrome.
This situation can occur if the IP network between nodes becomes severed or, in some cases, congested. Because each node is still alive, PowerHA concludes that the other nodes are down and initiates a takeover. After takeover, one or more applications might be running simultaneously on both nodes. If the shared disks are also online to both nodes, the result can be data divergence (massive data corruption). This situation must be avoided at all costs.
Starting in PowerHA v7 with the use of CAA, the cluster repository disk automatically provides a form of non-IP heartbeating. Another option is SAN heartbeat, commonly referred to as sancomm, or by the name of the device it uses, sfwcomm. Using sancomm requires SAN adapters that support target mode, and the adapters must be zoned together so that they can communicate with each other.
Important network best practices for high availability are as follows:
- Failure detection is possible only if at least two physical adapters per node are in the same physical network or VLAN.

- Be extremely careful when you make subsequent changes to the networks, with regards to IP addresses, subnet masks, intelligent switch port settings, and VLANs.
- The more unique types of networks, both IP and non-IP, that are configured, the less likely the cluster is ever to report a false node-down failure.
- Where possible, use EtherChannel, Shared Ethernet Adapters (SEA), or both, through the Virtual I/O Server (VIOS) with PowerHA to aid availability.
Note: PowerHA sees EtherChannel configurations as single-adapter networks. To aid problem determination, configure the netmon.cf file to allow ICMP echo requests to be sent to other interfaces outside of the cluster. See the PowerHA administration web page for further details:

http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.admngd/ha_admin_kickoff.htm

- When you use multiple adapters per network, each adapter needs an IP address in a different subnet, using the same subnet mask.
- Currently, PowerHA supports IPv6 and Ethernet only.
- Ensure you have in place the correct network configuration rules for the cluster with regards to EtherChannel, virtual adapter support, and service and persistent addressing. For more information, see the PowerHA planning web page:

http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.plangd/ha_plan.htm

- Name resolution is essential for PowerHA. External resolvers are deactivated under certain event processing conditions. Avoid problems by configuring /etc/netsvc.conf and the NSORDER variable in /etc/environment to ensure that the host command checks the local /etc/hosts file first.
- Read the release notes that are stored in /usr/es/sbin/cluster/release_notes. Watch for new and enhanced features, such as collocation rules, persistent addressing, and fast failure detection.
- Configure persistent IP labels on each node. These IP addresses are available at AIX boot time, and PowerHA strives to keep them highly available. They are useful for remote administration, monitoring, and secure node-to-node communications. Consider implementing a host-to-host IPsec tunnel between the persistent labels of the nodes. This can ensure that sensitive data, such as passwords, is not sent unencrypted across the network, for example when using the C-SPOC option to change a user password.
- If you have several virtual clusters split across frames, ensure that boot subnet addresses are unique per cluster. This minimizes problems with netmon reporting that the network is up when the physical network outside the cluster might actually be down.
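Several of the practices above refer to specific configuration files. The following sketch shows what their contents might look like; all interface names and IP addresses are placeholders to be replaced with values from your environment:

```shell
# Example contents only; interface names and addresses are placeholders.

# /etc/netsvc.conf -- resolve host names from the local /etc/hosts
# file before consulting DNS:
#   hosts = local, bind

# /etc/environment -- matching resolver order via the NSORDER variable:
#   NSORDER=local,bind

# /usr/es/sbin/cluster/netmon.cf -- addresses outside the cluster that
# netmon can ping to decide whether an interface is genuinely up; the
# !REQD form ties a required ping target to a specific interface:
#   !REQD en2 192.168.100.1
#   !REQD en2 192.168.100.254
```

Keeping these files consistent across all cluster nodes avoids nodes reaching different conclusions about name resolution or interface health.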
Adapters

As stated previously, each network defined to PowerHA should have at least two adapters per node. Although it is possible to build a cluster with fewer, the reaction to adapter failures is more severe: the resource group must be moved to another node. AIX provides support for both EtherChannel and Shared Ethernet Adapters. This often allows the cluster node to logically have one adapter interface defined per network, which reduces the number of IP addresses required.
Many IBM Power Systems™ servers contain built-in virtual Ethernet adapters, historically known as Integrated Virtual Ethernet (IVE) or Host Ethernet Adapters (HEA). Some newer systems contain Single Root I/O Virtualization (SR-IOV) adapters.
Most of these adapters provide multiple ports. One port on such an adapter should not be used to back up another port on the same adapter, because the adapter card itself is a common point of failure. The same is often true of the built-in Ethernet adapters; in most IBM Power Systems servers, the ports share a common adapter. When the built-in Ethernet adapter can be used, a preferred practice is to provide an extra adapter in the node, with the two backing up each other. However, be aware that when these specific types of adapters are used, in many cases Live Partition Mobility cannot be used.
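As a sketch of the EtherChannel approach discussed above, a network interface backup (NIB) pairing of two physical adapters might be created as follows. All device names and the ping address are examples, and the attribute names should be verified against your AIX level before use:

```shell
# Create an EtherChannel pseudo-device in network interface backup mode:
# ent0 carries traffic and ent1 takes over if ent0 or its path fails.
# Device names and the address to ping are examples only.
mkdev -c adapter -s pseudo -t ibm_ech \
      -a adapter_names=ent0 \
      -a backup_adapter=ent1 \
      -a netaddr=192.168.1.1

# Confirm the resulting device and its attributes (ent2 is an example
# name for the newly created EtherChannel device).
lsattr -El ent2
```

The netaddr target should be an always-reachable address outside the server, such as the default gateway, so that path failures are detected rather than masked.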
Also be aware of the network detection settings for the cluster, and consider tuning these values. These values apply to all networks. Be careful when you use custom settings, because setting these values too low can lead to undesirable results, such as false takeovers.
These settings can be viewed and modified by using either the clmgr command or smitty sysmirror.
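For example, the current network definitions and their attributes can be listed with clmgr; the network name below is a placeholder for whatever names your cluster uses:

```shell
# List all networks that are defined to the cluster:
clmgr query network

# Show the attributes of one network verbosely
# (net_ether_01 is an example name):
clmgr -v query network net_ether_01
```

The same information is reachable interactively through the smitty sysmirror menus.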
Applications

The most important part of making an application run well in a PowerHA cluster is understanding the application’s requirements. This is particularly important when designing the resource group policy behavior and dependencies. For high availability to be achieved, the application must be able to stop and start cleanly and not explicitly prompt for interactive input. Some applications tend to bond to a particular operating system characteristic such as a uname, serial number, or IP address. In most situations, these problems can be overcome.
The vast majority of commercial software products that run under AIX are suited to be clustered with PowerHA.
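As a minimal sketch of the stop/start requirement described above: the application controller scripts must run without prompting and exit with a meaningful status. Everything here is hypothetical (the application name myapp, the PID file location, and the sleep standing in for a real daemon); in a real cluster, start and stop would be two separate scripts registered with the application controller.

```shell
#!/bin/sh
# Hypothetical PowerHA application controller logic for an application
# called "myapp"; combined into one file only to show that each action
# completes without interactive input.

APP=myapp
PIDFILE=/tmp/${APP}.pid

start_app() {
    # Launch the application daemon in the background (a sleep stands in
    # for the real daemon) and record its PID; never prompt the operator.
    sleep 600 &
    echo $! > "$PIDFILE"
    echo "$APP started"
}

stop_app() {
    # Stop cleanly and remove the PID file; a stop script must not hang
    # waiting for input or for a daemon that never exits.
    if [ -f "$PIDFILE" ]; then
        kill "$(cat "$PIDFILE")" 2>/dev/null
        rm -f "$PIDFILE"
        echo "$APP stopped"
    else
        echo "$APP not running"
    fi
}

start_app
stop_app
```

PowerHA judges the scripts by their exit status, so each should return zero only when the application has genuinely started or stopped.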