
Dino Quintero

Alex Abderrazag

Shawn Bodily

Daniel J. Martin-Corben

Reshma Prathap

Kulwinder Singh

Ashraf Ali Thajudeen

William Nespoli Zanatta

IBM PowerHA SystemMirror for AIX Best Practices

Introduction

IBM® PowerHA® SystemMirror® for AIX® (formerly IBM HACMP™) was first available in 1991 and is now in its 24th release, with over 20,000 PowerHA clusters in production worldwide. IBM PowerHA SystemMirror is recognized as a robust, mature high availability solution. PowerHA supports a wide variety of configurations and offers a great deal of flexibility to the cluster administrator. With this flexibility comes the responsibility to make wise choices: many cluster configurations pass verification and can be brought online, yet are not ideal in terms of providing availability.

This IBM Redpaper™ publication¹ describes choices that the cluster designer can make, and suggests the alternatives that can achieve the highest level of availability.

This paper discusses the following topics:

Designing high availability
Cluster components
Testing
Maintenance
Monitoring
PowerHA in a virtualized world
Summary

¹ This document applies to PowerHA 7.1.3 SP1 running under AIX 7.1.3 TL1.

ibm.com/redbooks © Copyright IBM Corp. 2014. All rights reserved.

Designing high availability

A fundamental goal of a successful cluster design is the elimination of single points of failure (SPOF).

A high availability solution helps ensure that the failure of any component of the solution, whether it is hardware, software, or system management, does not cause the application and its data to be inaccessible to the user community. This solution is achieved through the elimination or masking of both planned and unplanned downtime. High availability solutions help eliminate single points of failure through appropriate design, planning, selection of hardware, configuration of software, and carefully controlled change management discipline.

To be highly available, a cluster must have no single point of failure. Although the principle of no single point of failure is accepted, it is sometimes inadvertently or deliberately violated. It is inadvertently violated when the cluster designer does not appreciate the consequences of the failure of a specific component. It is deliberately violated when the cluster designer chooses not to put redundant hardware in the cluster. The most common instance is when the cluster nodes that are chosen do not have enough I/O slots to support redundant adapters.

This choice is often made to reduce the price of a cluster, and is generally a false economy; the resulting cluster is still more expensive than a single node, but has no better availability.

Plan a cluster carefully so that every cluster element has a backup (some say two of everything). A preferred practice is to use either paper or online planning worksheets to do this planning, and to save them as part of the ongoing documentation of the system. Table 1 lists typical SPOFs within a cluster.

Base each cluster design decision on whether it contributes to availability (that is, eliminates a SPOF) or detracts from availability (that is, adds gratuitous complexity).
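The "two of everything" rule can be checked mechanically against a planning worksheet. The following is a minimal sketch only: the inventory format and the component names are assumptions made for illustration and are not part of PowerHA or its planning worksheets.

```python
def find_spofs(inventory):
    """Return the components that have fewer than two independent instances.

    `inventory` is a hypothetical mapping of component name to the number of
    independent instances planned for the cluster.
    """
    return sorted(name for name, count in inventory.items() if count < 2)


# Illustrative worksheet data (assumed values, not taken from the paper).
planned = {
    "nodes": 2,
    "power sources": 1,
    "networks": 2,
    "network adapters per node": 2,
    "disk adapters per node": 1,
}

for component in find_spofs(planned):
    print("Single point of failure:", component)
```

A check like this covers only the components that someone remembered to list, so it supplements the planning worksheets rather than replacing them.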

–  –  –

Risk analysis

Sometimes, eliminating all SPOFs within a cluster is not feasible. Examples might include network and site:

If the network as a SPOF must be eliminated, then the cluster requires at least two networks. Unfortunately, this eliminates only the network directly connected to the cluster as a SPOF. It is not unusual for the users to be located some number of hops away from the cluster. Each of these hops involves routers, switches, and cabling, and each typically represents another SPOF. Truly eliminating the network as a SPOF can become a massive undertaking.

Eliminating the site as a SPOF depends on distance and the corporate disaster recovery strategy. Generally, this involves using PowerHA SystemMirror Enterprise Edition. However, if the sites can be covered by a common storage area network (for example, buildings within a 2 km radius), then the cross-site Logical Volume Manager (LVM) mirroring function, as described in the PowerHA Administration Guide, is most appropriate, providing the best performance at no additional expense. If the sites are within the range of Peer-to-Peer Remote Copy (PPRC) (roughly 100 km) and compatible IBM ESS, DS, or SVC storage systems are used, then one of the PowerHA SystemMirror Enterprise Edition PPRC technologies is appropriate. Otherwise, consider PowerHA SystemMirror Global Logical Volume Manager (GLVM). For more information, see IBM PowerHA Cookbook for AIX Updates, SG24-7739.
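The site guidance above amounts to a short decision sequence, sketched here. The function name, parameter names, and return strings are hypothetical, and the 100 km figure is only the rough PPRC range quoted in the text, not a hard product limit.

```python
PPRC_RANGE_KM = 100  # approximate range quoted in the text


def suggest_site_option(distance_km, common_san, pprc_capable_storage):
    """Suggest a site-protection option by following the rules of thumb above.

    common_san: True if both sites are covered by one storage area network
        (the text gives buildings within about 2 km as an example).
    pprc_capable_storage: True if compatible IBM ESS, DS, or SVC storage
        systems are in use at both sites.
    """
    if common_san:
        return "cross-site LVM mirroring"
    if pprc_capable_storage and distance_km <= PPRC_RANGE_KM:
        return "PowerHA SystemMirror Enterprise Edition PPRC"
    return "PowerHA SystemMirror GLVM"


print(suggest_site_option(1.5, common_san=True, pprc_capable_storage=False))
print(suggest_site_option(60, common_san=False, pprc_capable_storage=True))
print(suggest_site_option(400, common_san=False, pprc_capable_storage=True))
```

A real decision also depends on the corporate disaster recovery strategy, which this sketch does not model.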


Cluster components

The following section describes preferred practices for important cluster components.

Nodes

PowerHA v7.1 supports clusters of up to 16 nodes, with any combination of active and standby nodes. Although it is possible to have all nodes in the cluster running applications (a configuration referred to as mutual takeover), the most reliable and available clusters have at least one standby node: one node that is normally not running any applications, but is available to take them over if a failure occurs on an active node.





Also, be sure to attend to environmental considerations. Nodes should not have a common power supply, which can happen if they are placed in a single rack. Similarly, building a cluster

–  –  –

Choose nodes that have sufficient I/O slots to install redundant network and disk adapters (twice as many slots as are required for single-node operation). This naturally suggests avoiding processors with small numbers of slots; blades are an outstanding example. For high availability best practices, do not consider or plan to use a node unless it has redundant adapters. And, just as every cluster resource should have a backup, the root volume group in each node should be mirrored or be on a RAID device. Furthermore, PowerHA v7.1 added the rootvg system event, which monitors rootvg and can help invoke a fallover in the event of rootvg loss.

Also, choose nodes so that, when the production applications are run at peak load, sufficient CPU cycles and I/O bandwidth still exist to allow PowerHA to operate. The production application should be carefully benchmarked (preferable) or modeled (if benchmarking is not feasible) and nodes chosen so that they do not exceed 85% busy, even under the heaviest expected load.

Note: Size the takeover node to accommodate all possible workloads: if a single standby is backing up multiple primaries, it must be capable of servicing multiple workloads.

On hardware that supports dynamic LPAR operations, PowerHA can be configured to allocate processors and memory to a takeover node before applications are started. However, these resources must actually be available, or acquirable through Capacity Upgrade on Demand (CUoD). Understand and plan for the worst case situation where, for example, all the applications are on a single node.
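The sizing guidance above can be expressed as a simple capacity check. This is a sketch under stated assumptions: capacity and load are given in the same arbitrary units (for example, benchmarked processor units), the function name is hypothetical, and the 85% ceiling is the figure given in the text.

```python
MAX_BUSY = 0.85  # nodes should not exceed 85% busy, per the text


def takeover_fits(node_capacity, own_peak_load, taken_over_peak_loads):
    """Return True if the node stays at or under the 85% ceiling after
    taking over every workload in `taken_over_peak_loads`.

    All values are peak loads in the same (assumed) capacity units.
    """
    total = own_peak_load + sum(taken_over_peak_loads)
    return total <= MAX_BUSY * node_capacity


# A standby node with 16 units backing up two primaries.
print(takeover_fits(16, 0, [6, 7]))   # 13 units against a 13.6 unit ceiling
print(takeover_fits(16, 0, [6, 8]))   # 14 units against a 13.6 unit ceiling
```

For the worst case described above, pass the peak loads of all applications in the cluster as `taken_over_peak_loads`. If dynamic LPAR or CUoD resources are counted in `node_capacity`, they must actually be acquirable at takeover time.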

Networks

PowerHA is a network-centric application. PowerHA networks not only provide client access to the applications but are used to detect and diagnose node, network, and adapter failures.

To do this, PowerHA relies on one of the following components, which send heartbeats over all defined networks:

Before PowerHA v7: Reliable Scalable Cluster Technology (RSCT)
PowerHA v7 and later: Cluster Aware AIX (CAA)

By gathering heartbeat information on multiple nodes, PowerHA can determine what type of failure occurred and initiate the appropriate recovery action. Being able to distinguish between certain failures, for example the failure of a network and the failure of a node, requires a second network. Although this additional network can be “IP based,” it is possible that the entire IP subsystem can fail within a given node. Therefore, in addition there should be at least one, ideally two, non-IP networks. Failure to implement a non-IP network can potentially lead to a partitioned cluster, sometimes referred to as the split brain syndrome.

This situation can occur if the IP network between nodes becomes severed or in some cases congested. Because each node is still alive, PowerHA concludes the other nodes are down and initiates a takeover. After takeover, one or more applications might be running simultaneously on both nodes. If the shared disks are also online to both nodes, the result can lead to data divergence (massive data corruption). This is a situation that must be avoided, at all costs.
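The following simplified sketch illustrates why a second heartbeat path is needed to tell a network failure from a node failure. It is an illustration of the reasoning only and is not the algorithm that RSCT or CAA implements; the function name, network names, and result strings are assumptions.

```python
def classify_peer(heard_on):
    """Classify a peer node from per-network heartbeat status.

    `heard_on` maps a network name to True if heartbeats from the peer are
    still arriving on that network, or False if they have stopped.
    """
    if not heard_on:
        raise ValueError("at least one heartbeat network is required")
    silent = sorted(name for name, ok in heard_on.items() if not ok)
    if not silent:
        return "peer up"
    if len(silent) < len(heard_on):
        # The peer still answers somewhere, so the silent paths have failed.
        return "peer up, network failure: " + ", ".join(silent)
    if len(heard_on) == 1:
        # One path only: a dead node and a dead network look the same.
        return "ambiguous: node or network failure"
    return "peer presumed down"


print(classify_peer({"ip_net": False}))
print(classify_peer({"ip_net": False, "non_ip": True}))
print(classify_peer({"ip_net": False, "non_ip": False}))
```

With a single IP network, the first call cannot rule out a severed network, which is the condition that leads to a partitioned cluster. With a non-IP path still delivering heartbeats, the second call shows that the peer is alive and that no takeover is warranted.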

Starting in PowerHA v7 with the use of CAA, the new cluster repository disk automatically provides a form of non-IP heartbeating. Another option is to use SAN heartbeat, which is commonly referred to as sancomm or by the name of the device it uses, sfwcomm. Using sancomm requires SAN adapters that support target mode, and the adapters must be zoned together so that they can communicate with each other.


Important network best practices for high availability are as follows:

Failure detection is possible only if at least two physical adapters per node are in the same physical network or VLAN. Be extremely careful when you make subsequent changes to the networks with regard to IP addresses, subnet masks, intelligent switch port settings, and VLANs.

The more distinct types of networks, both IP and non-IP, the cluster has, the less likely it is to report a false node-down failure.

Where possible, use EtherChannel, Shared Ethernet Adapters (SEA), or both, through the Virtual I/O Server (VIOS) with PowerHA to aid availability.

Note: PowerHA sees EtherChannel configurations as single adapter networks. To aid problem determination, configure the netmon.cf file to allow ICMP echo requests to be sent to other interfaces outside of the cluster. See the PowerHA administration web page for further details:

http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.admngd/ha_admin_kickoff.htm

When you use multiple adapters per network, each adapter needs an IP address in a different subnet, using the same subnet mask.

Currently, PowerHA supports IPv6 and Ethernet only.

Ensure that you have in place the correct network configuration rules for the cluster with regard to EtherChannel, virtual adapter support, service, and persistent addressing. For more information, see the PowerHA planning web page:

http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.plangd/ha_plan.htm

Name resolution is essential for PowerHA. External resolvers are deactivated under certain event processing conditions. Avoid problems by configuring /etc/netsvc.conf and the NSORDER variable in /etc/environment to ensure that the host command checks the local /etc/hosts file first.

Read the release notes that are stored in /usr/es/sbin/cluster/release_notes. Watch for new and enhanced features, such as collocation rules, persistent addressing and fast failure detection.

Configure persistent IP labels on each node. These IP addresses are available at AIX boot time and PowerHA strives to keep them highly available. They are useful for remote administration, monitoring, and secure node-to-node communications. Consider implementing a host-to-host IPsec tunnel between the persistent labels of the nodes. This can ensure that sensitive data, such as passwords, is not sent unencrypted across the network, for example when using the C-SPOC option to change a user password.

If you have several virtual clusters split across frames, ensure that boot subnet addresses are unique per cluster. This minimizes problems with netmon reporting that the network is up when, in fact, the physical network outside the cluster might be down.
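The rule in the list above that each adapter on a network needs an address in a different subnet with the same subnet mask can be checked before the addresses are configured. The sketch below uses the Python standard `ipaddress` module; the function name and the sample addresses are assumptions made for illustration.

```python
import ipaddress
from itertools import combinations


def check_adapter_subnets(addresses):
    """Check the addresses planned for one node's adapters on one network.

    `addresses` is a list of strings in address/prefix form, for example
    "10.1.1.10/24". Returns a list of problems; an empty list means the
    addresses follow the rule.
    """
    interfaces = [ipaddress.ip_interface(a) for a in addresses]
    problems = []
    if len({iface.network.prefixlen for iface in interfaces}) > 1:
        problems.append("adapters do not use the same subnet mask")
    for a, b in combinations(interfaces, 2):
        if a.network.overlaps(b.network):
            problems.append(f"{a.ip} and {b.ip} are in the same subnet")
    return problems


print(check_adapter_subnets(["10.1.1.10/24", "10.1.2.10/24"]))
print(check_adapter_subnets(["10.1.1.10/24", "10.1.1.11/24"]))
```

The same helper can be applied to the boot addresses of several virtual clusters to confirm that no two clusters share a boot subnet.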

Adapters

As stated previously, each network defined to PowerHA should have at least two adapters per node. Although it is possible to build a cluster with fewer, the reaction to adapter failures is more severe; the resource group must be moved to another node. AIX provides support for both EtherChannel and Shared Ethernet Adapters. This often allows the cluster node to have one logical adapter interface defined per network. This reduces the number of IP

–  –  –

Many IBM Power Systems™ servers contain built-in virtual Ethernet adapters. These historically have been known as Integrated Virtual Ethernet (IVE) or Host Ethernet Adapters (HEA). Some newer systems now contain Single Root I/O Virtualization (SR-IOV) adapters. Most of these adapters provide multiple ports. One port on such an adapter should not be used to back up another port on that adapter, because the adapter card is a common point of failure. The same is often true of the built-in Ethernet adapters; in most IBM Power Systems servers, the ports share a common adapter. When the built-in Ethernet adapter can be used, a preferred practice is to provide an extra adapter in the node, with the two backing up each other. However, be aware that, when these specific types of adapters are used, Live Partition Mobility might not be usable in many cases.

Also be aware of network detection settings for the cluster and consider tuning these values. These values apply to all networks. However, be careful when you use custom settings, because setting these values too low can lead to undesirable results, such as false takeovers. These settings can be viewed and modified by using either the clmgr command or smitty sysmirror.

Applications

The most important part of making an application run well in a PowerHA cluster is understanding the application’s requirements. This is particularly important when designing the resource group policy behavior and dependencies. For high availability to be achieved, the application must be able to stop and start cleanly and not explicitly prompt for interactive input. Some applications tend to bond to a particular operating system characteristic such as a uname, serial number, or IP address. In most situations, these problems can be overcome. The vast majority of commercial software products that run under AIX are suited to be clustered with PowerHA.
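One way to test whether an application start or stop command meets the requirement to run without interactive input is to invoke it with standard input detached and a time limit. The wrapper below is a hypothetical test aid, not a PowerHA interface; the function name, the timeout value, and the sample commands are assumptions.

```python
import subprocess
import sys


def run_noninteractive(command, timeout_s=60):
    """Run `command` with no terminal input and return its exit code.

    Standard input is connected to the null device, so a command that
    prompts for input receives end-of-file instead of waiting. A command
    that exceeds `timeout_s` seconds is terminated and reported as failed (1).
    """
    try:
        completed = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return 1
    return completed.returncode


# Stand-in commands; a real test would name the application's own scripts.
clean_start = [sys.executable, "-c", "raise SystemExit(0)"]
hangs = [sys.executable, "-c", "import time; time.sleep(30)"]

print(run_noninteractive(clean_start))
print(run_noninteractive(hangs, timeout_s=1))
```

A command that returns a nonzero code or times out under this wrapper is likely to need changes before it is placed under cluster control.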
