Tuesday, August 23, 2011

Data Center Designing: Network Connectivity Standards


Network Connectivity Standards

If your organization is looking to host its entire application environment in an external vendor’s data center, it must consider many factors when defining the standards for that data center. The post below describes some of the Network Connectivity standards used for the development, implementation, and management of a Data Center managed by a vendor.
This post assumes that you have an understanding of the concepts and terminology associated with data communications networks, protocols, and the equipment referenced here.

Network Transport Options – Connection

The Transaction Link between Organization and the Data Center provides end users with access to the DC Environment managed by the vendor.
The following Standards are designed to facilitate the security, reliability, and scalability of the Network Connectivity, via the Transaction Link, to the Environment at the Data Center.

IP Addressing for the Data Center

To avoid IP address conflicts, Vendor will only route network traffic to globally unique public IP addresses registered through one of the regional Internet registry organizations. Vendor will not route traffic to private IP addresses, including addresses in the following IP address ranges:
  • Class A: 10.0.0.0/8
  • Class B: 172.16.0.0/12
  • Class C: 192.168.0.0/16

Network Address Translation and Port Address Translation

If Organization uses private addresses, Network Address Translation (NAT) and Port Address Translation (PAT) can be used to map Organization’s private addresses to public addresses. Organization is responsible for providing the public IP addresses and for performing the NAT/PAT.

NAT and PAT Policies

One-to-one static NAT must be used for all network devices, including, without limitation, Organization’s printers or print servers to which Vendor will initiate a connection. PAT is allowed for any network device that initiates the connection with Vendor, such as user workstations connecting to the Vendor Programs.
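As a purely illustrative sketch (not part of the Vendor standard), the difference between the two policies on a Linux-based gateway might look like the following; all addresses are hypothetical, with 203.0.113.0/24 standing in for Organization’s public range:
# One-to-one static NAT for a printer that Vendor must reach (Vendor-initiated connections)
iptables -t nat -A PREROUTING  -d 203.0.113.10 -j DNAT --to-destination 192.168.10.10
iptables -t nat -A POSTROUTING -s 192.168.10.10 -j SNAT --to-source 203.0.113.10
# PAT (address overloading) for workstations that only initiate connections toward Vendor
iptables -t nat -A POSTROUTING -s 192.168.20.0/24 -o eth0 -j MASQUERADE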

Ports

During network initialization, Vendor will provide a list of ports that must be open on Organization’s network to establish Network Connectivity to the Environment at the Data Center. The list of required ports is specific to the type of Network Connectivity selected by Organization and will include source and destination addresses. Access through these ports is required before Vendor can begin installing or troubleshooting Network Connectivity.
VPN Requirements

The following table lists the ports that must be enabled for Vendor to manage the Vendor VPN.
Application Name               Port     Protocol
Ping                           ICMP     IP
Traceroute                     ICMP     IP
SSH & SFTP (if applicable)     22       TCP
HTTPS                          443      TCP
ISAKMP                         500      UDP
IPSEC                          50       IP
NTP                            123      UDP
Additional ports may be required. During network initialization, the Network Connectivity Form will define detailed source and destination port requirements.
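Once the list is provided, a rough reachability check toward the Vendor endpoint can be run from Organization’s side; the host name below is hypothetical, and the UDP/ICMP checks are indicative only:
ping vpn-gw.example.com                 # ICMP
traceroute vpn-gw.example.com           # ICMP/IP path
nc -vz  -w 5 vpn-gw.example.com 22      # SSH/SFTP
nc -vz  -w 5 vpn-gw.example.com 443     # HTTPS
nc -vzu -w 5 vpn-gw.example.com 500     # ISAKMP (UDP)
nc -vzu -w 5 vpn-gw.example.com 123     # NTP (UDP)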

Network Design Considerations

Organization should consider certain factors described in this Section when selecting a Network Connectivity design for the Transaction Link between Organization and the Data Center.

Bandwidth

Bandwidth refers to the amount of traffic to be carried through a network and must be considered by Organization when designing Network Connectivity. The amount of bandwidth required depends on the particular applications being used, the number of users concurrently accessing the Environment, and the nature of the transactions being processed. Network bandwidth must be monitored and adjusted as application usage grows and changes.
Bandwidth Considerations
Vendor recommends the bandwidth sizes listed in the following table as a starting point for network sizing.
Sample Application          Bandwidth for Each User Concurrently Accessing the Environment
Self-service applications   4-6 kbps
Forms applications          10-12 kbps
Internal Portal             2-6 kbps
Files access                10-12 kbps
Siebel application          10-12 kbps
PeopleSoft application      10-12 kbps
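As a rough, hypothetical sizing exercise using the figures above: 150 users concurrently running Forms at about 12 kbps each, with roughly 30% headroom, needs a little over 2 Mbps on the Transaction Link:
echo "150 * 12 * 1.3 / 1000" | bc -l    # ~2.34 Mbps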
Organization must design its network to reduce overall network latency wherever possible. The design should minimize the distance between End User locations and the application server. This can be accomplished with a network design that does not add unnecessary distance or hops between endpoints, and by ensuring that the network provider has capacity available on the most efficient network cable route between the two end points.
Properly configured and sized network routers usually induce very little delay. However, the delay through routers can become a major source of overall network latency when the network connections are congested or routers are improperly configured. Organization should ensure that all network devices and links are properly sized and configured and are running optimally.

Error Rate

Organization must consider the error rate when designing Network Connectivity. All network links are subject to transmission errors such as dropped packets. A high error rate can lead to increased latency and poor network performance. Each segment in a network can experience independent errors that add to the total error rate for the entire link. To ensure that a network operates at peak performance, Organization should ensure that the end-to-end error rate does not exceed 0.01%.
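A simple way to sample the end-to-end error rate is a long ping run toward the far endpoint (Linux-style syntax shown for illustration; the endpoint name is hypothetical):
ping -c 1000 -i 0.2 dc-endpoint.example.com
# against a 0.01% error budget, a 1000-packet sample should show essentially zero packet loss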

Connection Types

Organization must use a Vendor-standard VPN for Network Connectivity. Vendor supplies, configures, and manages an IPSec-compliant VPN device that is installed on Organization’s network. This VPN is used to establish secure Network Connectivity to Vendor over the Transaction Link.
There are various standard configurations available for the Vendor-provided VPN device that may accommodate Organization’s network and security policies.

ISP Circuit Types

The Vendor standard VPN connection uses Organization’s ISP circuit. Organization should choose the type of circuit that best meets its use of the Computer and Administration Services.
Dedicated Internet Circuit
A dedicated circuit is an Internet circuit of Organization’s that is used only for application connection activities. A dedicated circuit can help prevent problems related to overuse by isolating traffic to Vendor from Organization’s other Internet traffic, and can help Organization achieve stable and predictable Network Connectivity. This option is the recommended minimum for a Transaction Link.
Shared Circuit
An Internet circuit that is used both for Network Connectivity to Vendor and Organization’s other Internet activities is called a shared circuit. If Organization selects this circuit type, Organization must ensure that the existing circuit has enough unused capacity to support Vendor traffic.
A shared circuit should only be considered for a Transaction Link if the circuit performance is highly stable and actively monitored for performance and capacity.

VPN Configuration


Vendor provides, configures, and manages a VPN device for a Vendor-standard VPN connection over the Transaction Link. Vendor configures the VPN device based on Organization’s network topology and Vendor policies. This section describes the various VPN configurations and Vendor requirements.
The Vendor-provided VPN device has two interfaces (one external and one internal). Vendor generally uses both interfaces (dual-arm mode), but can use a configuration with only one interface (single-arm mode) as described below.

External Interface

The following guidelines apply to external interfaces:
  • The external interface must be connected to a switch between Organization’s border router and firewall.
  • The external interface may be connected to a firewall DMZ interface.
  • The external interface must not be directly connected to the Internet. The external untrusted interface should be connected to the Internet behind Organization’s border router to enable Organization to apply Access Control Lists (ACLs) to secure Organization’s Environment from unsolicited traffic.
  • The external interface must have a globally unique public IP address. Private addressing is not permitted on this interface.

Internal Interface

The following guidelines apply to internal interfaces:
  • The internal interface must be connected to a firewall DMZ interface.
  • The internal interface must not be connected to the same subnet as the external interface.
  • The internal interface must not be connected to Organization’s Internet.
  • The internal interface must have a public IP address.
  • The internal interface must not have a private IP address, unless Organization configures its firewall to use the internal interface as a transit link.

Dual-Arm Configuration

Vendor recommends a dual-arm configuration if Organization has or requires a DMZ port on its firewall. A standard dual-arm configuration is shown in the following diagram.
Description
A dual arm configuration uses both of the VPN device interfaces. The external (or untrusted) interface handles the encrypted traffic between Vendor and Organization over the Transaction Link. The internal (or trusted) interface is connected to a secure portion of Organization’s network and receives and transmits the unencrypted traffic.
The Vendor VPN is logically located between Organization’s firewall and Internet border router. A second connection is placed within a DMZ on Organization’s firewall.
Routing Path
When using a routing path for a dual-arm configuration, layer 3 connectivity to Organization’s Environment is established by directing routes towards Organization’s firewall. This can be accomplished either by using default routing or by using routing protocols for packet redirection to the firewall. A static route that redirects traffic to the Vendor VPN is placed on Organization’s firewall.
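For illustration only (the exact syntax depends on Organization’s firewall and router platforms), the routing pieces amount to something like the following, with 198.51.100.0/24 standing in for the Data Center subnet and 192.0.2.10 for the VPN device’s internal interface:
# On Organization's firewall: send Data Center traffic to the VPN's internal interface
ip route add 198.51.100.0/24 via 192.0.2.10
# On the border router: apply ACLs so that only Vendor's VPN peers can reach the VPN's external interface (platform-specific)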
Advantages
The advantages of using a routing path for a dual-arm configuration are:
  • Minimal routing changes are required.
  • All traffic is auditable by Organization using the firewall.
  • The firewall can be used for access control.
Disadvantages
The disadvantages of using a routing path for a dual-arm configuration are:
  • A DMZ on Organization’s firewall is required.

Single-Arm Configuration

A single-arm configuration may be used if Organization does not have a DMZ interface on its firewall or an available globally unique public IP address on a separate subnet. A standard single-arm configuration is shown in the following diagram.
Description
A single-arm configuration uses only the external interface for the VPN tunnel. The Vendor VPN is logically located between Organization’s firewall and Internet border router and both the encrypted and unencrypted traffic over the Transaction Link are handled by the single interface.
Routing Path
When using a routing path for a single-arm configuration, layer 3 connectivity to Organization’s Environment is established by directing routes towards Organization’s firewall. This can be accomplished by either utilizing default routing or by using routing protocols for packet redirection to the firewall. A static route that will redirect traffic to the Vendor VPN is placed on Organization’s firewall.
Advantages
The advantages of using a routing path for a single-arm configuration are:
  • Minimal routing changes are required.
  • One IP address is required for the VPN. (Note: Additional IP addresses are required to set up printing since printers require a one-to-one static NAT.)
  • One switch port is required.
  • All traffic is auditable by Organization using the firewall.
  • The firewall can be utilized for access control.
Disadvantages
The disadvantages of using a routing path for a single-arm configuration are:
  • Data flow between the firewall and the VPN is not encrypted. This weakness can be mitigated by confirming that no hosts are capable of reading traffic, the VPN equipment is in a secured location, and proper access control lists (ACLs) are created on Organization’s border router.
  • The full throughput of the VPN cannot be used, because encrypted and unencrypted traffic share the single interface.

Vendor High Availability VPN

If required, Organization may request a high availability (HA) VPN configuration involving two VPNs; additional fees apply. Vendor’s recommended high availability configuration provides Organization with two Vendor-configured VPN devices running in dual-arm mode. A standard high availability installation is shown in the following diagram.
Description
A high availability dual-arm VPN configuration uses both of the VPN device interfaces. The external (or untrusted) interface handles the encrypted traffic between Vendor and Organization over the Transaction Link. The internal (or trusted) interface is connected to a secure portion of Organization’s network and receives and transmits the unencrypted traffic.
The Vendor VPN is logically located between Organization’s firewall and Internet border router. A second connection is placed within a DMZ on Organization’s firewall.
Routing Path
When using a routing path for the high availability VPN configuration, layer 3 connectivity to Organization’s Environment is established by directing Vendor routes towards Organization’s firewall. This can be accomplished by either using default routing or by using routing protocols for packet redirection to the firewall. A static route that redirects traffic to the Vendor VPN is placed on Organization’s firewall.
Advantages
The advantages of using a routing path for a high availability dual arm VPN configuration are:
    • Minimal routing changes are required.
    • All traffic is auditable by Organization using the firewall.
    • The firewall can be utilized for access control.
Disadvantages
The disadvantages of using a routing path for a high availability dual-arm VPN configuration are:
  • A DMZ on Organization’s firewall is required.

Required Ports and Protocols

Vendor requires access to the VPN device to establish and maintain Network Connectivity through the VPN tunnel. The following table lists the ports that are required to be open between Vendor and the VPN device. During VPN installation, Vendor will provide a more detailed list that contains source and destination addresses.
The following IP ports and protocols are required for establishing an IPSec tunnel, managing the Vendor-provided network equipment, and monitoring the network link to the Organization-side VPN device.

Application Name    Port #    Protocol    Comments
SSH                 22        TCP         Network Management for Netscreen VPN
HTTPS               443       TCP         Network Management for Netscreen VPN
Ping                ICMP      IP          Monitoring and Diagnostics
Traceroute          ICMP      IP          Monitoring and Diagnostics
ISAKMP              500       UDP         VPN Tunnel
IPSEC               50        IP          VPN Tunnel
NTP                 123       UDP         Network Time Protocol for VPN device

SAN (Storage area Networking) for System Administrators



In conventional IT systems, storage devices are connected to servers by means of SCSI cables. The idea behind storage networks is that these SCSI cables are replaced by a network, which is installed in addition to the existing LAN. Server and storage devices can exchange data over this new network using the SCSI protocol.

SAN Components


1. Server Hardware – The actual machines configured to use the data from the storage devices. The major vendors for server hardware are Oracle (formerly Sun Microsystems), IBM, HP, Fujitsu, etc.
2. Storage Hardware – The actual disk arrays, from small to enterprise size. The major vendors for storage hardware are EMC, Oracle (formerly known as Sun Microsystems), IBM, HP, Dell, etc.
3. HBA – Host Bus Adapters act as the initiator in the SCSI storage subsystem; their main function is converting SCSI commands into Fibre Channel format and establishing the connection from the server side to the SAN. You can compare HBA cards with Ethernet adapters in regular networking. Major vendors are Sun, Emulex, QLogic, JNI, etc.
4. SAN Switches – These are similar to Ethernet switches, but they switch Fibre Channel networks. Major vendors are Brocade, Cisco, etc.

SAN Topology

In this exercise, we will look at the various phases of SAN configuration and walk through the practical configuration of two Solaris machines with Emulex HBAs connected to EMC storage.
From the diagram you can see two Solaris servers named “gurkul1” and “gurkul2”, two SAN switches named “SN1” and “SN2”, and an EMC device with two storage arrays, “array1” and “array2”.
The SAN switch and the storage array together comprise a fabric in the SAN.

PORTS

In regular networking, Ethernet adapters have ports that interconnect networking devices using network cables; similarly, in Storage Area Networking there are different types of ports used to interconnect servers, switches, and storage. Below are the common types of ports discussed when talking about SAN.
  • ‘N’ port: Node ports used for connecting peripheral storage devices to switch fabric or for point to point configurations
  • ‘F’ port: Fabric ports reside on switches and allow connection of storage peripherals (‘N’ port devices)
  • ‘E’ port: Expansion ports are essentially trunk ports used to connect two Fibre Channel switches
  • ‘G’ port: A generic port capable of operating as either an ‘E’ or ‘F’ port; it is also capable of acting in an ‘L’ port capacity
  • ‘L’ port: Loop ports are used in arbitrated loop configurations to build storage peripheral networks without FC switches. These ports often also have ‘N’ port capabilities and are called ‘NL’ ports
An important point to remember here: end-to-end connections are managed by ‘N’ ports, while switching and addressing are handled by fabric ports.

SAN Device Identification

World-wide name (WWN): Similar to an Ethernet address in the traditional networking world, this is a unique name assigned by the manufacturer to the HBA. This name is then used to grant and restrict access to other components of the SAN. There are two basic forms of WWN.
1. World Wide Node Name (WWNN) – Assigned to a Fibre Channel node device by the vendor
2. World Wide Port Name (WWPN) – Assigned to a Fibre Channel Host Bus Adapter port by the vendor
Normally the HBA vendor will provide a tool that allows the system administrator to query the WWN of the HBA (e.g. Emulex supplies the lputil application). The world-wide names of the Symmetrix fibre adapters (FAs) can be obtained from EMC; in addition to the WWPN, for EMC we should also note the SCSI target ID assigned to each specific FA in order to complete the configuration on the Solaris side.
For host bus adapters, the world-wide name is typically displayed in the /var/adm/messages file after the fibre card and software driver have been installed. An example for an Emulex PCI fibre HBA is as follows. In this case, the relevant value is the WWPN, the World-Wide Port Name.
May 18 09:26:28 gurkul1 lpfc: [ID 242157 kern.info] NOTICE:
lpfc0:031:Link Up Event received Data: 1 1 0 0
May 18 09:26:31 gurkul1 lpfc: [ID 129691 kern.notice] NOTICE:
lpfc0: Firmware Rev 3.20 (D2D3.20X3)
May 18 09:26:31 gurkul1 lpfc: [ID 664688 kern.notice] NOTICE:
lpfc0: WWPN:11:11:11:11:11:11:11:01
WWNN:20:22:22:22:22:22:22:22 DID 0x210913
May 18 09:26:31 gurkul1 lpfc: [ID 494464 kern.info] NOTICE:
lpfc1:031:Link Up Event received Data: 1 1 0 0
May 18 09:26:34 gurkul1 lpfc: [ID 129691 kern.notice] NOTICE:
lpfc1: Firmware Rev 3.20 (D2D3.20X3)
May 18 09:26:34 gurkul1 lpfc: [ID 664688 kern.notice] NOTICE:
lpfc1: WWPN:11:11:11:11:11:11:11:02
WWNN:20:22:22:22:22:22:22:02 DID 0x210913
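If the messages file has rotated, the same values can usually be pulled back out with a quick grep, or read from the HBA vendor’s utility (lputil in the Emulex case mentioned above):
grep -i wwpn /var/adm/messages
# or run the vendor utility (e.g. lputil) and list the adapters to read off their WWPNs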
For the purpose of our configuration example, here is the list of WWPNs that we are going to use (we should collect this information before configuring the server/storage).
Host name    Port    SCSI target    World Wide Name
gurkul1      hba1    -              1111111111111101
gurkul1      hba2    -              1111111111111102
gurkul2      hba1    -              2121212121212101
gurkul2      hba2    -              2121212121212102
EMC          FA1A    target 11      555555555555551a
EMC          FA1B    target 12      555555555555551b
EMC          FA2A    target 21      555555555555552a
EMC          FA2B    target 22      555555555555552b



SAN Connections
The diagram below illustrates several principles common to most SAN topologies:
  • There are two separate switches (“SN1” and “SN2”). Each switch therefore comprises an independent “fabric”.
  • The array provides multiple I/O paths to every disk device. In the diagram, for example, the disk device “Disk01” is accessible via both FA1A and FA2A. Depending on the array, the two paths may be simultaneously active (active/active) or only one path may be valid at a time (active/passive).
  • Each host has a connection into each fabric. Host-based software load-balances the traffic between the two links if the array supports active/active disk access. Furthermore, the host should adjust to a single link failure by rerouting all traffic along the surviving path.

Storage PATH for Gurkul1 Server
Take a moment to examine the diagram below and consider the host “gurkul1”. Assuming that it requires highly available access to disk Disk02 in Array2, note that there are two separate paths to that device.
Path1 : HBA1–> SN1 (P1) –> SN1 (P5) –> FA1B –> Array2 –> Disk02
Path2 : HBA2–> SN2 (P1) –> SN2 (P5) –> FA2B –> Array2 –> Disk02

Storage PATH for Gurkul2 Server

In a similar way to gurkul1, we can examine the diagram to identify the storage paths available for “gurkul2”, assuming that it requires highly available access to disk Disk01 in Array1. Again, there are two separate paths to that device.
Path1 : HBA1 –> SN1 (P3) –> SN1 (P4) –> FA1A –> Array1 –> Disk01
Path2 : HBA2 –> SN2 (P3) –> SN2 (P4) –> FA2A –> Array1 –> Disk01

Target configuration that we want to achieve

  • For gurkul1: access to LUN02 (Disk02) through LUN05 (Disk05) on storage array2, via FA1B and FA2B
  • For gurkul2: access to LUN01 (Disk01) through LUN04 (Disk04) on storage array1, via FA1A and FA2A
  • Both servers require access to Disk00 (LUN00) on each adapter to which they are connected

Solaris 8/9 host side configuration

Typically, there are two configuration files that need to be updated once the vendor’s HBA software has been installed. The HBA driver’s configuration file typically resides in the /kernel/drv directory and must be updated to support persistent binding and any other configuration requirements specified by the array vendor. Secondly, the Solaris “sd” driver configuration file, sd.conf, must be updated to tell the operating system to scan for more than the default list of SCSI disk devices. The examples below describe the process for configuring Emulex cards to support an EMC Symmetrix array.

1. Configuring /kernel/drv/lpfc.conf

A. To configure the gurkul1 server with the storage paths below:
Path1 : HBA1 (lpfc0) –> SN1 (P1) –> SN1 (P5) –> FA1B (target 12) –> Array2 –> Disk02
Path2 : HBA2 (lpfc1) –> SN2 (P1) –> SN2 (P5) –> FA2B (target 22) –> Array2 –> Disk02
We have to add the following entries to /kernel/drv/lpfc.conf:
fcp-bind-WWPN="555555555555551B:lpfc0t12",
"555555555555552B:lpfc1t22";
B. To configure the gurkul2 server with the storage paths below:
Path1 : HBA1 (lpfc0) –> SN1 (P3) –> SN1 (P4) –> FA1A (target 11) –> Array1 –> Disk01
Path2 : HBA2 (lpfc1) –> SN2 (P3) –> SN2 (P4) –> FA2A (target 21) –> Array1 –> Disk01
We have to add the following entries to /kernel/drv/lpfc.conf:
fcp-bind-WWPN="555555555555551A:lpfc0t11",
"555555555555552A:lpfc1t21";
2. Configuring /kernel/drv/sd.conf   

By default, the Solaris server will scan for a limited number of SCSI devices. The administrator has to update the /kernel/drv/sd.conf file to tell the sd driver to scan for a broader range of SCSI devices. In both cases, the target number associated with the WWPN of the fibre array adapter is arbitrary. In our case, we’ve assigned SCSI targets 11, 12, 21, and 22 to the four array adapters. The following list describes the additions to the /kernel/drv/sd.conf file for each of the two hosts:


A. Gurkul1 Server:
# Entries added for host gurkul1 to "see" LUNs 0, 2, 3, 4, 5 on FA1B and FA2B with targets 12 and 22
# FA1B = 555555555555551B
name="sd" target=12 lun=0 hba="lpfc0" wwn="555555555555551B";
name="sd" target=12 lun=2 hba="lpfc0" wwn="555555555555551B";
name="sd" target=12 lun=3 hba="lpfc0" wwn="555555555555551B";
name="sd" target=12 lun=4 hba="lpfc0" wwn="555555555555551B";
name="sd" target=12 lun=5 hba="lpfc0" wwn="555555555555551B";
# FA2B = 555555555555552B
name="sd" target=22 lun=0 hba="lpfc1" wwn="555555555555552B";
name="sd" target=22 lun=2 hba="lpfc1" wwn="555555555555552B";
name="sd" target=22 lun=3 hba="lpfc1" wwn="555555555555552B";
name="sd" target=22 lun=4 hba="lpfc1" wwn="555555555555552B";
name="sd" target=22 lun=5 hba="lpfc1" wwn="555555555555552B";
B. Gurkul2 Server:
# Entries added for host gurkul2 to "see" LUNs 0, 1, 2, 3, 4 on FA1A and FA2A with targets 11 and 21
# FA1A = 555555555555551A
name="sd" target=11 lun=0 hba="lpfc0" wwn="555555555555551A";
name="sd" target=11 lun=1 hba="lpfc0" wwn="555555555555551A";
name="sd" target=11 lun=2 hba="lpfc0" wwn="555555555555551A";
name="sd" target=11 lun=3 hba="lpfc0" wwn="555555555555551A";
name="sd" target=11 lun=4 hba="lpfc0" wwn="555555555555551A";
# FA2A = 555555555555552A
name="sd" target=21 lun=0 hba="lpfc1" wwn="555555555555552A";
name="sd" target=21 lun=1 hba="lpfc1" wwn="555555555555552A";
name="sd" target=21 lun=2 hba="lpfc1" wwn="555555555555552A";
name="sd" target=21 lun=3 hba="lpfc1" wwn="555555555555552A";
name="sd" target=21 lun=4 hba="lpfc1" wwn="555555555555552A";

3. Update the /etc/system file as per EMC’s requirements for the Symmetrix.
 

set sd:sd_max_throttle=20
set scsi_options = 0x7F8
* Is PowerPath installed?
*   No:  set sd:sd_io_time=0x78
*   Yes: set sd:sd_io_time=0x3C
set sd:sd_io_time=0x3C
4. Perform the Final Reconfiguration Reboot
Perform a reconfiguration reboot (e.g. "reboot -- -r") on both servers.
After the reboot, check the output of:
# format
You should see the desired disks. Put a Sun label on them via the “format” command and the configuration is complete.
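A quick post-reboot check that the new LUNs were actually discovered might look like this (standard Solaris commands; output will vary with the driver stack):
devfsadm -C        # clean up any stale device links
echo | format      # the new LUNs should now appear as additional disks
luxadm probe       # may also list the fabric-attached devices, depending on the driver stack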

SAN Storage Operations for Enterprise System Administrators


Enterprise Unix system administrators deal with two challenges every day: the first is providing sufficient computing resources for the business applications, and the second is providing sufficient storage for them.
We will discuss managing computing resources (CPU and memory) in a different post; in this post we will focus on the SAN storage requests that an enterprise system administrator has to handle in everyday life.
Why SAN?
Because it is now the common trend for most organisations to use SAN (Storage Area Networking) to deal with on-demand storage requests.
Note: I am using the word “Enterprise” just to give you an idea that the environment is composed of heterogeneous technologies in server hardware (HP / Sun / IBM), operating systems (Solaris / Linux / Windows) and applications (databases / middleware / third party).

Types of Storage Requests related to SAN

1. New SAN Storage Allocation for newly build Servers
2. Additional SAN storage allocation for existing Servers
3. Unused Storage Reclaim for existing Servers
4. Storage migration from one SAN storage device to another

1. New SAN Storage Allocation for Newly Build Servers

These requests usually arise after a new server has been installed in the environment with an operating system, and before the applications/database are configured on that server.
Steps Involved in Configuring SAN Storage to New Server
1. Finding Storage Requirement
    • Size of the Storage Required
    • Type of redundancy required – e.g. RAID 5 or RAID 1+0
Note: Responsible Person – System Administrator
2. Install Server with SAN Storage Stack
    • Installing HBA card drivers – QLogic, Emulex, Sun native
    • Installing multipathing software – EMC PowerPath, STMS, MPxIO, SAS, etc.
    • Disk management / volume management
Note: Responsible Person – System Administrator
3. SAN Storage Allocation
    • Zoning
    • Lun Masking
    • Providing LUN number to System Administrators for Server Side Configuration
Note: Responsible Person – Storage Administrator
4. Recognize and Configure Storage from the Server Operating System
    • Detect and configure the new storage LUNs from the operating system – either dynamically or with a reconfiguration reboot
    • Configure the server to use all available fibre paths to the SAN storage
    • Label the LUN devices in the local operating system
    • Volume management on the new storage LUNs – using SVM or VxVM
    • Creating filesystems on the new storage LUNs (Note: some database applications, such as Sybase, don’t require filesystems on all storage devices; they use the raw devices directly for database operations)
    • Creating mount points for the new filesystems and configuring them to mount automatically at boot (a command-level sketch of this step follows below)
Note: Responsible Person – System Administrator
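A minimal Solaris-side sketch of step 4 above, assuming VxVM and VxFS and using hypothetical device, disk group, and volume names:
cfgadm -al                                  # check attachment points / fabric devices
devfsadm                                    # create device links for the new LUNs
echo | format                               # verify the LUNs are visible and label them
vxdisksetup -i c2t11d2                      # initialise a LUN for VxVM (hypothetical device)
vxdg init appdg appdg01=c2t11d2             # create a disk group on it
vxassist -g appdg make appvol 50g           # create a volume
mkfs -F vxfs /dev/vx/rdsk/appdg/appvol      # create a VxFS filesystem
mkdir -p /app/data
mount -F vxfs /dev/vx/dsk/appdg/appvol /app/data
# add a matching /etc/vfstab entry so the filesystem mounts at boot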

There are different procedures involved in this step depending on the storage device, operating system, and volume manager. I am not covering that part here, but will be posting the actual implementation steps for different scenarios in upcoming posts.

Below is a diagram giving a complete picture of how SAN allocation happens for a server, and how the storage is accessed and configured from the server side.

2. Additional SAN Storage Allocation for Existing Servers

These requests usually arise whenever an existing server, already configured with storage, has reached its storage threshold (for example, 80% of the total allocated storage is utilized).
Steps Involved in Configuring Additional SAN Storage to Existing Server
1. SAN Storage Allocation
  • Zoning
  • Lun Masking
  • Providing LUN number to System Administrators for Server Side Configuration
Note: Responsible Person – Storage Administrator
2. Recognize and Configure storage from the Server Operating System
  • Detect and Configure New Storage LUNs from the Operating system – either dynamically or with reconfiguration reboot.
  • Configure Server to use all available fiber paths to the SAN Storage
  • Label the LUN devices in the local operating system
  • Volume management on the new storage LUNs – using SVM or VxVM
  • Creating filesystems on the new storage LUNs (Note: some database applications, such as Sybase, don’t require filesystems on all storage devices; they use the raw devices directly for database operations)
  • Creating mount points for the new filesystems and configuring them to mount automatically at boot
Note: Responsible Person – System Administrator

3. Reclaiming Unused Storage from Existing Servers

There are two situations that initiate storage reclaim requests: the first is server decommission (the server is going to be removed from the network), and the second is when a specific application on the server that was using external storage has been decommissioned but the server is still alive on the network and supporting other applications.
Steps involved in Reclaiming Storage

1. Unconfigure Storage from the server
  • Unmount all related filesystems residing on top of the storage that is going to be reclaimed
  • Detach the storage disks from any existing volumes
  • Clear the device identification (for example, Veritas disk unsetup using vxdiskunsetup)
2. Reclaim Disks from the Storage Side
  • Zoning changes
  • Changes to LUN masking
3. Clean up Server dynamic Configuration entries with reconfiguration reboot of the server
  • Reconfigure the storage multipathing software to clean up the entries for the storage that was reclaimed
  • Server operating systems that use dynamic reconfiguration during the initial detection of the storage disks keep that information in the operating system configuration, and they may report storage errors (device not found) once the storage has been completely removed from the server. Hence it is always recommended to perform a reconfiguration reboot after the storage reclaim, so that the operating system can adjust its configuration to the remaining storage connections.
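A hedged sketch of the server-side reclaim sequence, again assuming VxVM and the hypothetical names used earlier:
umount /app/data                    # unmount filesystems on the storage being reclaimed
# remove the corresponding /etc/vfstab entry
vxvol -g appdg stopall              # stop the volumes in the disk group
vxdg deport appdg                   # release the disk group (or remove just the affected disks)
vxdiskunsetup -C c2t11d2            # clear the VxVM identification, as noted above
# after the storage team adjusts zoning and LUN masking:
devfsadm -C                         # clean up stale device links
reboot -- -r                        # reconfiguration reboot to rebuild the device tree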

4. Migrating Storage from One Storage Device to Another Device

These requests normally arise whenever the existing storage is not able to meet the ongoing storage demand, or the existing storage has reached end-of-life support from its vendor. They can also arise if the organisation plans to move from more expensive storage to less expensive storage.
There are two types of migration technique in practice nowadays:
1. Host-Based Storage Migration
In host-based storage migration, it is the system administrator’s responsibility to replicate the existing data from the old storage to the new storage once the storage team has allocated the new storage to the system.
During this migration the operating system sees both the new storage LUNs and the old storage LUNs simultaneously, for the period in which the data replication happens from the operating system side.
Once the replication completes, the system administrator follows the storage reclaim procedure above to release the old storage LUNs from the operating system.
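One common host-based approach is volume-manager mirroring, sketched here with VxVM and hypothetical names (other options, such as file-level copies, work too):
vxdg -g appdg adddisk appdg_new01=c3t21d5    # add a LUN from the new array to the disk group
vxassist -g appdg mirror appvol appdg_new01  # mirror the volume onto the new storage
# once the mirror has synchronised, remove the plex that sits on the old storage
vxplex -g appdg -o rm dis appvol-01          # appvol-01 = old-storage plex (confirm with vxprint)
# then reclaim the old LUNs using the procedure in section 3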
2. Storage-Side Migration
In storage-side migration, data replication from the old storage to the new storage is taken care of by the storage team. Once the data is replicated, the system administrator performs a reconfiguration reboot of the server so that the operating system recognizes the disks on the new storage devices over the same existing storage paths.

Beginners Lesson – Veritas Cluster Services for System Admin

The purpose of this post is to make the cluster concept easy for those young brothers who have just started their careers as System Administrators. While writing this post I have only one thing in mind, i.e. explain the entire cluster concept with minimum usage of technical jargon and make it as simple as possible. That's all for the introduction; let us go to the actual lesson.
In any organisation, every server in the network has a specific purpose in terms of its usage, and most of the time these servers are used to provide a stable environment to run the software applications required for the organisation's business. Usually these applications are very critical for the business, and organisations cannot afford to have them down even for minutes. For example: a bank has an application which takes care of its internet banking.
In the figure below you can see an application running on a standalone server configured with a Unix operating system and a database (Oracle / Sybase / DB2 / MSSQL, etc.). The organisation chose to run it as a standalone application because it was not critical in terms of business; in other words, whenever the application is down it won't impact the actual business.
Usually, the application clients for such an application connect to the application server using the server name, the server IP, or a specific application IP.

Standalone Application Server
Let us assume the organisation has an application which is very critical for its business, and any impact to the application will cause a huge loss to the organisation. In that case, the organisation has one option to reduce the impact of an application failure caused by an operating system or hardware failure: purchase a secondary server with the same hardware configuration, install the same kind of OS and database, and configure it with the same application in passive mode, then "fail over" the application from the primary server to the secondary server whenever there is an issue with the underlying hardware or operating system of the primary server.

Application Server with Highly Available Configuration
What is failover?
Whenever there is an issue with the primary server that makes the application unavailable to the client machines, the application should be moved to another available server in the network, either by manual or by automatic intervention. Transferring the application from the primary server to the secondary server, and making the secondary server active for the application, is called a "failover" operation. The reverse operation (i.e. restoring the application on the primary server) is called "failback".
Now we can call this configuration an application HA (Highly Available) setup, compared to the earlier standalone setup. Do you agree with me?
Now the question is: how does this manual failover work when there is an application issue due to hardware or the operating system?
Manual failover basically involves the steps below (a command-level sketch follows this list):
  1. The application IP should fail over to the secondary node.
  2. The same storage and data should be available on the secondary node.
  3. Finally, the application itself should fail over to the secondary node.
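Purely as an illustration of what those three steps look like when done by hand on a Solaris secondary server (all interface names, addresses, devices, and scripts are hypothetical):
ifconfig e1000g0 addif 192.168.10.50 netmask 255.255.255.0 up   # 1. bring up the application IP
vxdg import appdg                                               # 2. import the shared disk group
vxvol -g appdg startall                                         #    start its volumes
mount -F vxfs /dev/vx/dsk/appdg/appvol /app                     #    mount the application data
/app/bin/appctl start                                           # 3. start the application (hypothetical script)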

Application Server failover to Secondary Server

Challenges in Manual Failover Configuration

  1. Resources must be monitored continuously.
  2. It is time consuming.
  3. It is technically complex when the application involves many dependent components.
Then, what is the alternative?
Go for automated failover software which groups the primary and secondary servers related to the application, always keeps an eye on the primary server for failures, and fails the application over to the secondary server automatically whenever there is an issue with the primary server.
Although we have two different servers supporting the application, both of them actually serve the same purpose, and from the application client's perspective they should be treated as a single application cluster server (composed of multiple physical servers in the background).
Wow… a cluster.
Now you know that a cluster is nothing but a "group of individual servers working together to serve the same purpose, and appearing as a single machine to the external world".
What cluster software is available in the market today? There is plenty, depending on the operating system and the application to be supported. Some of it is native to the operating system, and some comes from third-party vendors.
List of Cluster Software available in the market
  • SUN Cluster Services – Native Solaris Cluster
  • Linux Cluster Server – Native Linux cluster
  • Oracle RAC – Application level cluster for Oracle database that works on different Operating Systems
  • Veritas Cluster Services – third-party cluster software that works on different operating systems such as Solaris / Linux / AIX / HP-UX
  • HACMP – IBM AIX based cluster technology
  • HP-UX native cluster technology
In this post we are actually discussing VCS and its operations. This post is not going to cover the actual implementation or detailed command syntax of VCS, but it will cover the concept of how VCS makes an application Highly Available (HA).
Note: So far I have managed to explain the concept without using much complex terminology, but now it's time to introduce some new VCS terminology that we use in everyday VCS operations. Just keep a little more focus on each new term.

VCS Components

VCS has two types of components: 1. Physical components 2. Logical components

Physical Components:

1. Nodes
VCS nodes host the service groups (managed applications). Each system is connected to networking hardware, and usually also to storage hardware. The systems contain components to provide resilient management of the applications, and start and stop agents.
Nodes can be individual systems, or they can be created with domains or partitions on enterprise-class systems. Individual cluster nodes each run their own operating system and possess their own boot device. Each node must run the same operating system within a single VCS cluster.
Clusters can have from 1 to 32 nodes. Applications can be configured to run on specific nodes within the cluster.
2. Shared storage
Storage is a key resource of most applications services, and therefore most service groups. A managed application can only be started on a system that has access to its associated data files. Therefore, a service group can only run on all systems in the cluster if the storage is shared across all systems. In many configurations, a storage area network (SAN) provides this requirement.
You can use I/O fencing technology for data protection. I/O fencing blocks access to shared storage from any system that is not a current and verified member of the cluster.
3. Networking Components
Networking in the cluster is used for the following purposes:
  • Communications between the cluster nodes and the Application Clients and external systems.
  • Communications between the cluster nodes, called Heartbeat network.
Logical Components
1. Resources
Resources are hardware or software entities that make up the application. Resources include disk groups and file systems, network interface cards (NIC), IP addresses, and applications.
1.1. Resource dependencies
Resource dependencies indicate resources that depend on each other because of application or operating system requirements. Resource dependencies are graphically depicted in a hierarchy, also called a tree, where the resources higher up (parent) depend on the resources lower down (child).
1.2. Resource types
VCS defines a resource type for each resource it manages. For example, the NIC resource type can be configured to manage network interface cards. Similarly, all IP addresses can be configured using the IP resource type.
VCS includes a set of predefined resources types. For each resource type, VCS has a corresponding agent, which provides the logic to control resources.
2. Service groups
A service group is a virtual container that contains all the hardware and software resources that are required to run the managed application. Service groups allow VCS to control all the hardware and software resources of the managed application as a single unit. When a failover occurs, resources do not fail over individually— the entire service group fails over. If there is more than one service group on a system, a group may fail over without affecting the others.
A single node may host any number of service groups, each providing a discrete service to networked clients. If the server crashes, all service groups on that node must be failed over elsewhere.
Service groups can be dependent on each other. For example a finance application may be dependent on a database application. Because the managed application consists of all components that are required to provide the service, service group dependencies create more complex managed applications. When you use service group dependencies, the managed application is the entire dependency tree.
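For reference, service groups are operated on as a unit with the hagrp command; a few typical invocations (group and system names are hypothetical):
hagrp -state app_sg                 # show the group's state on each system
hagrp -online app_sg -sys node1     # bring the whole group online on node1
hagrp -switch app_sg -to node2      # fail the group over to node2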
2.1. Types of service groups
VCS service groups fall in three main categories: failover, parallel, and hybrid.
  • Failover service groups
A failover service group runs on one system in the cluster at a time. Failover groups are used for most applications that do not support multiple systems to simultaneously access the application’s data.
  • Parallel service groups
A parallel service group runs simultaneously on more than one system in the cluster. A parallel service group is more complex than a failover group. Parallel service groups are appropriate for applications that manage multiple application instances running simultaneously without data corruption.
  • Hybrid service groups
A hybrid service group is for replicated data clusters and is a combination of the failover and parallel service groups. It behaves as a failover group within a system zone and a parallel group across system zones.
3. VCS Agents
Agents are multi-threaded processes that provide the logic to manage resources. VCS has one agent per resource type. The agent monitors all resources of that type; for example, a single IP agent manages all IP resources.
When the agent is started, it obtains the necessary configuration information from VCS. It then periodically monitors the resources, and updates VCS with the resource status.
4. Cluster Communications and VCS Daemons
Cluster communications ensure that VCS is continuously aware of the status of each system’s service groups and resources. They also enable VCS to recognize which systems are active members of the cluster, which have joined or left the cluster, and which have failed.
4.1. High availability daemon (HAD)
The VCS high availability daemon (HAD) runs on each system. Also known as the VCS engine, HAD is responsible for:
    • building the running cluster configuration from the configuration files
    • distributing the information when new nodes join the cluster
    • responding to operator input
    • taking corrective action when something fails.
The engine uses agents to monitor and manage resources. It collects information about resource states from the agents on the local system and forwards it to all cluster members. The local engine also receives information from the other cluster members to update its view of the cluster.
The hashadow process monitors HAD and restarts it when required.
4.2. HostMonitor daemon
VCS also starts HostMonitor daemon when the VCS engine comes up. The VCS engine creates a VCS resource VCShm of type HostMonitor and a VCShmg service group. The VCS engine does not add these objects to the main.cf file. Do not modify or delete these components of VCS. VCS uses the HostMonitor daemon to monitor the resource utilization of CPU and Swap. VCS reports to the engine log if the resources cross the threshold limits that are defined for the resources.
4.3. Group Membership Services/Atomic Broadcast (GAB)
The Group Membership Services/Atomic Broadcast protocol (GAB) is responsible for cluster membership and cluster communications.
  • Cluster Membership
GAB maintains cluster membership by receiving input on the status of the heartbeat from each node by LLT. When a system no longer receives heartbeats from a peer, it marks the peer as DOWN and excludes the peer from the cluster. In VCS, memberships are sets of systems participating in the cluster.
  • Cluster Communications
GAB’s second function is reliable cluster communications. GAB provides guaranteed delivery of point-to-point and broadcast messages to all nodes. The VCS engine uses a private IOCTL (provided by GAB) to tell GAB that it is alive.
4.4. Low Latency Transport (LLT)
VCS uses private network communications between cluster nodes for cluster maintenance. Symantec recommends two independent networks between all cluster nodes. These networks provide the required redundancy in the communication path and enable VCS to discriminate between a network failure and a system failure. LLT has two major functions.
  • Traffic Distribution
LLT distributes (load balances) internode communication across all available private network links. This distribution means that all cluster communications are evenly distributed across all private network links (maximum eight) for performance and fault resilience. If a link fails, traffic is redirected to the remaining links.
  • Heartbeat
LLT is responsible for sending and receiving heartbeat traffic over network links. The Group Membership Services function of GAB uses this heartbeat to determine cluster membership.
4.5. I/O fencing module
The I/O fencing module implements a quorum-type functionality to ensure that only one cluster survives a split of the private network. I/O fencing also provides the ability to perform SCSI-3 persistent reservations on failover. The shared disk groups offer complete protection against data corruption by nodes that are assumed to be excluded from cluster membership.
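In day-to-day operations, the health of these communication layers is usually checked with the standard VCS utilities:
lltstat -nvv       # state of the LLT links to each node
gabconfig -a       # GAB port membership (port a = GAB, port h = HAD)
hastatus -sum      # summary of systems, service groups, and resources
vxfenadm -d        # I/O fencing mode and membership, if fencing is configured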
5. VCS Configuration files.
5.1. main.cf
/etc/VRTSvcs/conf/config/main.cf is the key file in terms of VCS configuration. The main.cf file basically describes the following information to the VCS agents and daemons (an illustrative fragment follows this list):
  • What are the Nodes available in the Cluster?
  • What are the Service Groups Configured for each node?
  • What are the resources available in each service group, their resource types, and their attributes?
  • What are the dependencies each resource having on other resources?
  • What are the dependencies each service group having on other Service Groups?
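For illustration only, a stripped-down main.cf fragment (all names are hypothetical) showing a two-node cluster with one failover service group and its resource dependencies:
include "types.cf"
cluster demo_cluster ( )
system node1 ( )
system node2 ( )
group app_sg (
        SystemList = { node1 = 0, node2 = 1 }
        AutoStartList = { node1 }
        )
        DiskGroup app_dg (
                DiskGroup = appdg
                )
        Mount app_mnt (
                MountPoint = "/app"
                BlockDevice = "/dev/vx/dsk/appdg/appvol"
                FSType = vxfs
                FsckOpt = "-y"
                )
        NIC app_nic (
                Device = e1000g0
                )
        IP app_ip (
                Device = e1000g0
                Address = "192.168.10.50"
                NetMask = "255.255.255.0"
                )
        app_mnt requires app_dg
        app_ip requires app_nic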
5.2. types.cf
The file types.cf, which is listed in the include statement in the main.cf file, defines the VCS bundled types for VCS resources. The file types.cf is also located in the folder /etc/VRTSvcs/conf/config.
5.3. Other Important files
  • /etc/llthosts—lists all the nodes in the cluster
  • /etc/llttab—describes the local system’s private network links to the other nodes in the cluster
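Illustrative contents for a two-node cluster (node names, cluster ID, and interface names are hypothetical):
# /etc/llthosts – cluster node IDs and names
0 node1
1 node2
# /etc/llttab on node1 – node identity, cluster ID, and the two private links
set-node node1
set-cluster 101
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -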

Sample VCS Setup

From the figure below you can understand a sample VCS setup configured for an application running with a database and shared storage.
Why do we need shared storage for clusters?
Normally, database servers are configured to store their databases on SAN storage, and that storage must be reachable from all the other nodes in the cluster in order to fail the database over from one node to another. That is why both nodes in the figure below are configured with common shared SAN storage; in this model all the cluster nodes can see the storage devices from their local operating systems, but at any given time only one node (the active one) can perform write operations to the storage.
Why does each server need two storage paths (connected to two HBAs)?
To provide redundancy for the server's storage connection and to avoid a single point of failure in the storage connection. Whenever you notice multiple storage paths connected to a server, you can safely assume that some storage multipathing software is running on the operating system, e.g. multipathd, EMC PowerPath, HDLM, MPIO, etc.
Why does each server need two connections to the physical network?
This is, again, to provide redundancy for the server's network connection and to avoid a single point of failure in the server's physical network connectivity. Whenever you see dual physical network connections, you can assume the server is using some kind of IP multipathing software to manage the dual paths, e.g. IPMP in Solaris, NIC bonding in Linux, etc.
Why do we need a minimum of two heartbeat connections between the cluster nodes?
When VCS has lost all of its heartbeat connections except the last one, the condition is called cluster jeopardy. When the cluster is in the jeopardy state, either of the following can happen:
1) The loss of the last available interconnect link
In this case, the cluster cannot reliably determine whether the last interconnect link was lost or the system itself was lost, so the cluster forms a network partition, causing two or more mini-clusters to be formed depending on the actual network partition. At this time, every service group that is not online on its own mini-cluster, but may be online on the other mini-cluster, is marked as "autodisabled" for that mini-cluster until the interconnect links start communicating normally again.
2) The loss of an existing system which is currently in the jeopardy state due to a problem
In this case, the situation is exactly the same as explained in case 1, forming two or more mini-clusters.
In the case where both LLT interconnect links disconnect at the same time and no low-pri links are configured, the cluster cannot reliably identify whether it is the interconnects that have disconnected, and will assume that the other system is down and unavailable. In this scenario the cluster treats it like a system fault, and the service groups are brought online on each mini-cluster according to the AutoStartList defined on each service group. This may lead to possible data corruption, with applications writing to the same underlying data on storage from different systems at the same time. This scenario is well known as the "split-brain condition".

Typical VCS Setup for an application with Database

That is all for this introduction to VCS; please stay tuned for the next posts, where I am going to discuss the actual administration of VCS.

Please don’t forget to drop your comments and inputs in the comment section.
Have Happy System Administration!!!!!!