Wednesday, September 14, 2011

Linux Virtual Server: Load Balance Your Networked Services


With the increasing workload on web and mail servers, building a scalable server on a cluster of computers has proven to be an effective solution. The Linux Virtual Server (LVS) is one such highly scalable and highly available server built on a cluster of real servers. It is extremely fast and can load balance services such as web, mail and media, using an advanced load balancer running on the Linux operating system.
In this article, I will explain some LVS terms and the types of LVS clusters commonly employed. I have also included a simple LVS-DR cluster configuration for better understanding.
Let's see what an IP Virtual Server is:
IPVS is an aggregation of kernel patches that were consolidated into the stock version of the Linux kernel starting with version 2.4.23.
The IPVS enabled kernel lets you convert any computer running Linux to a cluster load balancer. The IPVS enabled cluster load balancer and the cluster nodes are jointly called a LINUX VIRTUAL SERVER (LVS).
LVS DIRECTOR: The LVS load balancer receives all incoming client requests for services and decides which cluster node should reply to each request. The load balancer is called an LVS Director or simply a Director.
REAL SERVERS: The nodes inside an LVS cluster are called real servers and the computers that connect to the cluster to request its services are called client computers.

LVS IP address naming conventions

Virtual IP address (VIP): The IP address the Director uses to provide services to client computers.
Real IP address (RIP): The IP address used on the cluster nodes.
Director's IP address (DIP): The IP address used by the Director to connect to the RIP network.
Client computer's IP address (CIP): The IP address assigned to a client computer, which it uses as the source IP address for requests sent to the cluster.

Types of LVS clusters

LVS-NAT: In an LVS-NAT configuration, the Director uses the Linux kernel's ability to change network IP addresses and ports as packets pass through the kernel.
Working: A request for a cluster service is accepted by the Director on its VIP. The Director forwards this request to a cluster node on its RIP. The cluster node then replies by sending the packet back through the Director, so the Director can translate the cluster node's RIP address back into the VIP it owns. To client computers outside the cluster, all packets thus appear to be sent and received from a single IP address (the VIP).
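As a rough sketch of how this is set up by hand (the addresses below are only examples, not from this article's cluster), an LVS-NAT virtual service can be created with ipvsadm, where -m selects masquerading (NAT) forwarding:

/sbin/ipvsadm -A -t 10.0.0.1:80 -s rr                      # add a virtual service on the VIP, round-robin
/sbin/ipvsadm -a -t 10.0.0.1:80 -r 192.168.10.2:80 -m -w 1 # add a real server with NAT forwarding, weight 1

With LVS-NAT, the real servers must also use the Director as their default gateway so that the replies pass back through it.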
LVS-DR: In an LVS-DR configuration, the Director forwards all incoming requests to the nodes inside the cluster, but the nodes inside the cluster send their replies directly back to the client computers.
Working: The request from the client computer (CIP) is sent to the Director's VIP. The Director forwards the request to a cluster node (real server) with the same VIP as the destination IP address. The cluster node then sends a reply packet directly to the client computer, using the VIP as its source IP address. The client computer is thus fooled: it thinks it is talking to a single computer, while in reality it sends request packets to one computer and receives reply packets from another.
LVS-TUN: IP tunneling can be used to forward packets from one subnet or virtual LAN (VLAN) to another, even when the packets must pass through another network or the Internet. Building on the IP tunneling capability that is part of the Linux kernel, the LVS-TUN forwarding method allows you to place cluster nodes on a cluster network that is not on the same network segment as the Director.
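The same ipvsadm sketch applies to the other forwarding methods; only the forwarding flag on the real-server line changes (again, example addresses only):

/sbin/ipvsadm -a -t 10.0.0.1:80 -r 192.168.10.2:80 -g -w 1 # -g: gatewaying, i.e. LVS-DR direct routing (the default)
/sbin/ipvsadm -a -t 10.0.0.1:80 -r 192.168.10.2:80 -i -w 1 # -i: IP-in-IP tunneling, i.e. LVS-TUN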

LVS Scheduling Methods:

When the Director receives an incoming request from a client computer to access a cluster service on its VIP, it has to decide which cluster node should get the request. The scheduling methods the Director uses to make this decision fall into two basic categories: fixed scheduling and dynamic scheduling.
DYNAMIC Scheduling Methods: Dynamic scheduling methods give you more control over the incoming workload, with little or no penalty, since they only require a small amount of extra processing load on the Director. When dynamic scheduling methods are used, the director keeps track of the number of active and inactive connections for each cluster node and uses this information to determine which cluster node to use when a new request arrives for a cluster service. An active connection is a TCP network session that remains open while the client computer and cluster node are sending data to each other.
The main dynamic scheduling methods are listed below; the corresponding ipvsadm scheduler names are shown in the sketch after the list.
- Least-connection
- Weighted least-connection
- Shortest expected delay
- Never queue
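As a rough illustration (the VIP is only an example, and only one -s value is used for a given virtual service), these correspond to the following ipvsadm scheduler names:

/sbin/ipvsadm -A -t 10.0.0.1:80 -s lc    # least-connection
/sbin/ipvsadm -A -t 10.0.0.1:80 -s wlc   # weighted least-connection
/sbin/ipvsadm -A -t 10.0.0.1:80 -s sed   # shortest expected delay
/sbin/ipvsadm -A -t 10.0.0.1:80 -s nq    # never queue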

More on LVS-DR CLUSTER

The LVS-DR cluster is made possible by configuring all nodes in the cluster and the Director with the same VIP address; despite this common address, the client computers send their packets only to the Director, because the real servers are configured not to answer ARP requests for the VIP (as explained below). The Director can therefore balance the incoming workload from the client computers using one of the LVS scheduling methods.
There are a couple of things to note before explaining how it works.
1. The network interface card (NIC) the Director uses for network communication is connected to the same physical network used by the cluster nodes and the client computers. The VIP, RIP and CIP are all on the same physical network.
2. The same virtual IP (VIP) is configured in two places: on the Director and on the real servers. On the real servers, the VIP is configured on the loopback interface. The loopback device is a logical network device used by all networked computers to deliver packets locally, so network packets with a destination address of the VIP will be delivered to the loopback device on the real server.
WORKING: An ARP broadcast from the client computer asks, "Who owns the VIP?" and the Director replies to the broadcast with its own MAC address, claiming ownership. The client computer then constructs the first packet of the network conversation and inserts the proper destination MAC address to send the packet to the Director. When this packet arrives at the Director, the Director forwards it to the real server, leaving the source and destination IP addresses unchanged. Only the MAC address is changed, from the Director's MAC address to the real server's RIP MAC address.
When the packet reaches the real server, it is routed to the loopback device, because that is where the routing table inside the real server's kernel is configured to send it. The packet is then received by a daemon running locally on the real server and listening on the VIP, and that daemon knows what to do with the packet. In our case the daemon is the Apache httpd web server.
The httpd daemon then prepares a reply packet and sends it back out through the RIP interface with the source address set to the VIP. The packet does not go back through the Director, because the real servers do not use the Director as their default gateway in an LVS-DR cluster. The reply packet is sent directly back to the client computer (hence the name direct routing).
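Once the example cluster described below is running, you can watch this behaviour with tcpdump (a sketch; adjust the interface name to your setup, and 192.168.1.45 is the VIP used later in this article):

tcpdump -e -n -i eth0 host 192.168.1.45
# -e prints link-layer (MAC) headers, so you can see that only the MAC addresses change
# between the client, the Director and the real server, while the IP addresses stay the same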

A simple LVS-DR cluster configuration

ldirectord : The ldirectord daemon monitors the health of real servers by sending requests to access cluster resources on the real IP of each real server. When a real server does not reply to the ldirectord daemon running on the Director, the ldirectord daemon issues the correct ipvsadm command to remove it from the IPVS table for the VIP address. Later, when the real server comes back online, ldirectord issues the correct ipvsadm command to add the real server back into the IPVS table.
Requirements :
1. A kernel with LVS support and the ability to suppress ARP replies (2.4.26 or later in the 2.4 series, 2.6.4 or later in the 2.6 series); a quick way to check this is shown after this list.
2. The ldirectord Perl program.
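A quick way to verify kernel IPVS support (assuming a modular kernel) is to load the ip_vs module and look at the /proc entry it creates:

/sbin/modprobe ip_vs
cat /proc/net/ip_vs          # prints "IP Virtual Server version ..." when IPVS is available
lsmod | grep ip_vs           # the ip_vs module should appear in the list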
In this setup, 192.168.1.41 is the Director and 192.168.1.40 and 192.168.1.42 are the real servers. 192.168.1.45 is the virtual IP (VIP) and is configured on the eth1 interface of the Director, for example as shown below.
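The VIP can be added on the Director as an alias on eth1 (a sketch; adjust the interface and netmask to your own network):

/sbin/ifconfig eth1:0 192.168.1.45 netmask 255.255.255.0 up
/sbin/ifconfig eth1:0        # verify that the VIP is up on the Director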
Configuration on the real servers
Hide the loopback interface:
We need to tell the real servers to ignore ARP broadcasts from client computers that are searching for the owner (MAC address) of the VIP. We can use a script to accomplish this.
#!/bin/bash
# lvsdr: bring the VIP up on the loopback interface and suppress ARP replies for it
VIP=192.168.1.45
host=`/bin/hostname`
case "$1" in
start)
    /sbin/ifconfig lo down
    /sbin/ifconfig lo up
    # tell the kernel not to answer or advertise ARP for addresses configured on lo
    echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce

    # configure the VIP on lo:0 and route it locally
    /sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 up
    /sbin/route add -host $VIP dev lo:0
    ;;
stop)
    /sbin/ifconfig lo:0 down
    echo 0 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 0 > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo 0 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 0 > /proc/sys/net/ipv4/conf/all/arp_announce
    ;;
status)
    islothere=`/sbin/ifconfig lo:0 | grep $VIP`
    isrothere=`netstat -rn | grep "lo:0" | grep $VIP`
    if [ ! "$islothere" -o ! "$isrothere" ]; then
        echo "LVS-DR real server stopped."
    else
        echo "LVS-DR running."
    fi
    ;;
*)
    echo "$0: Usage: $0 {start|status|stop}"
    exit 1
    ;;
esac
This script is placed in the /etc/init.d/ directory; I named it lvsdr. Place it on both real servers and start it. After running the script, you will find that 192.168.1.45 has been configured on the loopback interface of the real servers.
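Roughly, installing and running the script on each real server looks like this (lvsdr is the name chosen above):

chmod 755 /etc/init.d/lvsdr
/etc/init.d/lvsdr start
/sbin/ifconfig lo:0          # should now show inet addr 192.168.1.45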
Health check web page on the real servers:
On the real servers 192.168.1.40 and 192.168.1.42, a simple health check web page is created. The Director requests this page periodically on port 80 and checks the reply string to decide whether the real server is healthy (this is the negotiate check configured below).
echo "OKAY" > /var/www/html/.healthcheck.html
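You can confirm the page is reachable from the Director (assuming Apache is already running on the real servers and a client such as curl is installed):

curl http://192.168.1.40/.healthcheck.html   # should print OKAY
curl http://192.168.1.42/.healthcheck.html   # should print OKAY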
Configuration on the Director
Install ldirectord and the required software components
ldirectord is a Perl program. It normally comes with the Heartbeat package, which you can install using yum or apt-get; examples are shown below. After installation, the startup script can be found in /etc/ha.d/resource.d/.
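Exact package names vary by distribution; as examples only, something like one of the following usually brings in ldirectord and its dependencies:

yum install heartbeat heartbeat-ldirectord   # Red Hat / CentOS style (package names may differ)
apt-get install heartbeat ldirectord         # Debian / Ubuntu style (package names may differ)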
Test ldirectord installation
To test ldirectord installation, ask it to display its help information with the -h switch.
—————————–
/usr/sbin/ldirectord -h
—————————–
Your screen should show the ldirectord help page. Otherwise it will show a message about a problem finding required Perl modules; you must then download and install the missing module(s), for example as shown below.
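If a module is reported missing, you can usually install it from your distribution's packages or from CPAN. For example (the module name here is only an illustration; install whichever module ldirectord complains about):

perl -MCPAN -e 'install LWP'   # libwww-perl, used by ldirectord for HTTP negotiate checks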
Create the ldirectord configuration file in /etc/ha.d/conf
ldirectord uses a configuration file to build the IPVS table.
ldirectord.cf

checktimeout=20
checkinterval=5
autoreload=yes
quiescent=no
logfile="info"
virtual=192.168.1.45:80
        real=192.168.1.40:80 gate 2 ".healthcheck.html", "OKAY"
        real=192.168.1.42:80 gate 2 ".healthcheck.html", "OKAY"
        service=http
        checkport=80
        protocol=tcp
        scheduler=wrr
        checktype=negotiate
        fallback=127.0.0.1
NB: you must indent the lines after the virtual line with at least four spaces or a tab character.
checktimeout=20
This sets the checktimeout value to the number of seconds ldirectord will wait for a health check to complete. If the check fails for any reason or does not complete within the check timeout period, ldirectord will remove the real server from the IPVS table.
checkinterval=5
The checkinterval is the number of seconds ldirectord sleeps between health checks.
autoreload=yes
This enables the autoreload option, which causes the ldirectord program to periodically calculate the md5sum to check this configuration file for changes and automatically apply them when you change the file. This feature helps you to change the cluster configuration easily.
quiescent=no
A node is quiesced when it fails to reply within its checktimeout period. With this option set to no, ldirectord removes the real server from the IPVS table entirely rather than quiescing it (setting its weight to 0). If you do not set this option to no, the cluster may seem to be down to some client computers when a node crashes, because they were assigned to that node before it crashed and the connection tracking record or persistent connection template still remains on the Director.
logfile=”info”
This entry tells ldirectord to use the syslog facility for logging error messages. If no specific directory or file is given for log messages, they are written to /var/log/ldirectord.log.
virtual=x.x.x.x:80
This line specifies the VIP address and port number that we want to install on the LVS Director.
The next group of indented lines specifies which real servers inside the cluster will be able to offer the resource to client computers.
real=
Each real= line lists a real server, the forwarding method (gate means direct routing), its weight, and the request and receive strings used by the negotiate health check.
service=http
This line indicates which service ldirectord should use when testing the health of the real server.
checkport=80
This line indicates that the health check request for the http service should be performed on port 80.
protocol=tcp
This entry specifies the protocol that will be used by this virtual service. It can be set to tcp, udp or fwm (firewall mark).
scheduler=wrr
This indicates that we want to use the weighted round-robin load balancing technique.
checktype=negotiate
This option describes which method the ldirectord daemon should use to monitor the real servers for this VIP. checktype can be set to one of the following.
negotiate: This method connects to the real server and sends the request string you specify. If the reply string you specify is not received within the checktimeout period, the node is considered dead.
connect: This method simply connects to the real server at the specified checkport and assumes everything is okay on the real server if a basic TCP/IP connection is accepted by the real server. This is not as reliable as negotiate method for detecting problems.
off: this disables ldirectord’s health check monitoring of the real servers.
fallback=127.0.0.1
The fallback address is the IP address and port number that client connections should be redirected to when there are no real servers in the IPVS table that can satisfy their requests for a service.
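For fallback=127.0.0.1 to actually answer clients when every real server is down, a web server must be listening on the Director itself (in this article's setup Apache is started on the Director as well). A quick check:

/etc/init.d/httpd start      # make sure Apache is running on the Director
curl http://127.0.0.1/       # whatever is served here is what clients see during a total outage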
Test your configuration.
Check whether the real server IPs 192.168.1.40 and 192.168.1.42 are pingable from the Director.
Make sure the httpd service is started on all servers.
Start the ldirectord daemon on the Director to see if your IPVS table is created.
a. /etc/ha.d/resource.d/ldirectord -d ldirectord.cf start
You should see ldirectord debug output indicating that the ipvsadm commands have been run to create the virtual server, followed by messages such as:
DEBUG2: check_http: http://192.168.1.40/.healthcheck.html is down
b. Now bring the real server online (start its web server). The debug output from ldirectord should change to:
DEBUG2: check_http: http://192.168.1.40/.healthcheck.html is up
c. Check to be sure that the virtual service was added to the IPVS table with the command
ipvsadm -L -n
If everything is working properly, the result will look like this:

IP Virtual Server version x.x.x (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.1.45:80 wrr
  -> 192.168.1.40:80              Route   2      0          0
  -> 192.168.1.42:80              Route   2      0          0
d. Test ldirectord by shutting down Apache on one of the real servers. Check that the ipvsadm table changes: with quiescent=no the failed real server is removed from the table (with quiescent=yes its weight would instead be set to 0).
e. Restart Apache on that real server. It re-appears in the table with its weight of 2. A rough test sequence is shown below.
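A rough test sequence for steps d and e (assuming the httpd init script name used on the real servers):

/etc/init.d/httpd stop       # on real server 192.168.1.40: simulate a failure
ipvsadm -L -n                # on the Director, after checktimeout seconds: 192.168.1.40 is gone (quiescent=no)
/etc/init.d/httpd start      # on real server 192.168.1.40: bring the service back
ipvsadm -L -n                # on the Director: 192.168.1.40 reappears with weight 2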
Three different web pages are created, one on the Director and one on each real server:
a. "this is director" on the Director
b. "this is real server 1" on real server 1 (192.168.1.40)
c. "this is real server 2" on real server 2 (192.168.1.42)
You can then watch the load balancer switching between the real servers when you access the site using the VIP, 192.168.1.45.
We can combine Heartbeat with the LVS package to design a highly available Linux cluster with no single point of failure.
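As a rough sketch of that combination (Heartbeat version 1 haresources syntax; the node name primary-director is a placeholder), the VIP and the ldirectord daemon become Heartbeat-managed resources that a backup Director can take over:

# /etc/ha.d/haresources, identical on both Directors
primary-director IPaddr2::192.168.1.45/24/eth1 ldirectord::ldirectord.cf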