Wednesday, September 14, 2011

Load Balancing - Things you should know about


We’ve all heard about the wonders of Load Balancing, and how it can improve the performance of a site or service. In this post we’ll take a look at a few things you should know about before going ahead and setting it up for your site.

Mirroring your data

Since the site is going to be served by multiple servers, it is important to make sure that every server serves the same data. Depending on the way the site is designed, this can be easy or complicated. If you are serving simple static sites in which the site code itself does not change much, then you can simply copy the data to all servers and leave it there. However, if your site involves constant updates to the code itself, then you’ll have to find a way to propagate these changes to the other servers. You can implement some form of file sharing, or set up a program (such as rsync) to sync the data on all servers.
Please note, the changes I am referring to here are changes to the website's code files themselves. If the changes are made to a database, then it should be enough to make that database accessible from all servers. However, in this case, the database becomes a single point of failure. If the database crashes, none of the servers will be able to serve the site. To avoid this, you’ll have to load balance the MySQL service, or set up Master-Slave or Master-Master replication. This will make sure that more than one instance of the database is available.
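One way to check that the code files on two servers really match is to compare a checksum manifest of each document root. The following is only a minimal sketch in Python: the function names build_manifest and diff_manifests are made up for this example, and it assumes you can run the script on each server and bring the resulting manifests together for comparison.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(reference, other):
    """Compare two manifests and list what is out of sync in `other`."""
    shared = set(reference) & set(other)
    return {
        # Files the reference server has but the other server lacks.
        "missing": sorted(set(reference) - set(other)),
        # Files only present on the other server.
        "extra": sorted(set(other) - set(reference)),
        # Files present on both servers but with different contents.
        "changed": sorted(p for p in shared if reference[p] != other[p]),
    }
```

If all three lists come back empty, the two servers are serving the same files. Anything else tells you which files the sync job missed.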

Session Persistence

Another important issue that must not be forgotten is sessions. Each time a customer visits a load balanced site, information about that session is usually stored on the server handling the request. If, as part of the load balancing, the customer is switched to another server half-way through a session, the session details will be lost. There are various ways to overcome this. Some hardware load balancers ensure that all requests from a particular customer are always sent to the same server. The disadvantage is that true load balancing does not occur, as customers are sent to the server they were first connected to, and not the server with the least load.
This can be overcome by not storing session data on the web server at all. That can be done by either storing session data in a central database, or storing it in the client's browser. If you use a database, the single point of failure problem mentioned above could still occur, so the session database needs the same redundancy. Session data stored in a client's browser is usually in the form of cookies. Just make sure they are properly encrypted, in case sensitive information needs to be stored.
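To make the trade-off between "always the same server" and "the server with the least load" more concrete, here is a small Python sketch of both selection strategies. It is an illustration only, not how any particular load balancer implements them. The function names, the server names, and the idea of identifying a customer by IP address are all assumptions made for the example.

```python
import hashlib


def sticky_server(client_id, servers):
    """Always map the same client to the same server by hashing its identifier."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def least_loaded_server(active_connections):
    """Pick the server with the fewest active connections (ties go to the first name alphabetically)."""
    return min(sorted(active_connections), key=lambda name: active_connections[name])


# Hypothetical back-end pool and current connection counts.
servers = ["web1", "web2", "web3"]
load = {"web1": 12, "web2": 3, "web3": 7}

# The sticky choice ignores load entirely; it only depends on the client.
print(sticky_server("203.0.113.7", servers))
# The least-loaded choice ignores the client; it only depends on load.
print(least_loaded_server(load))
```

The sticky version keeps a session on one server, but it can keep sending a customer to a busy server. The least-loaded version spreads the work evenly, but it only works if the session data lives somewhere other than the web server.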

Hardware or Software Load Balancers

Depending on your budget you can choose between Hardware or Software Load Balancers. If you are using the Load Balancer service offered by your data center, ask what they are using. Hardware Load Balancers are generally more robust and more efficient at Load Balancing, but they are also more expensive. There are many Software Load Balancers available today that do a pretty good job, but they are still limited by the server hardware on which they are installed. If you plan to go for a Software Load Balancer, make sure you set up fail-over. You could have two Load Balancing servers handling all requests and passing them to the back-end servers. If one fails, all requests should be sent to the other. Some Software Load Balancers come with this feature and will monitor the other load balancer. If one fails, the other takes over all the incoming requests.
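The fail-over idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the mechanism any specific product uses: the names tcp_check and choose_active are invented for this example, and a real setup would also have to move the public IP address or DNS entry to the surviving load balancer.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_active(balancers, is_healthy):
    """Return the first balancer, in priority order, that passes its health check."""
    for name in balancers:
        if is_healthy(name):
            return name
    # No balancer responded; the caller has to decide how to handle a full outage.
    return None
```

For example, calling choose_active(["lb1", "lb2"], lambda host: tcp_check(host, 80)) would return "lb1" while it is reachable on port 80 and fall back to "lb2" when it is not. The host names lb1 and lb2 are placeholders.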

Keep these points in mind when you are thinking about setting up a Load Balancer for your servers. In most cases they fit right in and you shouldn’t have to worry. But instead of setting them up and then correcting the problems, it is best to plan ahead. Most negative reviews of load balancer setups come from people who didn’t analyze their current setup before going ahead.