Internet Load Balance
Abstract: Load balancing is an important step for Internet growth. This page reviews the solutions proposed by academic research and by commercial products. We concentrate on the WWW, since it is the most used Internet service. Appropriate solutions for other services, such as FTP, NNTP, or proprietary protocols, are also presented.
Why do we need Internet Load Balance?
As the WWW grows in size and complexity, an advanced solution for managing heavy traffic becomes necessary. Today, the most common solution is to run a single web server on a very powerful machine with a large amount of memory, fast processors, and a fast disk array. This approach is limited because it does not scale well: as soon as the volume of traffic increases, the hardware configuration of the machine running the web server must be modified, and in some cases the hardware must even be replaced. Besides, this solution is not fault tolerant. If there is a hardware problem in the machine, or a software failure in the HTTP server, the WWW service stops working and the published information is no longer available online. We identify three requirements that modern systems should meet in order to publish information:
Note: the solution is not fault-tolerant. If a machine stops working, DNS servers around the world may continue to serve the address of the failed machine because of their caching mechanism. DNS offers Network Scalability, but little control over load balancing: it works only in round-robin fashion and does not allow short rotation times.
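The two effects described in this note, round-robin rotation and stale cached answers, can be illustrated with a small simulation. This is only a sketch, not a real DNS implementation: the class names, the 300-second TTL, and the documentation-range addresses are all assumptions made for the example.

```python
from collections import deque


class RoundRobinZone:
    """Authoritative side: hands out the address list in rotation."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def query(self):
        first = self._addresses[0]
        self._addresses.rotate(-1)  # the next query starts with the next address
        return first


class CachingResolver:
    """Downstream resolver: reuses an answer until its TTL expires."""

    def __init__(self, zone, ttl):
        self._zone = zone
        self._ttl = ttl
        self._cached = None
        self._expires_at = 0.0

    def resolve(self, now):
        if self._cached is None or now >= self._expires_at:
            self._cached = self._zone.query()
            self._expires_at = now + self._ttl
        return self._cached


# Without caching, each query gets the next address in turn.
zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
direct = [zone.query() for _ in range(4)]

# With caching, the same address is served until the TTL (hypothetical: 300 s) runs out,
# whether or not the machine behind it is still alive.
resolver = CachingResolver(RoundRobinZone(["192.0.2.1", "192.0.2.2"]), ttl=300)
cached = [resolver.resolve(now=t) for t in (0, 100, 299, 300)]
```

In this simulation the resolver keeps returning the first address for the whole TTL window, which is why a failed machine can keep receiving clients.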
Note: these kinds of solutions are generally fault-tolerant. They offer Network Scalability and a wide range of load-balancing policies. They have two problems: almost all of them are applicable only to HTTP or to single-port protocols, and all of them need a central point of dispatching (i.e. a possible point of failure).
Roughly speaking, a machine (call it A) connects to a dispatcher machine (call it D), which re-labels the request and redistributes it to one of many "hidden" machines (call them W1, ..., Wn) operating at different IP addresses. The traffic is shared according to some policy, and fault tolerance can be implemented in a transparent way.
Some protocols (like FTP) use more than one connection (on different TCP/UDP ports) to work correctly. In this situation it is necessary to redistribute all the incoming connections of this kind to the same hidden machine.
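One simple way to keep all connections of a multi-port protocol on the same hidden machine is to key the choice on the client address only and ignore the destination port. The sketch below assumes this policy. The class name, the round-robin first assignment, and the port numbers are illustrative choices, not taken from any particular product.

```python
class StickyDispatcher:
    """Pins every connection from one client address to one hidden machine."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0
        self._pinned = {}  # client IP -> chosen hidden machine

    def pick(self, client_ip, dst_port):
        # dst_port is deliberately ignored: control and data connections
        # from the same client must land on the same machine.
        if client_ip not in self._pinned:
            self._pinned[client_ip] = self._servers[self._next % len(self._servers)]
            self._next += 1
        return self._pinned[client_ip]


dispatcher = StickyDispatcher(["W1", "W2"])
ftp_control = dispatcher.pick("198.51.100.7", 21)      # FTP control connection
other_client = dispatcher.pick("198.51.100.8", 21)     # a different client
ftp_data = dispatcher.pick("198.51.100.7", 50000)      # hypothetical data port, same client
```

A real dispatcher would also expire entries after a period of inactivity. That is omitted here to keep the sketch short.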
Note that all the traffic from W1, ..., Wn to A must traverse D.
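The re-labelling performed by D can be sketched as two address rewrites, one in each direction. This is a minimal model under stated assumptions: packets are reduced to a source, a destination, and a payload, hidden machines are chosen in rotation, and all names and addresses are hypothetical.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str


class NatDispatcher:
    def __init__(self, public_ip, hidden_ips):
        self.public_ip = public_ip
        self._rotation = cycle(hidden_ips)
        self._flows = {}  # client address -> hidden machine

    def inbound(self, pkt):
        """A -> D -> Wi: rewrite the destination to a hidden machine."""
        if pkt.src not in self._flows:
            self._flows[pkt.src] = next(self._rotation)
        return replace(pkt, dst=self._flows[pkt.src])

    def outbound(self, pkt):
        """Wi -> D -> A: rewrite the source back to the public address."""
        return replace(pkt, src=self.public_ip)


d = NatDispatcher("203.0.113.10", ["10.0.0.1", "10.0.0.2"])
request = Packet(src="198.51.100.7", dst="203.0.113.10", payload="GET /")
to_server = d.inbound(request)

reply = Packet(src=to_server.dst, dst=to_server.src, payload="200 OK")
to_client = d.outbound(reply)
```

Because the reply must have its source rewritten before A will accept it, every response passes through D, which is the constraint noted above.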
IBM describes a router that was used to parallelize web server queries for the Olympic Games web server. It takes a TCP/IP connection request, re-labels it, and redistributes it to one of many web servers ("mirrors") operating at different IP addresses. Each server maintains an identical set of web pages. Users are unaware of the existence of multiple web servers/mirrors, since they (i.e. their browsers) connect to the externally published, well-known domain name. The mirrors were geographically distributed (Atlanta, New York, California), and requests were routed to the least-busy and/or ping-time closest server.
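A selection policy of the "least-busy and/or closest" kind might look like the sketch below. It is not IBM's actual algorithm, which is not described here. It assumes, purely for illustration, that the dispatcher knows a connection count and a ping time for each mirror, prefers the least busy one, and uses ping time to break ties. The figures are made up.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    name: str
    active_connections: int
    ping_ms: float


def pick_mirror(mirrors):
    # Least busy first; ping time decides between equally busy mirrors.
    return min(mirrors, key=lambda m: (m.active_connections, m.ping_ms))


mirrors = [
    Mirror("atlanta", active_connections=120, ping_ms=35.0),
    Mirror("new-york", active_connections=80, ping_ms=60.0),
    Mirror("california", active_connections=80, ping_ms=20.0),
]
chosen = pick_mirror(mirrors).name
```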
Note: these kinds of solutions are generally fault-tolerant. They offer Network Scalability and a wide range of load-balancing policies. They have one problem: all of them need a central point of dispatching (i.e. a possible point of failure).
Another method broadcasts each request on the local segment and has the servers respond selectively, based on a hash of the source address. In this way there is no need for a central dispatcher machine.
Note: this solution is fault-tolerant and offers Network Scalability and a wide range of load-balancing policies. It has one great quality: the dispatching function is carried out in a distributed way by a set of cooperating machines, which reconfigure themselves if some of them stop working. The main problem is that the solution requires modifying the kernel of the machines used.
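The hash-based selection described above can be sketched as a function that every server evaluates locally on each broadcast request. The CRC32 hash, the modulo rule, and the machine names are assumptions made for this example. The actual scheme runs inside the kernel and may hash differently.

```python
import ipaddress
import zlib


def owner(client_ip, live_servers):
    """Every machine computes the same answer from the same inputs."""
    ordered = sorted(live_servers)
    digest = zlib.crc32(ipaddress.ip_address(client_ip).packed)
    return ordered[digest % len(ordered)]


def should_respond(me, client_ip, live_servers):
    # A server answers only the requests it owns and silently drops the rest.
    return owner(client_ip, live_servers) == me


live = {"W1", "W2", "W3"}
responders = [w for w in sorted(live) if should_respond(w, "198.51.100.7", live)]

# If W2 stops working, the survivors recompute ownership over the smaller set.
survivors = live - {"W2"}
responders_after = [
    w for w in sorted(survivors) if should_respond(w, "198.51.100.7", survivors)
]
```

Exactly one machine answers each client, with no dispatcher involved. A plain modulo reassigns many clients when the set of live machines changes, so a production design would likely want a more stable mapping.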
First of all, it does not modify packets (as NAT does). Remember that NAT must modify packets once on the way in and once again on the way out. It only monitors the inbound client-to-server requests. The outbound traffic, which is typically much larger than the inbound client-to-server requests, goes directly from server to client. It achieves this by using the TCP/IP stack in a tricky way.
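The benefit of letting replies bypass the dispatcher can be made concrete with a little arithmetic. The request and response sizes below are hypothetical figures chosen only to illustrate the asymmetry between the two directions. They are not measurements.

```python
def dispatcher_bytes(request_bytes, response_bytes, mode):
    """Bytes that cross the dispatcher for one request/response exchange."""
    if mode == "nat":
        # Both directions are rewritten by the dispatcher.
        return request_bytes + response_bytes
    if mode == "direct-return":
        # Only the inbound request is seen; the reply goes straight to the client.
        return request_bytes
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical sizes: a 500-byte request answered by a 20,000-byte page.
nat_load = dispatcher_bytes(500, 20_000, "nat")
direct_load = dispatcher_bytes(500, 20_000, "direct-return")
```

With these assumed sizes the dispatcher handles 20,500 bytes per exchange under NAT but only 500 bytes when replies return directly.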
It has Wide Area Support for load balancing remote servers, and the ability to set up pre-defined rules to administer load balancing. It runs on IBM AIX, Sun Solaris, and Microsoft Windows NT operating systems.
Note: this solution is fault-tolerant and offers Network Scalability and a wide range of load-balancing policies. The main problems are that the solution requires modifying the kernel of the machines (although versions exist for many architectures) and that it needs a central point of dispatching (although it is used only for client-to-server traffic).
Note: these kinds of solutions are fault-tolerant and offer Network Scalability and a wide range of load-balancing policies. The main problem is the need for a central point of dispatching. They offer sophisticated control of the traffic at the protocol level. For example, you can decide how much bandwidth is dedicated to WWW, to FTP, or to Telnet. You can also decide to give priority to a WWW request on the basis of a given URL. But they need to act at the TCP level rather than at the IP level.
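Deciding on the basis of the URL means the dispatcher has to read the HTTP request line, which is why it must work above the IP level. The sketch below assumes a longest-prefix rule table. The prefixes and pool names are hypothetical.

```python
# Hypothetical rule table: URL prefix -> server pool.
RULES = [
    ("/images/", "image-pool"),
    ("/cgi-bin/", "app-pool"),
    ("/", "default-pool"),
]


def route(path, rules=RULES):
    """Return the pool for the longest prefix that matches the path."""
    matches = [(prefix, pool) for prefix, pool in rules if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no rule matches {path!r}")
    return max(matches, key=lambda rule: len(rule[0]))[1]


def route_request(request_line):
    # The request line is only available once the TCP connection carries data,
    # so this decision cannot be made from IP addresses alone.
    method, path, version = request_line.split()
    return route(path)


image_pool = route_request("GET /images/logo.gif HTTP/1.0")
app_pool = route_request("GET /cgi-bin/search HTTP/1.0")
default_pool = route_request("GET /index.html HTTP/1.0")
```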
NAT is a standard solution, backed by solid academic research and by commercial products. It needs a central machine that dispatches both the incoming traffic and the outgoing answers. It also supports geographical dispatching and multi-port protocols.
One-IP is an example of a distributed dispatcher for sharing traffic. Another example is Network Dispatcher. With them there is no need for a single, central dispatching point.
Traffic shaping and Content Based Switching are innovative technologies that work at the protocol level rather than on IP addresses. They offer QoS for connections, but they again need a central dispatcher.
Antonio Gullì