Internet

Load Balance

Abstract: Load balancing is an important step for Internet growth. On this page we review the solutions proposed by academic research and by commercial products. We concentrate on the WWW, since it is the most used Internet service. Appropriate solutions for other services, such as FTP, NNTP, or proprietary protocols, are also presented.

We are talking about:

  • DNS Load Balance

  • HTTP Load Balance

  • NAT: a solution for virtual IP

  • One-IP: single IP, multiple hosts

  • IBM Network Dispatcher

  • Content Based Switching & Traffic Shaping.

See my paper presented at WebNet 98 on Jamming Net: a server to balance WWW load.

Why do we need Internet Load Balance?

As the WWW grows in size and complexity, more advanced solutions are needed to manage high traffic loads. Today, the most common solution is a single web server running on a very powerful machine with a large amount of memory, fast processors, and a fast disk array. This approach is limited because it does not scale well.

In fact, as soon as the volume of traffic increases, the hardware configuration of the machine running the web server has to be modified, and in some cases the hardware must be replaced altogether. Besides, this solution is not fault tolerant. If there is a hardware problem in the machine, or a software failure in the HTTP server, the WWW service stops working and the published information is no longer available online. We identify three requirements that modern systems should meet in order to publish information:

  • Network Scalability: whenever the incoming traffic to an existing HTTP server increases, the existing hardware architecture is preserved and a new HTTP server, running on a different machine, is simply added.

  • Load Balancing: sharing traffic among a group of HTTP servers according to some policy, which may depend on local load or on a pseudo-random heuristic.

  • Fault Tolerance: if one of the servers fails, we want to recover automatically, taking it out of use and replacing it with one of the servers that are still alive.
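
The load-based and pseudo-random policies named in the list above, together with the exclusion of failed servers, can be sketched roughly as follows. This is only an illustration: the server names, the load figures, and the function names are hypothetical, and a real dispatcher would have to measure load and liveness itself.

```python
import random

def alive_only(loads, alive):
    """Fault tolerance: ignore every server that is not known to be alive."""
    return {server: load for server, load in loads.items() if server in alive}

def pick_least_loaded(loads):
    """Load-based policy: return the server with the smallest reported load."""
    return min(loads, key=loads.get)

def pick_random(servers, rng=random):
    """Pseudo-random policy: every server is equally likely to be chosen."""
    return rng.choice(list(servers))

# Hypothetical load figures (e.g. open connections) for three HTTP servers.
loads = {"w1": 12, "w2": 3, "w3": 7}

print(pick_least_loaded(loads))                            # least busy server
print(pick_least_loaded(alive_only(loads, {"w1", "w3"})))  # w2 has failed
print(pick_random(loads))                                  # any of the three
```

Adding a server (Network Scalability) then amounts to adding one more entry to the table of loads.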

A unique IP address for every machine

Every machine on the Internet has a unique IP address. When machine A connects to machine B, it uses the IP address of its partner. When you write something like "http://www.di.unipi.it/~gulli", your browser uses the DNS to translate the symbolic name "www.di.unipi.it" into a unique numeric address ("131.114.11.37"). This address is then used to contact the target machine.
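
The translation step can be tried out with the resolver of the operating system. The sketch below uses Python's standard `socket.getaddrinfo`; the helper name `resolve` is ours, and the addresses returned for a real host name depend on the DNS records at the time of the query.

```python
import socket

def resolve(hostname):
    """Return the distinct IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})

# A numeric address is returned unchanged, without any DNS query.
print(resolve("127.0.0.1"))

# resolve("www.di.unipi.it") would query the DNS; it needs network access.
```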

RFC 1794 (DNS Support for Load Balancing)

RFC 1794 describes a way of answering DNS requests for a single, well-known domain name with multiple, different IP addresses. This allows connections to a single domain to be routed to geographically separate servers, without forcing all IP traffic through one NAT-enabled router. As such, it is a geographic load-balancing solution. However, the mechanics of doing the actual load balancing are considerably more difficult. One serious limitation of DNS-based load balancing is that many, if not most, implementations of BIND regard TTLs of less than 300 seconds, and zone refreshes more frequent than every 15 minutes, as irrational and as the symptom of a broken configuration. Given that most HTTP transfers last only a few seconds, and socket keep-alives last minutes, it is impossible to do fine-grained load balancing. At best, DNS can support only a "stochastic" load balancing, redirecting clients to servers randomly, as the caches in the various resolvers expire at random (although small) intervals.

Note: this solution is not fault-tolerant. If a machine stops working, DNS servers around the world may continue to serve the address of the failed machine because of their caches. DNS offers Network Scalability, but little control over load balancing. It works only in round-robin fashion and does not allow short rotation times.
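
The effect of caching on round-robin DNS can be seen in a small simulation. It is a toy model, not an implementation of RFC 1794 or of BIND: the class names are ours, the addresses come from the 192.0.2.0/24 documentation range, and time is passed in explicitly as seconds. Only the 300-second TTL is taken from the text above.

```python
from collections import deque

class RoundRobinZone:
    """Authoritative side: rotates the address list on every query."""

    def __init__(self, addresses, ttl):
        self.addresses = deque(addresses)
        self.ttl = ttl

    def query(self):
        answer = self.addresses[0]
        self.addresses.rotate(-1)  # the next query gets the next address
        return answer, self.ttl

class CachingResolver:
    """Resolver side: reuses a cached answer until its TTL expires."""

    def __init__(self, zone):
        self.zone = zone
        self.cached = None
        self.expires_at = 0

    def lookup(self, now):
        if self.cached is None or now >= self.expires_at:
            self.cached, ttl = self.zone.query()
            self.expires_at = now + ttl
        return self.cached

zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"], ttl=300)
resolver = CachingResolver(zone)

# One lookup per minute: the answer changes only when the TTL expires.
for minute in range(10):
    print(minute, resolver.lookup(now=minute * 60))
```

Every client behind this resolver is sent to the same server for five minutes, whether or not that server is still working, which is the coarse grain and the lack of fault tolerance noted above.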


HTTP/WWW Load Balancing

Several URL-based load balancing technologies are generally available, either as open source, or as products.

Note: these kinds of solutions are generally fault-tolerant. They offer Network Scalability and a wide range of load balancing policies. They have two problems: almost all of them are applicable only to HTTP or to single-port protocols, and all of them need a central point of dispatching (i.e. a possible point of failure).


NAT: a solution for virtual IP

Every machine on the Internet has a unique IP address. When machine A connects to machine B, it uses the address of its partner. NAT (Network Address Translation) is a technique to virtualize a pool of distinct IP addresses and to present it as a single address.

Roughly speaking, a machine (call it A) connects to a dispatcher machine (call it D) that re-labels the request and redistributes it to one of many "hidden" machines (call them W1, ..., Wn) operating at different IP addresses. The traffic is shared according to some policy, and fault tolerance can be implemented in a transparent way.

Some protocols (like FTP) use more than one connection (on different TCP/UDP ports) to work correctly. In this situation, all the incoming connections that belong to the same session must be redistributed to the same hidden machine.

Note that all the traffic from W1, ..., Wn to A must traverse D.
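
The role of D can be sketched as a pair of rewriting functions. This is a toy model under several assumptions: packets are plain dictionaries, the policy is round-robin, and connections are kept together simply by remembering which hidden machine serves each client address. The class name and all addresses are hypothetical.

```python
import itertools

class NatDispatcher:
    """Toy model of dispatcher D: one public address in front of W1..Wn."""

    def __init__(self, public_ip, hidden_servers):
        self.public_ip = public_ip
        self._next_server = itertools.cycle(hidden_servers)  # round-robin policy
        self._by_client = {}  # client IP -> hidden machine already chosen

    def inbound(self, packet):
        """Rewrite the destination of a client packet to a hidden machine."""
        client = packet["src"]
        if client not in self._by_client:
            self._by_client[client] = next(self._next_server)
        return {**packet, "dst": self._by_client[client]}

    def outbound(self, packet):
        """Rewrite the source of a reply so that the client only sees D."""
        return {**packet, "src": self.public_ip}

d = NatDispatcher("192.0.2.1", ["10.0.0.1", "10.0.0.2"])

# Two connections of the same FTP session reach the same hidden machine.
control = d.inbound({"src": "198.51.100.7", "dst": "192.0.2.1", "dport": 21})
data = d.inbound({"src": "198.51.100.7", "dst": "192.0.2.1", "dport": 20})
print(control["dst"], data["dst"])

# The reply is re-labelled on the way out, so it also passes through D.
reply = d.outbound({"src": control["dst"], "dst": "198.51.100.7", "sport": 21})
print(reply["src"])
```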

IBM describes a router which they used for parallelizing web server queries for the Olympic Games web server. It takes a TCP/IP connection request, re-labels it, and redistributes it to one of many web servers ("mirrors") operating at different IP addresses. Each server maintains an identical set of web pages. The user is unaware of the existence of multiple web servers/mirrors, as they (i.e. their browser) connect to the externally published, well-known domain name. The mirrors were geographically distributed (Atlanta, New York, California), and requests were routed to the least busy server and/or to the closest one in terms of ping time.

Note: these kinds of solutions are generally fault-tolerant. They offer Network Scalability and a wide range of load balancing policies. They have one problem: all of them need a central point of dispatching (i.e. a possible point of failure).


ONE-IP

The ONE-IP Project implements network clustering using techniques that in many ways are superior to traditional NAT as described above. One method of achieving distribution is with packets routed to a gateway which then dispatches based on hardware address, rather than IP address. All servers on the LAN segment have the same IP address and reply to clients with that single IP address, which avoids the overhead of NAT rewriting. The choice of redistribution is done at the MAC level via ARP. Furthermore, since dispatching is stateless, and the router, gateway and servers sit on the same segment, failover of the dispatcher is considerably simplified.

Another method eliminates the need for a dispatcher by broadcasting on the local segment and having servers respond selectively, based on a hash of the source address. In this way there is no need for a central dispatcher machine.
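
The broadcast method can be illustrated with a filter that every server runs on its own. The hash used here (the numeric value of the source address modulo the number of servers) is only an assumption made for the sketch, as are the function names and the addresses; the text above says only that a hash of the source address is used.

```python
import ipaddress

def owner_index(source_ip, n_servers):
    """Map a client address to the index of the one server that should answer."""
    return int(ipaddress.ip_address(source_ip)) % n_servers

def accepts(server_index, source_ip, n_servers):
    """Filter run independently by each server on every broadcast packet."""
    return owner_index(source_ip, n_servers) == server_index

N_SERVERS = 3
for source_ip in ["198.51.100.7", "198.51.100.8", "203.0.113.20"]:
    responders = [i for i in range(N_SERVERS) if accepts(i, source_ip, N_SERVERS)]
    print(source_ip, "-> answered by server", responders)
```

Since all servers compute the same function, exactly one of them accepts each packet and no dispatcher has to make the choice. If a server fails, the remaining ones would have to agree on a new value of `n_servers`.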

Note: this solution is fault-tolerant and offers Network Scalability and a wide range of load balancing policies. It has one great quality: the dispatching function is carried out in a distributed way by a set of cooperating machines, which reconfigure themselves if some of them stop working. The main problem is that the solution requires modifying the kernel of the machines used.


After NAT: Network Dispatcher

IBM proposes an enhanced version of its software "Shock Absorber". It is called Network Dispatcher and has some interesting features.

First of all, it does not modify packets as NAT does. Remember that with NAT packets must be modified once on the way in and once again on the way out. Network Dispatcher only monitors the inbound client-to-server requests. The outbound traffic, which is typically much larger than the inbound client-to-server requests, goes directly from server to client. This is achieved through a clever use of the TCP/IP stack.
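
A rough calculation shows why letting replies bypass the dispatcher matters. The request and response sizes below are hypothetical round numbers chosen only for illustration, and the function name is ours.

```python
def dispatcher_bytes(request_bytes, response_bytes, replies_via_dispatcher):
    """Bytes the dispatcher must handle for one request/response exchange."""
    if replies_via_dispatcher:
        return request_bytes + response_bytes  # NAT: both directions cross it
    return request_bytes  # direct return: only the inbound request crosses it

REQUEST = 500      # hypothetical size of an HTTP request, in bytes
RESPONSE = 10_000  # hypothetical size of the returned page, in bytes

nat_load = dispatcher_bytes(REQUEST, RESPONSE, replies_via_dispatcher=True)
direct_load = dispatcher_bytes(REQUEST, RESPONSE, replies_via_dispatcher=False)

print(nat_load, direct_load, nat_load / direct_load)  # 10500 500 21.0
```

With these figures the dispatcher handles about twenty times less traffic when replies go straight from server to client.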

It has Wide Area Support for load balancing remote servers, and the ability to set up pre-defined rules to administer load balancing. It runs on IBM AIX, Sun Solaris, and Microsoft Windows NT operating systems.

Note: this solution is fault-tolerant and offers Network Scalability and a wide range of load balancing policies. The main problems are that the solution requires modifying the kernel of the machines (although versions exist for many architectures) and that it needs a central point of dispatching (although used only for the client-to-server traffic).


The future: Content based switching and Traffic Shaping

The next step in load balancing is to consider the protocols and not only the IP address. Here come two new concepts: Content Based Switching and Traffic Shaping. They work at the TCP/IP level, so they need a central machine that dispatches connections (both incoming and outgoing traffic) using many prioritized queues. This machine must reassemble the TCP segments from the IP packets.
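
The idea of prioritized queues can be sketched with a single heap. This is a simplification: it serves traffic in strict priority order, which is cruder than assigning bandwidth shares, and the class name, the protocol labels, and the priorities are all hypothetical.

```python
import heapq
import itertools

class PriorityDispatcher:
    """Serves queued items by protocol priority, FIFO within one protocol."""

    def __init__(self, priorities):
        self.priorities = priorities     # protocol -> priority (lower goes first)
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, protocol, payload):
        priority = self.priorities[protocol]
        heapq.heappush(self._heap, (priority, next(self._order), protocol, payload))

    def dequeue(self):
        _priority, _seq, protocol, payload = heapq.heappop(self._heap)
        return protocol, payload

    def __len__(self):
        return len(self._heap)

d = PriorityDispatcher({"www": 0, "ftp": 1, "telnet": 2})
d.enqueue("ftp", "a")
d.enqueue("www", "b")
d.enqueue("telnet", "c")
d.enqueue("www", "d")

while len(d):
    print(d.dequeue())  # WWW items first, then FTP, then Telnet
```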

Note: these kinds of solutions are fault-tolerant and offer Network Scalability and a wide range of load balancing policies. The main problem is the need for a central point of dispatching. They offer sophisticated control of the traffic at the protocol level. For example, you can decide how much bandwidth is dedicated to WWW, to FTP, and to Telnet. You can also decide to give priority to a WWW request on the basis of a given URL. But they need to act at the TCP level rather than at the IP level.
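
Switching on the basis of a URL might look like the sketch below, which reads the first line of an HTTP request and picks a pool of servers from the path. The rules, the pool names, and the function name are invented for the example; it also shows why such a switch has to see the reassembled request rather than individual IP packets.

```python
# Hypothetical switching rules: URL prefix -> name of a server pool.
RULES = [
    ("/images/", "image-pool"),
    ("/cgi-bin/", "dynamic-pool"),
]
DEFAULT_POOL = "static-pool"

def choose_pool(request_line):
    """Pick a server pool from the first line of an HTTP request."""
    _method, path, _version = request_line.split(" ", 2)
    for prefix, pool in RULES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL

print(choose_pool("GET /images/logo.gif HTTP/1.0"))
print(choose_pool("GET /cgi-bin/search HTTP/1.0"))
print(choose_pool("GET /index.html HTTP/1.0"))
```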


Conclusion

Load balancing is a very urgent step for the Internet. You can start with a DNS solution, but it offers very poor control over traffic sharing and it is not fault-tolerant. If you need to balance WWW traffic only, you can use one of the URL-based load balancing technologies. If you need a software solution for balancing a single-port protocol, you can use Jamming.Net.

NAT is a standard solution, supported by strong academic research and by commercial products. It needs a central machine that dispatches both the incoming traffic and the outgoing answers. It also supports geographical dispatching and multi-port protocols.

One-IP is an example of a distributed dispatcher for sharing traffic. Another example is Network Dispatcher. With them there is no need for a single, central dispatching point.

Traffic Shaping and Content Based Switching are innovative technologies that work at the protocol level rather than on IP addresses. They offer QoS for connections, but they again need a central dispatcher.

Antonio Gullì
gulli@di.unipi.it