Jamming.Net: a Server to Balance WWW Load
Antonio Gulli gulli@di.unipi.it
Computer Science Dept. Univ. Pisa, Studio Associato I-Tech
1 The Need of Distributed Servers
A need for advanced solution to manage high load traffic is necessary,
as the WWW increases in size and complexity. Today, the most common solution is to have
a single web server running on a very powerful machine with great capacity of memory,
fast processors, fast access disk array. This way to face the problem is somewhat limited
because the solution is not much scalable. In fact, as soon as the volume of traffic
increases, arises the need to modify the hardware configuration of the machine running
web server. There are cases in which one must even forced to replace hardware.
Besides, this solution is not fault tolerant. If there is a hardware problem in the
machine, or a software failure in HTTP server, the WWW service stops working and published
information is no longer available on line. We are identifying three requirements that modern
systems should meet in order to publish information:
- Network Scalability
: preserving the used hardware architecture and
adding only a new HTTP server running on a different
machine, whenever the incoming traffic to an existing HTTP
server increases.
- Load Balancing
: sharing traffic among a group of HTTP servers
according to some policies which depend on local
load or some pseudo random heuristic.
- Fault Tolerance
: In case
of fault of one of the servers we want to be able to
recover, stopping its use and replacing it with one
of alive servers automatically.
2 Jamming.Net: a Server to Load Balance Existing WWW ServersTo address these
requirements, we developed a software server
architecture that tries to satisfy them. One more need
we want to satisfy is not to build another web server;
in fact commercial and public domain servers exist with
many features in term of programmability (server side
script, module extensions and so on..) and with many
tools dedicated to manage them. So, we develop a program
acting in front of pre-existing set of HTTP servers,
each one eventually running on different computers. In
fig. 1 we have the position in network architecture in
which Jamming.Net is placed. Like a traditional www
proxy [1], Jamming.Net is situated between a www client
(the browser) and the information content provider.
Compared to a www proxy the marked difference is that
Jamming.Net is able to decide what HTTP server has to
provide the answer for every incoming request, on the
bases of local information. Moreover Jamming.Net is
transparent to the www client, in fact there is no need
to configure the browser in order to use it.
Figure 1: The position in network architecture in which
Jamming.Net is placed.
2.1 Java Multithread Architecture
To Build this kind of architecture we decide to use Java [2] as developing language; it grants us
desirable features such us portability, object oriented programming, general network API and
multithreading. While portability allows us to run Jamming.Net on the most common platform
(both Unix and NT), O-O Programming allows us a clear prototyping, finally network API programming
and multithread allows us to gain benefits from lightweight process and BSD-like network features
[3].
Inner server architecture is described in fig. 2. It consists of many threads. A subset of them
lives forever and provides general functions such us management, periodic fault discovery,
handling of incoming connections. Another subset is created dynamically for every incoming
connection and only lives for the duration of same connection. We can explain it in detail:
- One Listening Thread that listens to incoming connections from Web clients,
creates a new thread to handle each one, choosing a www server to serve them, among the
configured set. Since now, we call worker a HTTP
server in the configured set. The method of choosing
a worker is a matter of balance, and we are going to
talk about it later.
- A set of Switching Threads, each one created dynamically by the listening thread
when a new HTTP connection arrives. They take the job to store & forward
all HTTP packets moving from the assigned www client
to the chosen worker, as well as all answer from the worker
to the client. In this way, they die as soon as
there is no more data to handle.
- One Control Thread acceptinging to commands given
through a control panel applet. In fact these
commands are used to modify the server's state in
real time (balance policy, workers' type,
fault-tolerance). We are going to talk about
Jamming.Net management later.
- One Sampling Thread, running at low priority, which performs periodic network's
sampling in order to discover a worker's fault.
Figure 2:Jamming.Net server architecture
2.2 Balancing Policy and Network Scalability
Inside Jamming.Net, load balancing consists in choosing which worker serves incoming requests,
dynamically and for every single HTTP connection. This technique is more useful than HTTP
Redirect or DNS Balancing methods [4] because it gives the administrator control on the load
generated by each WWW request. In fact, HTTP Redirect performs a redirect to another
web server only for the first incoming connection. All subsequent requests go directly to that server.
DNS Balancing has many problems well described in [4]: above all, it shows poor fault tolerance
and poor balancing in case of spot traffic.
Jamming.Net implements three balancing policies in order to let the listener thread choose which
worker serves the incoming request:
- Round Robin
: for every
incoming connection, next worker in modulo is
chosen.
- Uniform Random
: a worker
is chosen basing on a generated uniform random
number.
- Origin IP Hashing
: given
the IP address of WWW client (the browser), a worker
is chosen by applying a hash function on it.
Consequently incoming connections from the same
machine are always served by the same worker, by
using this technique.
Network Scalability is simply reached enlarging the set of HTTP servers acting as workers.
This could be exploited via the management applet, without performing any Jamming.Net reset.
So, when traffic goes over a certain threshold, a network administrator can expand the system
by preserving the existing hardware and software installation.
2.3 Fault Tolerance
Fault tolerance is the set of techniques used by the server to discover the failure of a worker.
Implemented techniques are grouped in two classes:
- Synchronous Sampling
: as
seen before, a worker is selected when a www
connection is arriving, basing on the chosen
balancing policy. If it happens that a worker does
not answer before a time-out threshold, a fault
condition is raised. This kind of testing is
synchronous; in fact it depends on both www client
request and chosen balancing policy.
- Asynchronous Sampling
:
as seen in architecture schema, a sampler thread
stays alive with minimum priority. When a threshold
time arrives it choices a worker, produces a request
and waits for an answer. If it does not come, then a
fault condition is raised. This kind of testing is
asynchronous since it does not depend on www
requests originating from clients.
If a fault is raised, some strategies of recovery could be exploited:
- Worker Remove
: the worker is removed from
the list of those available, and another available
worker is selected.
- Administrator Notify
:
the event of worker's fault is notified to the
administrator. He / She could decide what to do to
recover from fault.
2.4 Security
Some security controls are performed by server:
- Time-out on Listener side
: a system of time-outs is exploited on
the listener thread in order to avoid that incoming
connections with no HTTP request could cause a waste
of resources.
- Network address verifying
: when a new worker is inserted in the
set of those available, Jamming.Net performs a
control that it has a valid network address, and
that there is a www server on the given port. This
is to prevent an erroneous configuration.
2.5 System Cache
In order to reduce communication latency for small TCP-IP connections, a simple LRU cache mechanism
was implemented. Each store & forward thread accesses cache concurrently by
using a mutex mechanism. Cached data is stored in the local file system and loaded in memory
by means of a hash table. It is possible to configure: data TTL, maximum dimension of cached data
and maximum cache dimension. By adopting a very small cache substantially improves the achieved
performance.
2.6 Workers Architecture: Replication or Shared File System
Workers are machines running a WWW server. There are two possible
strategies for mantaining the content of these servers. The information published
could be:
- Replicated
: for example
on a Unix System using the "rdist" [5]
mechanism
- Shared
: for example on
an Unix System using NFS or AFS [6], on NT windows
via SMB [7]
Workers could be placed on the same LAN of Jamming.Net server or, improving performances, on a
dedicated LAN. In such cases, the machine running Jamming.Net must act as a router for this
auxiliary LAN. We can note that in these architectures Jamming.Net could act as a simple firewall.
3 Management Applet
The whole server configuration is performed by using an applet written by using JDK1.0 and
consequently running on most common browsers. It is possible to modify the inner configuration of
Jamming.Net without stopping and restarting the server, by using the applet. We can see a subset
of the applet's tabbed panels in fig. 3, they are used to group together server functionality:
- Workers' Control
: this
panel gives to the administrator the possibility to
modify the workers' set known by Jamming.Net.
- Balance Technique
: this
panel allows the administrator to select the
balancing policy used by Jamming.Net
- Fault Panel
: this panel
permits the administrator to select what to do if a
worker fault is raised, what kind of sampling to
perform (synchronous and/or asynchronous) and to
choose sampling intervals.
- Cache Panel: this panel permits the administrator to configure and to manage system
cache, to choose maximum occupation allowed in file system, maximum file dimension and data TTL.
Interactions with server are performed by using three buttons. The Submit button
submits to the server the choices selected in the applet. The Resync button loads
from server the actual configuration. The Status button loads the server's status
(number of faults and statistics).
4 Current State of Work and Future WorkCurrently, we have a working system that
meets the requirements described above. Jamming.Net is
used to share traffic on a test bed architecture
composed of three workers running Linux on a
bi-processor Pentium with Apache[8] used as HTTP server.
We are testing what is the maximum throughput reached by
the distributed architecture under different load
conditions, different dimension of data requested and
different balancing policy. Jamming.Net's code is
available on request and the executable was accepted by
Java Application Catalyst Italia (maintained by Sun
Italy), Gamelan (maintained by Developer.com), JavaToys
and JARS. We believe that this work could point out the
general need to move from an unique information server
to a scenario of distributed servers' set.
5 Acknowledgments
I want to thank Dr. R. Perego and Dr. S. Orlando because they have suggested me some ideas
behind Jamming.Net. I also thank Prof. G. Attardi, all Agent Group in CS Multimedia Lab, all
the cool people belonging to Studio I-tech, Dr. Dario Lupi, Nico Tranquilli and Dr. Paola Padella.
They all gave me critic review about this work.
6 References
[1] A Distributed Testbed for National Information Provisioning, http://ircache.nlanr.net/Cache/
[2] Java Computing, http://www.sun.com/java/
[3] JavaSoft: Technical Articles, http://developer.javasoft.com/developer/technicalArticles/#network
[4] ONE-IP:Techniques for Hosting a Service on a Cluster of Machines,
http://www6.nttlabs.com/HyperNews/get/PAPER196.html
[5] Distributing files remotely over TCP/IP, http://obgyn.umsmed.edu:457/NetAdminG/rdistN.rdist.html
[6] Transarc AFS Documentation, http://eis.jpl.nasa.gov/afs/transarc.com/public/www/Public/Documentation/afs_doc.html
[7] SMB Resources and Information, http://www.ns.aus.com/smb-res.html
[8] Apache's site, http://www.apache.org |