In recent years, clusters of multiprocessor machines have become one of
the most important parallel architectures. The architecture was promoted by
the ASCI (Accelerated Strategic Computing Initiative) project [1].
There are economic as well as performance reasons for this promotion.
The building blocks of such supercomputers are multiprocessor machines
(nodes). Each node consists of a number of processors (2-64) that share
a common memory. The nodes are standard machines or are built from standard
components, and can therefore be produced cheaply. They are connected
by a fast interconnection network. Depending on the network topology,
the parallel hierarchy has at least two levels.
Processors in the same node can communicate
very quickly using the shared memory, while the communication between
processors of different nodes has to be done over the slower network.
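
The two-level hierarchy described above can be illustrated with a minimal
sketch. It assumes that processors are numbered consecutively and that each
node holds the same number of processors; the function names and this
numbering scheme are hypothetical and serve only as illustration.

```python
# Minimal sketch of the two communication levels in a cluster of
# multiprocessor nodes. Assumption (not from the survey): processors are
# numbered 0, 1, 2, ... and every node holds `procs_per_node` of them.

def node_of(proc, procs_per_node):
    """Return the index of the node that hosts processor `proc`."""
    return proc // procs_per_node


def comm_level(src, dst, procs_per_node):
    """Classify the path a message from `src` to `dst` has to take."""
    if src == dst:
        return "local"
    if node_of(src, procs_per_node) == node_of(dst, procs_per_node):
        # Same node: the processors can exchange data via shared memory.
        return "shared-memory"
    # Different nodes: the data has to cross the interconnection network.
    return "network"


if __name__ == "__main__":
    # Example with 4 processors per node: 0-3 on node 0, 4-7 on node 1.
    for src, dst in [(0, 3), (3, 4), (5, 6)]:
        print(src, dst, comm_level(src, dst, 4))
```

An algorithm designed for such a platform would typically try to keep as much
communication as possible at the shared-memory level.
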
The architecture is highly scalable: more nodes, or even nodes of different
types, can be added step by step. It thus permits building systems with a
good price-performance ratio that can also compete with other systems in
terms of peak performance. According to the current Top500 list [35], more
than 30% of the fastest supercomputers in the world belong to this kind of
architecture. The following is a survey of endeavours to exploit the hybrid
architecture in order to achieve the peak performance of the hardware. The
survey is structured as follows.
The first section gives a more detailed overview of the considered architectures
and their position in the world of supercomputing. The second section
presents models and methodologies for developing efficient algorithms on the
target platform. Section three shows which languages and libraries
are used in practice to program the systems. The last section deals with
algorithms, libraries and applications that are specifically built and
optimized for clusters of multiprocessors.