Next: Bibliography Up: Hierarchical Parallel Computation Models Previous: Abstract models for hierarchical

Software tools for hierarchical parallel programming

Parallel applications typically manage huge amounts of data, and must organize it into blocks to reduce the amount of communication as well as of I/O operations. In this second part of the chapter we survey EM and HM programming tools that can be exploited for parallel computation, or that are being integrated with parallel programming tools. Tools are classified with respect to the following criteria.
Sequential HM support
Most tools deal with EM/HM structures on parallel disks, but only in a sequential programming environment.
Parallel HM support
Some tools allow or require parallel coordinated access to the data (e.g. parallel I/O greatly benefits from coordination).
Memory and Parallel structure
Tools implementing different abstractions differ in their expressive constraints.
Caching and Prefetching Approach
Some tools supply a programming interface that allows explicit HM algorithms to be programmed efficiently. Alternatively, application benchmarking and tuning can be used to improve caching and prefetching policies. It is also feasible to design application-specific policies for these tasks.
Licensing Policy
Whether the software is free or commercial, and whether the source code is available.
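
As an illustration of what an explicit, stream-oriented HM programming interface involves, the following is a minimal sketch in Python. It is not the interface of any tool surveyed here: the class `BlockStream`, the helper `write_items` and the function `demo` are hypothetical names, and the item format (8-byte integers) is an assumption made only for the example. The sketch scans a file one block at a time and counts the logical block requests, so that scanning N items with blocks of B items issues ceil(N/B) requests.

```python
import os
import struct
import tempfile

ITEM = struct.Struct("<q")  # assumption: one 8-byte signed integer per item


class BlockStream:
    """Hypothetical stream over a file of fixed-size items, read block by block."""

    def __init__(self, path, block_items):
        self.path = path
        self.block_items = block_items  # B: items per block request
        self.block_reads = 0            # logical block requests issued so far

    def __iter__(self):
        block_bytes = self.block_items * ITEM.size
        with open(self.path, "rb") as f:
            while True:
                block = f.read(block_bytes)  # one logical block request
                if not block:
                    break
                self.block_reads += 1
                for (value,) in ITEM.iter_unpack(block):
                    yield value


def write_items(path, values):
    """Store the values as consecutive fixed-size items."""
    with open(path, "wb") as f:
        for v in values:
            f.write(ITEM.pack(v))


def demo(n=1000, block_items=64):
    """Scan n items and return (sum of items, number of block requests)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_items(path, range(n))
        stream = BlockStream(path, block_items)
        total = sum(stream)
        return total, stream.block_reads
    finally:
        os.remove(path)
```

The counter tracks requests made by the program, not physical disk transfers, since the operating system and the Python runtime buffer file access underneath. A real block transfer engine would control that layer as well.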

The software tools discussed are the following.

The TPIE library [14] simplifies the implementation of algorithms based on the PDM model. It provides data structures and algorithms to solve batched problems, with a strongly stream-oriented interface to the data. Carefully designed block transfer engines manage the disk(s), but interprocess coordination is left to the user. Extension of the library to the full PDM model (with parallel disks) is planned. In a parallel environment, TPIE currently gives each process efficient access to data on its local disk.
LEDA [15] is a commercial library of data structures and algorithms for combinatorial and geometric computation. Its secondary memory extension LEDA-sm [16] is publicly available under the GPL license. The implementation relies on a library kernel (the EM manager), which implements a PDM abstraction and programming interface over concrete disk devices.
The Vic* compiler [17] uses the PDM model as a reference, and its support has been implemented over a set of different sequential and parallel architectures. It has been used to evaluate the actual performance of several sequential and parallel PDM algorithms; see for instance [18].
The aforementioned Paderborn PUB BSP library [19] can support BSP* and D-BSP algorithms.
The MPI-2 standard specifies a parallel I/O interface. Programs written using MPI can exploit message-passing parallelism and shared disk space while remaining largely portable. A full discussion of parallel file systems is not appropriate here; we summarize the MPI-IO approach and its rationale [20,21], and we mention an example of the unexpected interactions between program access patterns and the optimizations made at different data management levels [22].
The SIMPLE programming environment [23] unifies into a single programming interface the support for shared-memory (SMP) and message-passing hardware (clusters). All-to-all, barrier and other parallel primitives are implemented so as to exploit the underlying (unrestricted, heterogeneous) architecture in the best way.
GMS is a software layer to support global data caching within the aggregate memory of a network. PGMS [24] adds prefetching by exploiting program-provided hints. The system exploits a three-level memory hierarchy (disks, global memory, local memory).
TIP [25] is another system that exploits application-provided hints for prefetching and caching. The work in [26] describes a tool that generates prefetching hints for future I/O by speculative execution.
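
The idea of hint-driven prefetching mentioned for the last tools can be sketched as follows. This is only an illustrative toy in Python, not the interface of PGMS or TIP: `HintedCache`, its `hint` and `read` methods, and the `run` driver are hypothetical names, the cache uses plain LRU replacement as an assumption, and the slower memory level is stood in for by a dictionary. Prefetching here is synchronous, so the sketch only shows how hints turn demand misses into prefetches; a real system would overlap the prefetch I/O with computation.

```python
from collections import OrderedDict


class HintedCache:
    """Hypothetical LRU block cache that accepts application-provided hints."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.store = backing_store   # dict standing in for the slower level
        self.blocks = OrderedDict()  # cached blocks, least recently used first
        self.demand_misses = 0       # blocks fetched only when read
        self.prefetches = 0          # blocks fetched because of a hint

    def _install(self, block_id):
        self.blocks[block_id] = self.store[block_id]
        self.blocks.move_to_end(block_id)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used

    def hint(self, block_ids):
        """The application discloses blocks it expects to read soon."""
        for block_id in block_ids:
            if block_id not in self.blocks:
                self.prefetches += 1
                self._install(block_id)

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)
        else:
            self.demand_misses += 1
            self._install(block_id)
        return self.blocks[block_id]


def run(use_hints):
    """Read 8 blocks in order; return (demand misses, prefetches)."""
    store = {i: f"block-{i}" for i in range(8)}
    cache = HintedCache(capacity=4, backing_store=store)
    for i in range(0, 8, 2):
        if use_hints:
            cache.hint([i, i + 1])  # disclose the next two accesses
        cache.read(i)
        cache.read(i + 1)
    return cache.demand_misses, cache.prefetches
```

Without hints every access in this scan is a demand miss; with hints the same blocks are fetched ahead of use, which is where a real implementation gains by hiding I/O latency.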

Massimo Coppola 2002-02-08