Communication Libraries

There are many ways to program multiprocessor computers. SMP computers can be programmed using multithreading packages, or using message-passing. Distributed memory systems commonly use message-passing approaches of some sort, often having their own custom communication library at the heart. The one common feature between all modern multiprocessor machines is that they now support the MPI message-passing interface.

While MPI may not be the optimal solution on some systems, it does provide the same functionality and therefore makes MPI codes truly portable across a wide range of platforms. The application programmer must still optimize for the general case in order to make the code run well. For example, mapping a simulation space to a 2D virtual mesh may work well since the 2D virtual mesh maps well onto a wide variety of physical topologies such as a 3D mesh, hypercube, fat tree, etc.

Message-passing libraries

The MPI defines the syntax and functionality of the library calls, and is defined by a committee. It is not a library itself. There are many MPI implementations that meet this standard. Supercomputer vendors usually have their own MPI library that is optimized for their particular machines. There are also several MPI implementations that are distributed freely and work on a variety of architectures, MPICH and LAM being the most commonly used.

Below is a listing of the more common message-passing libraries. Included are two older libraries that are not used much any more due to the popularity of MPI. PVM and TCGMSG are predecessors of MPI, and their time has passed even though some of their codes have not.

MPICH ANL Full MPI for almost every architecture
LAM/MPI Indiana University Full MPI for many architectures
MP_Lite Ames Lab High performance, light weight MPI
MPI/Pro MPI/Pro Commercial MPI
PVM ORNL Parallel Virtual Machine
TCGMSG The Chemistry Group Message system

One-sided communications

Normal message-passing communications are two-sided, require the cooperation of both the sending and receiving nodes. There are also libraries that support one-sided communications, namely gets and puts. These operations allow one node to get data from, or put data to, another node without its cooperation.

This approach certainly sounds superior to using two-sided communications, where it is necessary to coordinate each data transfer on both sides. And it definitely is useful at times. In general, it is not as easy to use as it would seem since it often requires that the application programmer manually perform handshaking that is done automatically with two-sided communications. For example, a node can't get data from another node until the source node signals that the data is ready for transfer, so there are two communication calls needed anyway.

The MPI standard has defined a set of one-sided calls in the newest 2.0 release. At this time, not all of the MPI implementations support these calls, so MPI programs that use them may not be portable for a while.

The SHMEM library developed by Cray is a one-sided library for the Cray T3E and SGI Origin systems. It has become so popular that it is growing into a standard and is being implemented on a variety of platforms. The GPSHMEM library is a general purpose SHMEM library that provides the same one-sided interface but is implemented on top of lower level libraries.

Higher level libraries

There are several higher level libraries built on top of these message-passing libraries that can make life easier for the application programmer. Unfortunately, there is always a trade-off, and in this case it comes in a loss in efficiency. Additional research is needed to more fully understand the full cost of these methods.

The Global Array toolkit makes any distributed memory system look somewhat like an SMP at the application level. A global address space allows data to be stored across the nodes without the application programmer having to worry about the data layout. The data is then stored and retrieved when needed by any node, independent of what node it is actually on. All of the messy message-passing is hidden from the programmer.

The distributed data interface (DDI) of the quantum chemistry code GAMESS is another example of this approach. This layer provides the application with a global view of memory that is abstracted from the underlying message-passing layer.


Links to more advanced topics
ARMCI one-sided communication library
M-VIA OS bypass library
MVICH: MPI on top of M-VIA

Ames Laboratory | Condensed Matter Physics | Disclaimer | ISU Physics