Introduction to MP_Lite:
MP_Lite is a message-passing library similar in its purpose and capabilities
to MPI implementations or PVM. Its advantage is that it is a very small package that
you can take anywhere and compile in under a minute. Its streamlined design allows it
to outperform other packages on some of the platforms supported. On platforms not
specifically supported, the MP_Lite syntax can be run on top of MPI without loss of efficiency.
This introduction walks you through the steps needed to compile the library,
link it to your code or one of the test programs, and run it. If you run into a
problem that is not covered here, check the problems link on the MP_Lite main page
or, as a last resort, email me at turner@ameslab.gov.
Minimum characteristics
In order to run MP_Lite, you must be in an environment that provides certain
elements of a single-system image. The nodes must share a common file system and
they must be able to communicate freely.
This is the case on traditional massively parallel processing (MPP) machines such as the
Cray T3E, Intel Paragon, or IBM SP.
On workstation/PC clusters or a loose set of workstations/PCs, the machines must run
the same operating system, share a common file system,
and you must be able to ssh or rsh between machines
without being prompted for a password.
To test this, run 'ssh other_host_name ls', which should list the
contents of your home directory on the other machine without prompting you for a
password.
If you have problems, see the man pages for ssh or rsh to set up the
appropriate .rhosts file, or talk with your system administrator.
The mprun script that starts a job uses ssh (or rsh) to start the
individual programs on each node.
If ssh is disabled on your systems in favor of rsh, change the
'set $SH=ssh' line at the beginning of the bin/mprun script to use rsh instead.
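Before launching a job, it can help to run the passwordless-login test described above against every host in one pass. The following is a minimal sketch and not part of MP_Lite: check_hosts and the REMOTE_SH variable are hypothetical names, and the default assumes an OpenSSH-style client whose BatchMode option makes ssh fail instead of prompting for a password.

```shell
# Hypothetical helper (not part of MP_Lite): report which hosts accept a
# remote 'ls' without a password prompt.
# REMOTE_SH is an assumed variable; its default relies on OpenSSH's
# BatchMode option, which makes ssh fail rather than prompt.
check_hosts() {
    status=0
    for host in "$@"; do
        # The expansion is left unquoted on purpose so the options split into words.
        if ${REMOTE_SH:-ssh -o BatchMode=yes} "$host" ls </dev/null >/dev/null 2>&1; then
            echo "ok $host"
        else
            echo "FAILED $host"
            status=1
        fi
    done
    return $status
}
```

For example, 'check_hosts host1 host2' prints one line per host and returns a non-zero status if any host failed. On systems that use rsh, set REMOTE_SH=rsh first.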
Uncompressing the tar file
Download the MP_Lite.tar.gz file from
the MP_Lite homepage, then unpack it:
gunzip MP_Lite.tar.gz
tar -xvf MP_Lite.tar
cd MP_Lite
NOTE: You may want to add the path to the MP_Lite/bin directory to
your $path environment variable.
Compiling MP_Lite
When compiling MP_Lite, you must specify the native communication library that
is available on the system. The supported platforms are listed below, with an
explanation of the options for each. Compilation usually takes under a minute.
Unix Workstations/PCs
make tcp uses a SIGIO interrupt-driven method
that guarantees message progress at all times.
Performance can often be improved by increasing the TCP socket buffer
sizes. MP_Lite does this automatically, but you may need to tune the
system to allow for larger buffer sizes.
Performance is very poor under AIX because a SIGIO interrupt takes a long
time (50 ms) to propagate.
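The socket buffer tuning mentioned above depends on limits set by the operating system. As a rough illustration for Linux only, the sketch below prints the system-wide caps on receive and send socket buffers. The helper name show_socket_buffer_limits is hypothetical, the /proc/sys/net/core location is Linux-specific, and other systems expose these limits differently.

```shell
# Hypothetical helper (not part of MP_Lite): print the Linux caps on
# socket buffer sizes. The directory argument defaults to the standard
# Linux location and exists mainly so the function can be tried elsewhere.
show_socket_buffer_limits() {
    dir=${1:-/proc/sys/net/core}
    for name in rmem_max wmem_max; do
        if [ -r "$dir/$name" ]; then
            echo "$name = $(cat "$dir/$name") bytes"
        else
            echo "$name: not available in $dir"
        fi
    done
}
```

If the reported values are small, an administrator can usually raise them with sysctl (for example the net.core.rmem_max and net.core.wmem_max keys). Suitable values depend on your network and are not specified here.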
SMP systems
The TCP module contains support for both message-passing using
sockets and SMP message-passing using shared-memory segments.
make tcp will force all communications (on-node and off-node)
to use sockets.
#define USE_SHM in tcp.c to enable use
of a memory segment to handle SMP communications. This currently cannot
be used in conjunction with sockets.
Another option for SGI SMP systems that has not been fully tested is to
use the native shmem library by doing a make shmem.
Cray T3E
The Cray T3E version runs on top of the native shmem library.
Use make T3E or make shmem to compile.
Communication rates are up to 320 MB/sec with a 9 µs latency. This is only
slightly better than the current Cray-optimized MPI, so there is not much
reason to use this module on the T3E anymore.
Scalar machines
make scalar gracefully removes all the message-passing calls, allowing you
to run your code on scalar machines without an mprun or mpirun startup script.
The timing functions will still work.
MPI
If you are using the MP_Lite syntax and need to run on a platform not specifically
supported here, you can always compile MP_Lite using make mpi
to run on top of MPI, provided MPI is
installed. In this way, you can use the MP_Lite syntax without sacrificing any
portability. Do not try to run MPI programs on top of MP_Lite that has
been compiled using 'make mpi', as the two levels of MPI will conflict.
Make sure to specify the proper MPI include path and library when linking
your code to the libmplite.a library.
Testing MP_Lite
make pong will compile and link the pong.c test code. To run this, use the
mprun command in the MP_Lite/bin directory.
mprun -np 2 -h host1 host2 pong
This will start pong running between host1 and host2. Type mprun -usage
for a more complete listing of the mprun options.
Linking your code
Once MP_Lite has been compiled, simply compile your code and link in the MP_Lite
library libmplite.a. The pong.c test case above should provide you with an
example of this. If you are running your code with MP_Lite syntax on top of
MPI (you compiled with make mpi), you will also need to link in the MPI library.
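As a rough sketch of the link step, the helper below prints (but does not run) a compile-and-link command for a single C source file. Everything here is an assumption to adapt: mplite_link_cmd and MPLITE_DIR are hypothetical names, and libmplite.a and the MP_Lite headers are assumed to sit at the top level of the directory where MP_Lite was built.

```shell
# Hypothetical helper (not part of MP_Lite): print a link command for a
# program that uses MP_Lite, without running it.
# ASSUMPTIONS: MPLITE_DIR names the directory where MP_Lite was built, and
# libmplite.a and the headers are at its top level. Adjust for your setup.
mplite_link_cmd() {
    src=$1
    out=$2
    dir=${MPLITE_DIR:-$HOME/MP_Lite}
    echo "${CC:-cc} -I$dir -o $out $src $dir/libmplite.a"
}
```

For example, 'mplite_link_cmd myprog.c myprog' prints a command you can inspect and then run. If MP_Lite was compiled with make mpi, append the include path and library flags that your MPI installation requires; these vary between installations.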
Running your code
mprun -np N -h host1 [host2 ...] program args
mprun has many options that make it easy to specify the hosts to run your code on.
Each time it is run, it generates a file .mplite.config with the information from
the command line. This file contains the number of processors on the first line, the
number of network cards on the second line for workstation/PC clusters (> 1 means
channel bonding), the program name and arguments next, followed by a list of the hostnames.
Subsequent mprun commands will start by reading the existing .mplite.config
file, overriding any parameters with those specified on the command line.
In other words, if you
want to run the same configuration as you did last time, simply type mprun with no
parameters. If you want to change only one part, such as running with the same host list
but with fewer nodes, simply specify the new number of nodes (mprun -np new_N).
You can tell mprun to start from an alternate configuration file using the -c parameter.
mprun -c alternate_config ... would start with a config file other than
the default .mplite.config.
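To see what a bare mprun would reuse, you can inspect the configuration file before running. The sketch below prints a labeled summary of such a file. It is not part of MP_Lite: describe_mplite_config is a hypothetical name, and the exact layout (program and arguments on one line, one hostname per line, network card count always present) is an assumption inferred from the description above rather than checked against the mprun script.

```shell
# Hypothetical helper (not part of MP_Lite): summarize an mprun config file.
# ASSUMED layout, inferred from the description and not verified:
#   line 1: number of processors
#   line 2: number of network cards
#   line 3: program name and arguments
#   line 4 onward: one hostname per line
describe_mplite_config() {
    cfg=${1:-.mplite.config}
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 1; }
    {
        IFS= read -r nprocs
        IFS= read -r nics
        IFS= read -r prog
        echo "processes: $nprocs"
        echo "network cards: $nics"
        echo "program: $prog"
        # Also accept a final line that lacks a trailing newline.
        while IFS= read -r host || [ -n "$host" ]; do
            if [ -n "$host" ]; then
                echo "host: $host"
            fi
        done
    } < "$cfg"
}
```

Run it with no argument to read .mplite.config in the current directory, or pass the name of an alternate configuration file.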
For workstation/PC clusters where the node names are similar but incremented (node0,
node1, ...), you can specify an anchor point instead of listing all hostnames
(mprun -np N -a node3 prog args would run on N nodes starting with node3).
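The anchor idea can be illustrated with a small sketch that expands a starting hostname and a count into consecutive hostnames. This is an illustration of the naming pattern only, not the code mprun uses: expand_anchor is a hypothetical name, and the sketch assumes hostnames end in a plain decimal number without zero padding.

```shell
# Hypothetical helper (not part of MP_Lite): list COUNT consecutive
# hostnames starting at ANCHOR, e.g. node3, node4, node5.
# ASSUMPTION: the hostname ends in a decimal number with no leading zeros.
expand_anchor() {
    anchor=$1
    count=$2
    # Strip the trailing digits to get the prefix, then recover the number.
    prefix=$(printf '%s' "$anchor" | sed 's/[0-9]*$//')
    start=${anchor#"$prefix"}
    if [ -z "$start" ]; then
        echo "anchor '$anchor' has no trailing number" >&2
        return 1
    fi
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "${prefix}$((start + i))"
        i=$((i + 1))
    done
}
```

For example, 'expand_anchor node3 3' prints node3, node4, and node5 on separate lines.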
For SMP systems where the hostname is the same, you can just specify the number of
processes to start and the hostname once.
mprun -np N -smp hostname prog args would run N processes on hostname.
For running on top of MPI (MP_Lite compiled with make mpi), you can still use
mprun to launch your code. Use -mpi as the first parameter, then give the
remaining parameters as usual (mprun -mpi -np N ...).
This will only work with the newer versions of MPI which can handle the
machine list in the command line. For older versions, mprun -mpi2 is set
up to use the -arch command with the machine list in machines.LINUX. You can
modify the mprun script to fit your system if desired.
You may, of course, launch your code using mpirun instead.