A long-standing challenge in distributed-memory parallel computing is mapping an algorithm onto the varied network topologies of modern high-performance computing systems. The challenge grows with the advent of tera-scale computing, where localizing communication is essential for scalable algorithms. Many scientific algorithms map efficiently onto a virtual 2D or 3D mesh. That virtual mesh can in turn be mapped onto most hardware network topologies so that neighboring nodes on the virtual mesh have direct hardware connections. The goal of this research is to provide, automatically and at run time, the best mapping of a 2D or 3D virtual mesh to the underlying network topology.
A tool called NodeMap will be created to determine the topology of a network automatically using a variety of techniques. These may include calling gethostname() to identify processes on the same SMP system, using custom functions such as myphysnode() on the Cray T3E when available, reading static configuration information where possible, and making simple latency and bandwidth measurements if necessary. NodeMap will also be capable of determining the best mapping of a Cartesian topology onto the given nodes.
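The hostname technique can be illustrated with a minimal sketch. It is not NodeMap itself: the function name group_ranks_by_host and the host names "n0" to "n2" are hypothetical, and the list of names stands in for what the processes would exchange among themselves in a real parallel run.

```python
import socket
from collections import defaultdict


def group_ranks_by_host(hostnames):
    """Map each hostname to the list of ranks that reported it.

    hostnames[i] is the name reported by process rank i.
    Ranks sharing a name are assumed to share one SMP system.
    """
    groups = defaultdict(list)
    for rank, host in enumerate(hostnames):
        groups[host].append(rank)
    return dict(groups)


# Each process would obtain its own name with a call like this one
# and then share it with every other process.
local_name = socket.gethostname()

# Hypothetical gathered result: 6 processes on 3 dual-processor nodes.
gathered = ["n0", "n0", "n1", "n1", "n2", "n2"]
nodes = group_ranks_by_host(gathered)
print(nodes)
```

Grouping by name only reveals which processes share a node. It says nothing about how the nodes are connected to each other, which is where static configuration or latency and bandwidth measurements would come in.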
The MPI_Cart_create() function has a reorder flag (currently unused in most implementations) that determines whether the function may renumber the processes to match the virtual mesh to the topology of the network. MPI_Cart_create() will run the NodeMap tool to determine the best mapping of the virtual mesh to the nodes automatically for each run. The mapping will also be tested with global shift operations to find any links where contention may exist. An application programmer who writes an algorithm for a virtual Cartesian mesh will then have the code mapped to the nodes automatically, providing both efficiency and portability across many high-performance computing systems.
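To make the effect of reordering concrete, the following sketch compares two placements of a 4 x 4 virtual mesh on 4 nodes with 4 processes each. Everything in it is an illustrative assumption rather than the proposed implementation: the helper names are hypothetical, the mesh is treated as non-periodic when counting neighbor links, and the shift check counts messages per node pair as a stand-in for the timed global shift operations described above.

```python
from collections import Counter


def row_major(rows, cols):
    """Default placement: mesh coordinate (r, c) gets rank r * cols + c."""
    return [[r * cols + c for c in range(cols)] for r in range(rows)]


def tiled(rows, cols, tile_r, tile_c):
    """Reordered placement: each tile_r x tile_c tile gets contiguous ranks."""
    tiles_per_row = cols // tile_c
    per_tile = tile_r * tile_c
    placement = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            tile = (r // tile_r) * tiles_per_row + (c // tile_c)
            within = (r % tile_r) * tile_c + (c % tile_c)
            placement[r][c] = tile * per_tile + within
    return placement


def off_node_links(placement, rows, cols, node_of):
    """Count mesh-neighbor pairs whose ranks run on different nodes."""
    count = 0
    for r in range(rows):
        for c in range(cols):
            here = node_of[placement[r][c]]
            if c + 1 < cols and node_of[placement[r][c + 1]] != here:
                count += 1
            if r + 1 < rows and node_of[placement[r + 1][c]] != here:
                count += 1
    return count


def shift_load(placement, rows, cols, node_of, dr, dc):
    """Messages per (source node, destination node) for a periodic shift."""
    load = Counter()
    for r in range(rows):
        for c in range(cols):
            src = node_of[placement[r][c]]
            dst = node_of[placement[(r + dr) % rows][(c + dc) % cols]]
            if src != dst:
                load[(src, dst)] += 1
    return load


# Assumed machine: ranks 0-3 on node 0, ranks 4-7 on node 1, and so on.
node_of = [rank // 4 for rank in range(16)]

default_cost = off_node_links(row_major(4, 4), 4, 4, node_of)
tiled_cost = off_node_links(tiled(4, 4, 2, 2), 4, 4, node_of)
print(default_cost, tiled_cost)  # 12 8

# Heaviest node-to-node traffic during a shift by one row.
default_peak = max(shift_load(row_major(4, 4), 4, 4, node_of, 1, 0).values())
tiled_peak = max(shift_load(tiled(4, 4, 2, 2), 4, 4, node_of, 1, 0).values())
print(default_peak, tiled_peak)  # 4 2
```

In this small case the tiled placement leaves fewer neighbor links crossing between nodes and halves the peak traffic on any one node pair during a row shift. A real tool would have to search over candidate placements and measure actual transfer times rather than rely on message counts.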