MP_Lite Performance using Channel-Bonded Gigabit Ethernet

          Channel bonding stripes the data from each message across multiple network interfaces. Bonding multiple Gigabit Ethernet cards in a cluster can increase throughput dramatically without adding much to the cost of the cluster. Unfortunately, good performance is not easy to achieve in practice.

          The Linux kernel can do low-level channel bonding. This works well at Fast Ethernet speeds, where 2 cards can double the throughput, but it is not yet tuned for Gigabit speeds. The graph below shows that using 2 GigE cards actually produces poorer performance than using a single GigE card.

          MP_Lite can do channel bonding at a higher level by striping the data from a single message across multiple sockets set up between each pair of computers. The algorithm tries to hide latency by growing the striped chunks exponentially: it starts with small chunks to get each interface primed, then doubles the chunk size each time. This approach is flexible and works on any Unix system, but it will always lose some potential performance to the higher latency involved.
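
          The chunk schedule described above can be sketched roughly as follows. This is an illustration only, not the MP_Lite source: the function name stripe_schedule, the 1024-byte starting chunk, and the choice to double the chunk size once per round over all channels (rather than after every chunk) are assumptions made for the example.

```python
def stripe_schedule(nbytes, n_channels, first_chunk=1024):
    """Plan how one message is striped across several sockets.

    Returns a list of (channel, offset, length) tuples that together
    cover the message exactly once. Hypothetical helper: the starting
    chunk size and per-round doubling are assumptions, not MP_Lite's
    actual tuning.
    """
    if nbytes < 0 or n_channels < 1 or first_chunk < 1:
        raise ValueError("invalid message size, channel count, or chunk size")

    schedule = []
    offset = 0
    chunk = first_chunk
    while offset < nbytes:
        # One round: hand the next chunk to each channel in turn.
        for channel in range(n_channels):
            if offset >= nbytes:
                break
            length = min(chunk, nbytes - offset)  # last chunk may be short
            schedule.append((channel, offset, length))
            offset += length
        # Double the chunk size so later, larger transfers hide the latency.
        chunk *= 2
    return schedule


if __name__ == "__main__":
    for channel, offset, length in stripe_schedule(10000, 2):
        print(f"channel {channel}: bytes {offset}..{offset + length - 1}")
```

          In a real sender, each tuple would correspond to one write on the socket bound to that interface, and the receiver would follow the same schedule to reassemble the message.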

          With careful tuning of the algorithm, a nearly ideal doubling of the throughput has been achieved using 2 GigE cards. Adding a 3rd GigE card produces only a minimal further benefit.

          The best approach would be to fix the problems in the Linux kernel bonding.c module so that bonding is done at a much lower level, where latency plays less of a role. This would pass the benefits of channel bonding on to full MPI and PVM implementations without requiring any modifications.