Q-Logic IB6054601-00 D Switch User Manual


Appendix A

Benchmark Programs

Several MPI performance measurement programs are installed from the
mpi-benchmark RPM. This Appendix describes these useful benchmarks and how
to run them. These programs are based on code from the group of Dr. Dhabaleswar
K. Panda at the Network-Based Computing Laboratory at the Ohio State University.

For more information, see:

http://nowlab.cis.ohio-state.edu/

These programs allow you to measure the MPI latency and bandwidth between two
or more nodes in your cluster. Both the executables and their source are
shipped. The executables are shipped in the mpi-benchmark RPM and installed
under /usr/bin. The source is shipped in the mpi-devel RPM and installed under
/usr/share/mpich/examples/performance.

The examples given below are intended only to show the syntax for invoking these
programs and the meaning of the output. They are NOT representations of actual
InfiniPath performance characteristics.

A.1 Benchmark 1: Measuring MPI Latency Between Two Nodes

In the MPI community, latency for a message of a given size is defined as the
time between a node program’s call to MPI_Send and the return of the
corresponding MPI_Recv in the receiving node program. By latency alone, without
a qualifying message size, we mean the latency for a message of size zero.
This latency represents the minimum overhead for sending messages, due both to
software overhead and to delays in the electronics of the fabric. To simplify the
timing measurement, latencies are usually measured with a ping-pong method,
timing a round trip and dividing by two.
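
The ping-pong estimate can be written as a single expression. This is a sketch under two assumptions that are ours, not the manual's: the symbols t_start and t_end denote the clock readings the initiating process takes just before its send and just after the echoed message returns, and the outbound and return paths are taken to cost the same, which is what justifies the division by two.

```latex
t_{\text{latency}} \approx \frac{t_{\text{end}} - t_{\text{start}}}{2}
```

For example, a measured round trip of 4 microseconds would be reported as a latency of 2 microseconds. The figure is purely illustrative and is not a measurement of any fabric.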

The program osu_latency, from Ohio State University, measures the latency for a
range of message sizes from 0 to 4 megabytes. It uses a ping-pong method, in
which the rank 0 process initiates a series of sends and the rank 1 process echoes
them back, using the blocking MPI send and receive calls for all operations. Half
the time interval observed by the rank 0 process for each such exchange is a
measure of the latency for messages of that size, as defined above. The program
uses a loop, executing many such exchanges for each message size, in order to
get an average. It defers the timing until the message has been sent and received
a number of times, in order to be sure that all the caches in the pipeline have been
filled.
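
The measurement loop described above can be sketched in a few lines. This is not the osu_latency source and it does not use MPI: as a stand-in, two threads in one process exchange messages over a local multiprocessing.Pipe, so the numbers it prints describe only that local channel. The names echo and ping_pong_latency, and the default values of iterations and skip, are hypothetical choices made for this illustration.

```python
import threading
import time
from multiprocessing import Pipe


def echo(conn, total):
    """Play the rank 1 role: send every received message straight back."""
    for _ in range(total):
        conn.send_bytes(conn.recv_bytes())


def ping_pong_latency(size, iterations=1000, skip=100):
    """Return the average one-way latency, in microseconds, for size-byte messages.

    iterations and skip are arbitrary illustrative defaults (assumptions).
    """
    near, far = Pipe()  # duplex local channel standing in for the fabric
    peer = threading.Thread(target=echo, args=(far, skip + iterations))
    peer.start()

    payload = bytes(size)
    start = time.perf_counter()
    for i in range(skip + iterations):
        if i == skip:
            # Warm-up is over: restart the clock so that only the
            # remaining exchanges are timed.
            start = time.perf_counter()
        near.send_bytes(payload)  # blocking send (the "ping")
        near.recv_bytes()         # blocking receive of the echo (the "pong")
    elapsed = time.perf_counter() - start

    peer.join()
    near.close()
    far.close()
    # Each exchange is a round trip, so halve it to estimate one-way latency.
    return elapsed / (2 * iterations) * 1e6


if __name__ == "__main__":
    for size in (0, 1, 1024, 65536):
        print(f"{size:>8} bytes  {ping_pong_latency(size):10.2f} us")
```

The first skip exchanges play the part of the deferred timing: they run before the clock is restarted, so one-time start-up costs do not enter the average.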
