MPUTIL_INIT

Initialize the MPUTIL timing environment

Input Parameter

thread_level
Indicates the desired level of thread support. See Notes below.

Notes

When benchmarking codes on SMP nodes, it is common to run the benchmark on a single core. However, for many applications this is not the configuration that matters: the application typically runs the same (or similar) code on all cores at the same time. The MPUTIL_XXX macros provide a relatively easy way to run the same single-core benchmark on 1, 2, 3, ..., k cores of a node with k total cores. Additional macros make it easy to generate simple tables with a separate column for each number of active cores. See MPUTIL_LABEL and MPUTIL_OUTAPP for output.

A typical program that uses these macros will have the following structure:

   ...
   MPUTIL_INIT(0);
   ...any code that is executed only once, independent of the number of cores
   MPUTIL_BEGIN;
   ...any code that each active core must execute, such as initialization
   MPUTIL_LABEL("label text for row, in printf format");
   MPUTIL_SYNC;
   ... benchmark code, including timer calls.  Total time in tval
   MPUTIL_OUTAPP("\t%.2e\n",tval);
   MPUTIL_END;
   MPUTIL_FINALIZE;
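
As a rough illustration of the "benchmark code, including timer calls" step in the structure above, the following sketch times a simple loop and returns the elapsed time that would be stored in tval. The helper names wall_seconds and time_axpy, the kernel itself, and the use of the C11 timespec_get timer are assumptions made for this example; they are not part of MPUTIL, and a real benchmark may prefer a different timer.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not part of MPUTIL): wall-clock time in seconds,
   read with the C11 timespec_get function. */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + 1.0e-9 * (double)ts.tv_nsec;
}

/* Hypothetical single-core kernel: y[i] += a * x[i], repeated reps times.
   Returns the total elapsed time in seconds for all repetitions. */
static double time_axpy(size_t n, int reps, double a,
                        const double *x, double *y)
{
    double t0 = wall_seconds();
    for (int r = 0; r < reps; r++) {
        for (size_t i = 0; i < n; i++) {
            y[i] += a * x[i];
        }
    }
    return wall_seconds() - t0;
}
```

In a program following the structure above, the value returned by time_axpy would be assigned to tval between MPUTIL_SYNC and MPUTIL_OUTAPP.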

The thread_level is used only when creating the parallel version, which uses MPI processes to execute the benchmark in parallel. To simplify the interface, thread_level does not use the MPI-defined levels (thus, the benchmark never needs to include mpi.h). The valid values are 0 for no threads (MPI_THREAD_SINGLE), 1 for threads used in loops (MPI_THREAD_FUNNELED), and 2 for benchmarks that need MPI_THREAD_MULTIPLE. Most regular benchmarks will use 0, but OpenMP benchmarks will need to set thread_level to 1.
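
The mapping from thread_level to the MPI thread-support levels described above can be sketched as a small lookup. This is only an illustration of the documented correspondence, not the actual MPUTIL implementation; the enum and the function sketch_map_thread_level are hypothetical names, and the enum values are stand-ins rather than the constants defined in mpi.h.

```c
/* Stand-ins for the MPI thread-support levels, so that this sketch does
   not need mpi.h. The names and values are illustrative only. */
typedef enum {
    SKETCH_THREAD_SINGLE,    /* corresponds to MPI_THREAD_SINGLE   */
    SKETCH_THREAD_FUNNELED,  /* corresponds to MPI_THREAD_FUNNELED */
    SKETCH_THREAD_MULTIPLE,  /* corresponds to MPI_THREAD_MULTIPLE */
    SKETCH_THREAD_INVALID    /* thread_level outside 0..2          */
} sketch_thread_support;

/* Hypothetical helper: translate the MPUTIL thread_level argument
   (0, 1, or 2) into the thread-support level it stands for. */
static sketch_thread_support sketch_map_thread_level(int thread_level)
{
    switch (thread_level) {
    case 0:  return SKETCH_THREAD_SINGLE;    /* no threads used       */
    case 1:  return SKETCH_THREAD_FUNNELED;  /* threads used in loops */
    case 2:  return SKETCH_THREAD_MULTIPLE;  /* full multithreading   */
    default: return SKETCH_THREAD_INVALID;
    }
}
```

Under this mapping, an OpenMP benchmark would call MPUTIL_INIT(1), while a benchmark that uses no threads would call MPUTIL_INIT(0).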