HPC Challenge Benchmark

Original author(s): Innovative Computing Laboratory, University of Tennessee
Stable release: 1.5.0a
Development status: Active
Platform: Cross-platform
License: BSD
Website: http://icl.cs.utk.edu/hpcc/

The HPC Challenge Benchmark combines several benchmarks to test a number of independent performance attributes of high-performance computing (HPC) systems. The project has been co-sponsored by the DARPA High Productivity Computing Systems program, the United States Department of Energy, and the National Science Foundation.[1]

Context

The performance of complex applications on HPC systems can depend on a variety of independent performance attributes of the hardware. The HPC Challenge Benchmark is an effort to improve visibility into this multidimensional space by combining the measurement of several of these attributes into a single program.

Although the performance attributes of interest are not specific to any particular computer architecture, the reference implementation of the HPC Challenge Benchmark, written in C and MPI, assumes that the system under test is a cluster of shared-memory multiprocessor systems connected by a network. Because of this assumed hierarchical structure, most of the tests are run in several different modes of operation. Following the notation used by the benchmark reports, results labeled "single" mean that the test was run on one randomly chosen processor in the system; results labeled "star" mean that an independent copy of the test was run concurrently on each processor in the system; and results labeled "global" mean that all the processors worked in coordination to solve a single problem (with data distributed across the nodes of the system).
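The three modes can be illustrated with a small sequential simulation. This is only a sketch: the function names (run_kernel, single, star, global_mode) and the sum-of-squares kernel are hypothetical stand-ins, and a real run would use MPI processes executing concurrently rather than a Python loop over rank numbers.

```python
import random

def run_kernel(data):
    """Stand-in for a benchmark kernel (hypothetical): sum of squares."""
    return sum(x * x for x in data)

def single(num_procs, data, rng):
    """'single': one randomly chosen processor runs the test; the others idle."""
    chosen = rng.randrange(num_procs)
    return {chosen: run_kernel(data)}

def star(num_procs, data):
    """'star': every processor runs its own independent copy of the test."""
    return {rank: run_kernel(data) for rank in range(num_procs)}

def global_mode(num_procs, data):
    """'global': all processors cooperate on one problem with distributed data."""
    # Distribute the data across the processors (round-robin here, by assumption).
    chunks = [data[rank::num_procs] for rank in range(num_procs)]
    # Each processor works on its local chunk.
    partials = [run_kernel(chunk) for chunk in chunks]
    # Combine the partial results, as a reduction would in an MPI program.
    return sum(partials)

data = list(range(8))
single_result = single(4, data, random.Random(0))
star_result = star(4, data)
global_result = global_mode(4, data)
```

In this sketch "single" and "star" produce one result per participating processor, while "global" produces a single result for the whole system, which mirrors how the benchmark reports distinguish per-processor figures from system-wide ones.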

Components

The benchmark consists of seven tests (with the modes of operation indicated for each):

  1. HPL[2] (High Performance LINPACK) - measures performance of a solver for a dense system of linear equations (global).
  2. DGEMM - measures performance for matrix-matrix multiplication (single, star).
  3. STREAM[3] - measures sustained bandwidth to and from memory (single, star).
  4. PTRANS - measures the rate at which the system can transpose a large array (global).
  5. RandomAccess - measures the rate of 64-bit updates to randomly selected elements of a large table (single, star, global).
  6. FFT - performs a Fast Fourier Transform on a large one-dimensional vector using the generalized Cooley-Tukey algorithm (single, star, global).
  7. Communication Bandwidth and Latency - MPI-centric performance measurements based on the b_eff[4] bandwidth/latency benchmark.
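Two of the kernels listed above are simple enough to sketch in a few lines. The code below is illustrative only and is not the reference implementation: the function names are hypothetical, the shift-register generator and its feedback constant POLY are assumptions chosen for the example, and the "triad" form of the STREAM kernel is used as a representative memory-bandwidth loop.

```python
MASK64 = (1 << 64) - 1
POLY = 0x7  # feedback constant for the illustrative generator (assumption)

def next_random(ran):
    """Advance a 64-bit shift-register pseudo-random generator by one step."""
    high_bit = ran >> 63
    return ((ran << 1) & MASK64) ^ (POLY if high_bit else 0)

def random_access(table, num_updates, seed=1):
    """RandomAccess-style kernel: XOR 64-bit values into pseudo-randomly
    selected entries of a table whose length is a power of two."""
    size = len(table)
    assert size & (size - 1) == 0, "table length must be a power of two"
    ran = seed
    for _ in range(num_updates):
        ran = next_random(ran)
        table[ran & (size - 1)] ^= ran  # low bits of the value select the entry
    return table

def stream_triad(b, c, scalar):
    """STREAM-style triad: a[j] = b[j] + scalar * c[j], one pass over memory."""
    return [bj + scalar * cj for bj, cj in zip(b, c)]

table = list(range(16))
random_access(table, 64)   # apply a sequence of updates
random_access(table, 64)   # the same sequence again undoes it (XOR is self-inverse)
triad = stream_triad([1.0, 2.0], [3.0, 4.0], 2.0)
```

The contrast between the two loops is the point of including both tests in one suite: the triad streams through memory sequentially, whereas the table updates touch memory in an unpredictable order, so the same hardware can score very differently on each.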

Performance Attributes

At a high level, the tests are intended to provide coverage of four important attributes of performance: double-precision floating-point arithmetic (DGEMM and HPL), local memory bandwidth (STREAM), network bandwidth for "large" messages (PTRANS, RandomAccess, FFT, b_eff), and network bandwidth for "small" messages (RandomAccess, b_eff). Some of the codes are more complex than others and can have additional performance sensitivities. For example, in some systems HPL performance can be limited by network bandwidth and/or network latency.
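The coverage described above can be written down as a small lookup table. This is a sketch for illustration: the dictionary keys and the helper name attributes_by_test are hypothetical, and the table records only the primary attribute-to-test mapping stated in the text, not secondary sensitivities such as HPL's dependence on the network.

```python
# Primary performance attribute -> tests intended to cover it (from the text).
ATTRIBUTE_TESTS = {
    "double-precision floating-point arithmetic": ["DGEMM", "HPL"],
    "local memory bandwidth": ["STREAM"],
    "network bandwidth, large messages": ["PTRANS", "RandomAccess", "FFT", "b_eff"],
    "network bandwidth, small messages": ["RandomAccess", "b_eff"],
}

def attributes_by_test(mapping):
    """Invert the table: for each test, list the attributes it covers."""
    inverted = {}
    for attribute, tests in mapping.items():
        for test in tests:
            inverted.setdefault(test, []).append(attribute)
    return inverted

by_test = attributes_by_test(ATTRIBUTE_TESTS)
# Tests that appear under more than one attribute.
multi_attribute = sorted(t for t, attrs in by_test.items() if len(attrs) > 1)
```

Inverting the table shows that RandomAccess and b_eff are the tests listed under both network attributes.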

Competition

The annual HPC Challenge Award Competition at the Supercomputing Conference focuses on four of the most challenging benchmarks in the suite.

There are two classes of awards:

  • Class 1: Best performance on a base or optimized run submitted to the HPC Challenge website.[5]
  • Class 2: Most "elegant" implementation of four or five computational kernels including three or more of the HPC Challenge benchmarks.[6]

References

  1. [citation unavailable]
  2. [citation unavailable]
  3. [citation unavailable]
  4. [citation unavailable]
  5. The benchmark is designed to allow replacement of a limited set of functions with more highly optimized versions while remaining a "base" run. Additional (but still limited) modifications are allowed under the category of "optimized" runs.
  6. [citation unavailable]
