HPC Challenge Benchmark

Original author(s)	Innovative Computing Laboratory, University of Tennessee
Stable release	1.5.0a
Development status	Active
Platform	Cross-platform
License	BSD
Website	http://icl.cs.utk.edu/hpcc/

The HPC Challenge Benchmark combines several benchmarks to test a number of independent performance attributes of high-performance computing (HPC) systems. The project has been co-sponsored by the DARPA High Productivity Computing Systems program, the United States Department of Energy and the National Science Foundation.^[1]

Context

The performance of complex applications on HPC systems can depend on a variety of independent performance attributes of the hardware. The HPC Challenge Benchmark is an effort to improve visibility into this multidimensional space by combining the measurement of several of these attributes into a single program.

Although the performance attributes of interest are not specific to any particular computer architecture, the reference implementation of the HPC Challenge Benchmark, written in C and MPI, assumes that the system under test is a cluster of shared-memory multiprocessor systems connected by a network. Because of this assumed hierarchical structure, most of the tests are run in several different modes of operation. Following the notation used by the benchmark reports, results labeled "single" mean that the test was run on one randomly chosen processor in the system, results labeled "star" mean that an independent copy of the test was run concurrently on each processor in the system, and results labeled "global" mean that all the processors worked in coordination to solve a single problem (with data distributed across the nodes of the system).
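
The difference between the three modes can be illustrated with a minimal Python sketch. It is not taken from the reference implementation: the processors are simulated sequentially rather than with MPI, and the function names (run_single, run_star, run_global) and the stand-in kernel are hypothetical.

```python
import random


def local_kernel(values):
    """Stand-in for a benchmark kernel: any per-processor computation would do."""
    return sum(v * v for v in values)


def run_single(num_procs, problem, rng):
    """'single': one randomly chosen processor runs the test, the others idle."""
    chosen = rng.randrange(num_procs)
    return {chosen: local_kernel(problem)}


def run_star(num_procs, problem):
    """'star': every processor runs its own independent copy of the test.

    A real run would execute these copies concurrently; here they are
    evaluated one after another for simplicity.
    """
    return {rank: local_kernel(problem) for rank in range(num_procs)}


def run_global(num_procs, problem):
    """'global': one problem whose data is distributed across the processors."""
    # Each simulated processor owns every num_procs-th element of the data.
    partials = [local_kernel(problem[rank::num_procs]) for rank in range(num_procs)]
    # The partial results are combined into a single answer.
    return sum(partials)
```

In this sketch, "single" and "star" each solve the whole problem on a processor, whereas "global" splits one problem so that every processor handles only its share of the data.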

Components

The benchmark currently consists of seven tests (with the modes of operation indicated for each):

HPL^[2] (High Performance LINPACK) - measures performance of a solver for a dense system of linear equations (global).
DGEMM - measures performance for matrix-matrix multiplication (single, star).
STREAM^[3] - measures sustained memory bandwidth (single, star).
PTRANS - measures the rate at which the system can transpose a large array (global).
RandomAccess - measures the rate of 64-bit updates to randomly selected elements of a large table (single, star, global).
FFT - performs a Fast Fourier Transform on a large one-dimensional vector using the generalized Cooley-Tukey algorithm (single, star, global).
Communication Bandwidth and Latency - MPI-centric performance measurements based on the b_eff^[4] bandwidth/latency benchmark.
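
Two of the simpler kernels listed above can be sketched in a few lines of Python. This is an illustrative sketch rather than the benchmark's own code: the Triad operation is the one named in the competition list, while the shift-and-XOR generator, the initial table contents, and all function names are assumptions made for the example.

```python
import time

MASK64 = (1 << 64) - 1  # keep values within 64 bits


def stream_triad(b, c, scalar):
    """STREAM Triad kernel: a[i] = b[i] + scalar * c[i]."""
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bytes_moved(n, word_bytes=8):
    """Bytes counted per Triad pass: two reads (b, c) and one write (a) per element."""
    return 3 * word_bytes * n


def measure_triad_bandwidth(n):
    """Return an approximate bytes-per-second figure for one Triad pass."""
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    stream_triad(b, c, 3.0)
    elapsed = time.perf_counter() - start
    return triad_bytes_moved(n) / elapsed if elapsed > 0 else float("inf")


def next_random(x, poly=0x7):
    """One shift-and-XOR step on a 64-bit value (illustrative generator)."""
    high_bit = x >> 63
    return ((x << 1) & MASK64) ^ (poly if high_bit else 0)


def random_access_updates(log2_size, num_updates, seed=1):
    """XOR pseudo-random 64-bit values into randomly selected table entries."""
    size = 1 << log2_size
    table = list(range(size))  # assumed initial contents: table[i] = i
    x = seed
    for _ in range(num_updates):
        x = next_random(x)
        table[x & (size - 1)] ^= x  # low bits of the value select the entry
    return table
```

The Triad pass touches memory in order, so its speed is governed by sustained bandwidth. The table updates jump to unpredictable addresses, which is why RandomAccess stresses the memory system (and, in global mode, the network) in a very different way.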

Performance Attributes

At a high level, the tests are intended to provide coverage of four important attributes of performance: double-precision floating-point arithmetic (DGEMM and HPL), local memory bandwidth (STREAM), network bandwidth for "large" messages (PTRANS, RandomAccess, FFT, b_eff), and network bandwidth for "small" messages (RandomAccess, b_eff). Some of the codes are more complex than others and can have additional performance sensitivities. For example, in some systems HPL performance can be limited by network bandwidth and/or network latency.
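
As a rough illustration of how a floating-point rate is derived, the sketch below multiplies two square matrices and divides the conventional operation count by the elapsed time. It assumes the usual count of 2n^3 floating-point operations for an n-by-n multiplication (n^3 multiplications and n^3 additions); the pure-Python loop and the function names are hypothetical and will run far slower than an optimized DGEMM.

```python
import time


def matmul(a, b):
    """Naive matrix-matrix multiplication of two lists of rows."""
    inner = len(b)
    cols = len(b[0])
    result = []
    for row in a:
        out_row = []
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += row[k] * b[k][j]  # one multiply and one add
            out_row.append(acc)
        result.append(out_row)
    return result


def dgemm_flop_count(n):
    """Conventional operation count for an n-by-n matrix multiplication."""
    return 2 * n ** 3


def measure_flop_rate(n):
    """Return an approximate floating-point operations-per-second figure."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    return dgemm_flop_count(n) / elapsed if elapsed > 0 else float("inf")
```

Because each matrix element is reused many times, this kind of kernel is limited mainly by arithmetic throughput rather than memory bandwidth, which is why DGEMM and HPL are grouped under the floating-point attribute.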

Competition

The annual HPC Challenge Award Competition at the Supercomputing Conference focuses on four of the most challenging benchmarks in the suite:

Global HPL
Global RandomAccess (or BSS Random Access Benchmark)
EP STREAM (Triad) per system
Global FFT

There are two classes of awards:

Class 1: Best performance on a base or optimized run submitted to the HPC Challenge website.^[5]
Class 2: Most "elegant" implementation of four or five computational kernels including three or more of the HPC Challenge benchmarks.^[6]

References

[5] The benchmark is designed to allow replacement of a limited set of functions with more highly optimized versions while remaining a "base" run. Additional (but still limited) modifications are allowed under the category of "optimized" runs.

External links

HPC Challenge Benchmark Official Website
HPC Challenge Award Competition Official Website
BSS Random Access Benchmark: "Performance Evaluation and Optimization of Random Memory Access on Multicores with High Productivity" (Best Paper Award), ACM/IEEE HiPC 2010
