Memory-level parallelism

Memory-level parallelism (MLP) is a term in computer architecture for the ability to have multiple memory operations pending at the same time, in particular cache misses or translation lookaside buffer (TLB) misses.
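The difference between code that allows several misses to be pending and code that does not can be sketched in C. This is only an illustration under stated assumptions: the function names `chase` and `gather_sum` are hypothetical, and whether misses actually overlap depends on the processor, the compiler, and the cache state at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dependent loads: the address of each load is the result of the
 * previous one, so the loads in this chain cannot overlap. If each
 * one misses, the misses are serviced one after another. */
size_t chase(const size_t *next, size_t start, size_t steps) {
    assert(next != NULL);
    size_t i = start;
    for (size_t s = 0; s < steps; s++)
        i = next[i];            /* next address depends on this load */
    return i;
}

/* Independent loads: every address data[idx[k]] can be computed
 * without waiting for an earlier data[] load, so hardware that
 * supports several outstanding misses may overlap them. */
uint64_t gather_sum(const uint64_t *data, const size_t *idx, size_t n) {
    assert(data != NULL && idx != NULL);
    uint64_t sum = 0;
    for (size_t k = 0; k < n; k++)
        sum += data[idx[k]];    /* loads do not depend on each other */
    return sum;
}
```

Both functions perform one load per iteration; only the second exposes loads whose misses could be pending at the same time.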

In a single processor, MLP may be considered a form of instruction-level parallelism (ILP). However, ILP is often conflated with superscalar execution, the ability to execute more than one instruction at the same time. For example, a processor such as the Intel Pentium Pro is five-way superscalar, able to start executing five different microinstructions in a given cycle, but it can handle four different cache misses for up to 20 different load microinstructions at any time.

It is possible to have a machine that is not superscalar but which nevertheless has high MLP.

Arguably, a machine that has no ILP (one that is not superscalar and executes one instruction at a time in a non-pipelined manner) but that performs hardware prefetching (as opposed to software, instruction-level prefetching) exhibits MLP but not ILP. With several prefetches outstanding, there are multiple memory operations in flight, but never multiple instructions. Instructions are often conflated with operations.
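A toy model can make this case concrete. The sketch below is an assumption-laden illustration, not a description of any real machine: the `simulate` function, the `SimResult` type, the fixed miss latency, and the next-line prefetch policy are all hypothetical. It models a machine that executes exactly one load instruction at a time, each touching the next cache line, while a hardware prefetcher requests up to `prefetch_depth` lines ahead.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long cycles;                  /* total time to run all instructions */
    int  max_outstanding_mem_ops; /* peak number of pending line fetches */
} SimResult;

/* Hypothetical non-pipelined, non-superscalar machine: instruction i
 * loads cache line i and the next instruction does not start until
 * this one completes, so there is never more than one instruction in
 * flight. The prefetcher, not the program, creates extra fetches. */
SimResult simulate(int n_lines, int prefetch_depth, long latency) {
    assert(n_lines > 0 && prefetch_depth >= 0 && latency > 0);
    SimResult r = {0, 0};
    long *ready = malloc((size_t)n_lines * sizeof *ready);
    if (ready == NULL)
        abort();
    for (int j = 0; j < n_lines; j++)
        ready[j] = -1;            /* -1 means "not yet requested" */

    long t = 0;
    for (int i = 0; i < n_lines; i++) {
        /* Demand fetch of line i plus prefetches of the lines after it. */
        for (int j = i; j <= i + prefetch_depth && j < n_lines; j++)
            if (ready[j] < 0)
                ready[j] = t + latency;

        /* Count fetches that have been requested but not yet returned. */
        int outstanding = 0;
        for (int j = 0; j < n_lines; j++)
            if (ready[j] > t)
                outstanding++;
        if (outstanding > r.max_outstanding_mem_ops)
            r.max_outstanding_mem_ops = outstanding;

        if (ready[i] > t)
            t = ready[i];         /* stall until the line arrives */
        t += 1;                   /* one cycle to execute the load */
    }
    free(ready);
    r.cycles = t;
    return r;
}
```

In this model, `simulate(16, 0, 100)` never has more than one memory operation pending, while `simulate(16, 4, 100)` reaches five pending operations and finishes in fewer cycles, even though the instruction stream is executed strictly one instruction at a time in both cases.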

Furthermore, multiprocessor and multithreaded computer systems may be said to exhibit MLP and ILP because they run several threads in parallel, but this is not intra-thread, single-process ILP and MLP. Often, however, the terms MLP and ILP are restricted to parallelism extracted from what appears to be non-parallel, single-threaded code.

References

Glew, A. (1998). "MLP yes! ILP no!". In Wild and Crazy Ideas Session, 8th International Conference on Architectural Support for Programming Languages and Operating Systems, October 1998.

