LLVM

Developer(s) LLVM Developer Group
Initial release 2003
Stable release 3.7.1 / January 5, 2016[1]
Written in C++
Operating system Cross-platform
Type Compiler
License University of Illinois/NCSA Open Source License[2]
Website llvm.org

The LLVM compiler infrastructure project (formerly Low Level Virtual Machine) is a collection of reusable compiler and toolchain libraries with well-defined interfaces.

LLVM is written in C++ and is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Although LLVM was originally implemented for C and C++, its language-agnostic design has since spawned a wide variety of front ends: languages with compilers that use LLVM include Common Lisp, ActionScript, Ada, D, Fortran, OpenGL Shading Language, Haskell, Java bytecode, Julia, Objective-C, Swift, Python, R, Ruby, Rust, Scala,[3] C#[4][5][6] and Lua.

The LLVM project started in 2000 at the University of Illinois at Urbana–Champaign, under the direction of Vikram Adve and Chris Lattner. LLVM was originally developed as a research infrastructure to investigate dynamic compilation techniques for static and dynamic programming languages. LLVM was released under the University of Illinois/NCSA Open Source License,[2] a permissive free software license. In 2005, Apple Inc. hired Lattner and formed a team to work on the LLVM system for various uses within Apple's development systems.[7] LLVM is an integral part of Apple's latest development tools for Mac OS X and iOS.[8] Sony uses Clang, LLVM's primary front end, as the compiler in the software development kit (SDK) of its PS4 console.[9]

The name LLVM was originally an initialism for Low Level Virtual Machine, but this became increasingly less apt as LLVM became an umbrella project that included a variety of other compiler and low-level tool technologies, so the project abandoned the initialism.[10] Now, LLVM is a brand that applies to the LLVM umbrella project, the LLVM intermediate representation, the LLVM debugger, the LLVM C++ standard library, etc. LLVM is administered by the LLVM Foundation. Its president is Tanya Lattner, a compiler engineer and Chris Lattner's spouse.[11]

The Association for Computing Machinery presented Adve, Lattner, and Evan Cheng with the 2012 ACM Software System Award for LLVM.[12]

Overview and description

LLVM can provide the middle layers of a complete compiler system, taking intermediate form (IF) code from a compiler and emitting an optimized IF. This new IF can then be converted and linked into machine-dependent assembly code for a target platform. LLVM can accept the IF from the GCC toolchain, allowing it to be used with a wide array of extant compilers written for that project.
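
The division of labour described above can be illustrated with a toy pipeline. The following is a minimal Python sketch, not LLVM's API: the tuple-based intermediate form, the two miniature front ends, and the two imaginary targets are all assumptions made for illustration. It shows two front ends lowering to one shared form, a single optimizer working on that form, and two back ends consuming the result.

```python
# Toy three-phase pipeline. The IF format and every name here are hypothetical.

def front_end_infix(text):
    """Front end 1: 'x = 2 + 3' -> intermediate form."""
    dest, expr = [s.strip() for s in text.split("=")]
    lhs, rhs = [s.strip() for s in expr.split("+")]
    return [("add", dest, lhs, rhs)]

def front_end_prefix(text):
    """Front end 2: '(set x (+ 2 3))' -> the same intermediate form."""
    tokens = text.replace("(", " ").replace(")", " ").split()
    _, dest, _, lhs, rhs = tokens  # e.g. ['set', 'x', '+', '2', '3']
    return [("add", dest, lhs, rhs)]

def optimize(ir):
    """Shared middle layer: fold additions whose operands are both literals."""
    out = []
    for op, dest, a, b in ir:
        if op == "add" and a.isdigit() and b.isdigit():
            out.append(("const", dest, str(int(a) + int(b)), None))
        else:
            out.append((op, dest, a, b))
    return out

def back_end_stack(ir):
    """Back end 1: text for an imaginary stack machine."""
    lines = []
    for op, dest, a, b in ir:
        if op == "const":
            lines += [f"push {a}", f"store {dest}"]
        else:
            lines += [f"push {a}", f"push {b}", "add", f"store {dest}"]
    return lines

def back_end_register(ir):
    """Back end 2: text for an imaginary register machine."""
    lines = []
    for op, dest, a, b in ir:
        if op == "const":
            lines.append(f"mov {dest}, {a}")
        else:
            lines += [f"mov {dest}, {a}", f"add {dest}, {b}"]
    return lines

optimized = optimize(front_end_infix("x = 2 + 3"))
print(back_end_stack(optimized))
print(back_end_register(optimized))
```

In this model, supporting a new source language only requires a new front end, and supporting a new target only requires a new back end; the optimizer is written once.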

LLVM can also generate relocatable machine code at compile-time or link-time or even binary machine code at run-time.

LLVM supports a language-independent instruction set and type system.[13] Each instruction is in static single assignment (SSA) form, meaning that each variable (called a typed register) is assigned exactly once and is never modified afterwards. This helps simplify the analysis of dependencies among variables. LLVM allows code to be compiled statically, as it is under the traditional GCC system, or left in IF and compiled to machine code late by a just-in-time (JIT) compiler, in a fashion similar to Java. The type system consists of basic types such as integers or floats and five derived types: pointers, arrays, vectors, structures, and functions. A type construct in a concrete language can be represented by combining these basic types in LLVM. For example, a class in C++ can be represented by a combination of structures, functions and arrays of function pointers.
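
The single-assignment property can be made concrete with a small renaming pass. The sketch below is an illustration in Python under simplifying assumptions: it handles only straight-line code (no branches, hence no phi nodes), and the statement format and the ".N" version suffix are invented for this example rather than taken from LLVM.

```python
def to_ssa(statements):
    """Rename straight-line assignments so every name is assigned exactly once.

    `statements` is a list of (target, operand_names) pairs, in program order.
    Control flow is deliberately out of scope for this sketch.
    """
    version = {}  # variable name -> number of its latest definition
    out = []
    for target, operands in statements:
        # Each use refers to the most recent definition of that operand.
        # Names never assigned in this block (inputs) are left unchanged.
        renamed = [f"{v}.{version[v]}" if v in version else v for v in operands]
        # Each assignment creates a fresh, never-reused name.
        version[target] = version.get(target, 0) + 1
        out.append((f"{target}.{version[target]}", renamed))
    return out

program = [
    ("x", ["a", "b"]),  # x = a op b
    ("x", ["x", "c"]),  # x = x op c   (reassignment in the source)
    ("y", ["x"]),       # y = op x
]
for target, operands in to_ssa(program):
    print(target, "<-", operands)
```

After renaming, the question "which definition does this use depend on?" is answered by the name alone, which is what makes dependency analysis simpler.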

The LLVM JIT compiler can optimize unneeded static branches out of a program at runtime, and thus is useful for partial evaluation in cases where a program has many options, most of which can easily be determined unneeded in a specific environment. This feature is used in the OpenGL pipeline of Mac OS X Leopard (v10.5) to provide support for missing hardware features.[14] Graphics code within the OpenGL stack was left in intermediate form, and then compiled when run on the target machine. On systems with high-end GPUs, the resulting code was quite thin, passing the instructions onto the GPU with minimal changes. On systems with low-end GPUs, LLVM would compile optional procedures that run on the local central processing unit (CPU) that emulate instructions that the GPU cannot run internally. LLVM improved performance on low-end machines using Intel GMA chipsets. A similar system was developed under the Gallium3D LLVMpipe, and incorporated into the GNOME shell to allow it to run without a proper 3D hardware driver loaded.[15]
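
The idea of removing branches that are already decided in a given environment can be sketched as follows. This is a toy Python model, not how the OpenGL stack or the LLVM JIT is implemented: the program representation, the flag names, and the operation names are all hypothetical, and a real JIT would perform the equivalent folding on IR.

```python
def specialize(program, known):
    """Drop branches whose condition is already known in this environment.

    `program` is a list of steps. A step is either a string (an operation) or
    a tuple ("if", flag, then_steps, else_steps). `known` maps flag names to
    booleans; flags missing from `known` remain as run-time branches.
    """
    out = []
    for step in program:
        if isinstance(step, tuple):
            _, flag, then_steps, else_steps = step
            if flag in known:
                # The outcome is fixed here: keep only the taken side.
                chosen = then_steps if known[flag] else else_steps
                out.extend(specialize(chosen, known))
            else:
                # Still undecided: keep the branch, but specialize both sides.
                out.append(("if", flag,
                            specialize(then_steps, known),
                            specialize(else_steps, known)))
        else:
            out.append(step)
    return out

# Hypothetical rendering pipeline with two environment-dependent options.
pipeline = [
    "load_vertices",
    ("if", "gpu_has_vertex_shaders", ["send_to_gpu"], ["emulate_on_cpu"]),
    ("if", "debug", ["log_frame"], []),
    "draw",
]

print(specialize(pipeline, {"gpu_has_vertex_shaders": True, "debug": False}))
```

On a machine where every flag is known, the specialized program contains no branches at all, which corresponds to the "thin" code path described above.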

In terms of the run-time performance of compiled programs, code produced by GCC previously outperformed code produced by LLVM by about 10% on average.[16][17] Newer results indicate, however, that LLVM has caught up with GCC in this area and now produces binaries of approximately equal performance, except for programs using OpenMP.[18]

Components

LLVM has become an umbrella project containing multiple components.

Front ends: programming language support

LLVM was originally written to be a replacement for the existing code generator in the GCC stack,[19] and many of the GCC front ends have been modified to work with it. LLVM currently supports compiling Ada, C, C++, D, Delphi, Fortran, Objective-C and Swift using various front ends, some derived from versions 4.0.1 and 4.2 of the GNU Compiler Collection (GCC).

Widespread interest in LLVM has led to a number of efforts to develop entirely new front ends for a variety of languages. The one that has received the most attention is Clang, a new compiler supporting C, Objective-C and C++. Primarily supported by Apple, Clang is aimed at replacing the C/Objective-C compiler in the GCC system with a system that is more easily integrated with integrated development environments (IDEs) and has wider support for multithreading. Objective-C development under GCC was stagnant and Apple's changes to the language were supported in a separately maintained branch.[citation needed]

The Utrecht Haskell compiler can generate code for LLVM. Although this generator is in the early stages of development, it has been shown in many cases to be more efficient than the C code generator.[20] The Glasgow Haskell Compiler (GHC) has a working LLVM back end that achieves a 30% speed-up of the compiled code compared with either GHC's native code generation or C code generation followed by compilation, while missing only one of the many optimization techniques implemented by GHC.[21]

There are many other components in various stages of development, including, but not limited to, the Rust compiler, a Java bytecode front end, a Common Intermediate Language (CIL) front end, the MacRuby implementation of Ruby 1.9, various front ends for Standard ML, and a new graph coloring register allocator.[citation needed]

LLVM Intermediate Representation

The core of LLVM is the intermediate representation (IR), a low-level programming language similar to assembly. IR is a strongly typed RISC instruction set which abstracts away details of the target. For example, the calling convention is abstracted through call and ret instructions with explicit arguments. Additionally, instead of a fixed set of registers, IR uses an infinite set of temporaries of the form %0, %1, etc. LLVM supports three isomorphic forms of IR: a human-readable assembly format, a C++ object format suitable for front ends, and a dense bitcode format for serialization. A simple "Hello, world!" program in the assembly format:

@.str = internal constant [14 x i8] c"hello, world\0A\00"

declare i32 @printf(i8*, ...)

define i32 @main(i32 %argc, i8** %argv) nounwind {
entry:
    %tmp1 = getelementptr [14 x i8]* @.str, i32 0, i32 0
    %tmp2 = call i32 (i8*, ...)* @printf( i8* %tmp1 ) nounwind
    ret i32 0
}

[22]
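
The use of an unbounded supply of temporaries instead of a fixed register file can be illustrated with a small emitter. The Python sketch below is an assumption-laden toy: it prints IR-like text for nested arithmetic, fixes the type to i32, and numbers temporaries from %0 within a single instruction list. Real LLVM numbering also counts unnamed arguments and unnamed basic blocks, which this sketch ignores.

```python
def emit(expr, lines):
    """Append IR-like text for `expr` to `lines`; return the resulting operand.

    `expr` is either a leaf (an int literal, or a string naming an existing
    value such as "%x") or a tuple (op, lhs, rhs). Every intermediate result
    is given a fresh temporary: %0, %1, %2, ...
    """
    if not isinstance(expr, tuple):
        return str(expr)
    op, lhs, rhs = expr
    a = emit(lhs, lines)
    b = emit(rhs, lines)
    name = f"%{len(lines)}"  # one instruction per temporary, so this is fresh
    lines.append(f"{name} = {op} i32 {a}, {b}")
    return name

# (x + 2) * (y - 1)
lines = []
result = emit(("mul", ("add", "%x", 2), ("sub", "%y", 1)), lines)
print("\n".join(lines))
print("result in", result)
```

Because the emitter never has to decide which physical register holds a value, that decision is left entirely to the back end's register allocator.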

Back ends: instruction set and microarchitecture support

As of version 3.4, LLVM supports many instruction sets, including ARM, Qualcomm Hexagon, MIPS, Nvidia PTX (called "NVPTX" in LLVM documentation), PowerPC, AMD TeraScale,[23] AMD GCN, SPARC, z/Architecture (called "SystemZ" in LLVM documentation), x86/x86-64, and XCore. Not all features are available on all platforms; most features are present for x86/x86-64, z/Architecture, ARM, and PowerPC.[24]

LLVM MC

The LLVM Machine Code subproject is LLVM's framework for translating machine instructions between textual forms and machine code. Previously, LLVM relied on the system assembler, or one provided by a toolchain, to translate assembly into machine code. LLVM MC's integrated assembler supports most LLVM targets, including x86, x86-64, ARM, and ARM64. For some targets, including the various MIPS instruction sets, integrated assembly support is usable but still in the beta stage.

Integrated linker: lld

The lld subproject is an attempt to develop a built-in, platform-independent linker for LLVM.[25] Currently, Clang and LLVM must invoke the system or target linker to produce an executable. This requires having a separate linker for each desired target, which usually entails either installing or cross-compiling a copy of GNU Binutils for every target. lld aims to remove the dependence on a third-party linker.

Debugger


Revision history[26]

Version Release Date
3.7.1 Jan 05, 2016
3.7.0 Sep 01, 2015
3.6.0 Feb 27, 2015
3.5.0 Sep 03, 2014
3.4.0 Jan 02, 2014
3.3 Jun 17, 2013
3.2 Dec 20, 2012
3.1 May 22, 2012
3.0 Dec 01, 2011
2.9 Apr 06, 2011
2.8 Oct 05, 2010
2.7 Apr 27, 2010
2.6 Oct 23, 2009
2.5 Mar 02, 2009
2.4 Nov 09, 2008
2.3 Jun 09, 2008
2.2 Feb 11, 2008
2.1 Sep 26, 2007
2.0 May 23, 2007
1.9 Nov 19, 2006
1.8 Aug 09, 2006
1.7 Apr 20, 2006
1.6 Nov 08, 2005
1.5 May 18, 2005
1.4 Dec 09, 2004
1.3 Aug 13, 2004
1.2 Mar 19, 2004
1.1 Dec 17, 2003
1.0 Oct 24, 2003

See also

References

  1. Lua error in package.lua at line 80: module 'strict' not found.
  2. 2.0 2.1 Lua error in package.lua at line 80: module 'strict' not found.
  3. Lua error in package.lua at line 80: module 'strict' not found.
  4. Lua error in package.lua at line 80: module 'strict' not found.
  5. Lua error in package.lua at line 80: module 'strict' not found.
  6. LLVM, Chris Lattner, in The architecture of Open Source Applications, edited by Amy Brown, Greg Wilson, 2011
  7. Lua error in package.lua at line 80: module 'strict' not found.
  8. Lua error in package.lua at line 80: module 'strict' not found.
  9. Lua error in package.lua at line 80: module 'strict' not found.
  10. Lua error in package.lua at line 80: module 'strict' not found.
  11. Lua error in package.lua at line 80: module 'strict' not found.
  12. Lua error in package.lua at line 80: module 'strict' not found.
  13. Lua error in package.lua at line 80: module 'strict' not found.
  14. Lua error in package.lua at line 80: module 'strict' not found.
  15. Michael Larabel, "GNOME Shell Works Without GPU Driver Support", phoronix, 6 November 2011
  16. Lua error in package.lua at line 80: module 'strict' not found.
  17. Lua error in package.lua at line 80: module 'strict' not found.
  18. Lua error in package.lua at line 80: module 'strict' not found.
  19. Lua error in package.lua at line 80: module 'strict' not found.
  20. Lua error in package.lua at line 80: module 'strict' not found.
  21. Lua error in package.lua at line 80: module 'strict' not found.
  22. For the full documentation, refer to llvm.org/docs/LangRef.html.
  23. Lua error in package.lua at line 80: module 'strict' not found.
  24. Target-specific Implementation Notes: Target Feature Matrix // The LLVM Target-Independent Code Generator, LLVM site.
  25. Lua error in package.lua at line 80: module 'strict' not found.
  26. http://llvm.org/releases/

External links