LLVM 2.7 Release Notes
- Introduction
- Sub-project Status Update
- External Projects Using LLVM 2.7
- What's New in LLVM 2.7?
- Installation Instructions
- Portability and Supported Platforms
- Known Problems
- Additional Information
This document contains the release notes for the LLVM Compiler
Infrastructure, release 2.7. Here we describe the status of LLVM, including
major improvements from the previous release and significant known problems.
All LLVM releases may be downloaded from the LLVM releases web site.
For more information about LLVM, including information about the latest
release, please check out the main LLVM
web site. If you have questions or comments, the LLVM Developer's
Mailing List is a good place to send them.
Note that if you are reading this file from a Subversion checkout or the
main LLVM web page, this document applies to the next release, not the
current one. To see the release notes for a specific release, please see the
releases page.
FIXME: llvm.org moved to new server, mention new logo, Ted and Doug new code
owners, web page in llvm-www repos.
The LLVM 2.7 distribution currently consists of code from the core LLVM
repository (which roughly includes the LLVM optimizers, code generators
and supporting tools), the Clang repository and the llvm-gcc repository. In
addition to this code, the LLVM Project includes other sub-projects that are in
development. Here we include updates on these subprojects.
The Clang project is ...
In the LLVM 2.7 time-frame, the Clang team has made many improvements:
- FIXME: C++! Include a link to cxx_compatibility.html
- FIXME: Static Analyzer improvements?
- CIndex API and Python bindings: Clang now includes a C API as part of the
CIndex library. Although we may make some changes to the API in the future, it
is intended to be stable and has been designed for use by external projects. See
the Clang
doxygen CIndex
documentation for more details. The CIndex API also includes a preliminary
set of Python bindings.
- ARM Support: Clang now has ABI support for both the Darwin and Linux ARM
ABIs. Coupled with many improvements to the LLVM ARM backend, Clang is now
suitable for use as a beta-quality ARM compiler.
Previously announced in the 2.4, 2.5, and 2.6 LLVM releases, the Clang project also
includes an early stage static source code analysis tool for automatically finding bugs
in C and Objective-C programs. The tool performs checks to find
bugs that occur on a specific path within a program.
In the LLVM 2.7 time-frame, the analyzer core has sprouted legs and...
The VMKit project is an implementation of
a JVM and a CLI Virtual Machine (Microsoft .NET is an
implementation of the CLI) using LLVM for static and just-in-time
compilation.
With the release of LLVM 2.7, VMKit has evolved into a framework for writing
virtual machines. VMKit now offers precise and efficient garbage collection with
multi-threading support, thanks to the MMTk memory management toolkit, as well
as just-in-time and ahead-of-time compilation with LLVM. The major changes in
VMKit 0.27 are:
- Garbage collection: VMKit now uses the MMTk toolkit for garbage collectors.
The first collector to be ported is the MarkSweep collector, which is precise,
and drastically improves the performance of VMKit.
- Line number information in the JVM: by using the debug metadata of LLVM, the
JVM now supports precise line number information, useful when printing a stack
trace.
- Interface calls in the JVM: we implemented a variant of the Interface Method
Table technique for interface calls in the JVM.
The new LLVM compiler-rt project
is a simple library that provides an implementation of the low-level
target-specific hooks required by code generation and other runtime components.
For example, when compiling for a 32-bit target, converting a double to a 64-bit
unsigned integer is compiled into a runtime call to the "__fixunsdfdi"
function. The compiler-rt library provides highly optimized implementations of
this and other low-level routines (some are 3x faster than the equivalent
libgcc routines).
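As a minimal sketch of the kind of source that leads to such a call, the following C++ function converts a double to a 64-bit unsigned integer. The name to_u64 is hypothetical and chosen only for illustration. On a 32-bit target the cast is typically lowered to a call to "__fixunsdfdi"; other targets may expand it inline instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the cast below is the operation that a 32-bit
// code generator typically lowers to a runtime call to __fixunsdfdi.
std::uint64_t to_u64(double d) {
    // Converting a negative value to an unsigned type is out of range,
    // so reject it in debug builds.
    assert(d >= 0.0 && "value must be non-negative");
    return static_cast<std::uint64_t>(d);  // truncates toward zero
}
```

For instance, to_u64(3.9) yields 3, because the conversion truncates toward zero.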
All of the code in the compiler-rt project is available under the standard LLVM
License, a "BSD-style" license.
DragonEgg is a port of llvm-gcc to
gcc-4.5. Unlike llvm-gcc, which makes many intrusive changes to the underlying
gcc-4.2 code, dragonegg in theory does not require any gcc-4.5 modifications
whatsoever (currently one small patch is needed). This is thanks to the new
gcc plugin architecture, which
makes it possible to modify the behaviour of gcc at runtime by loading a plugin,
which is nothing more than a dynamic library which conforms to the gcc plugin
interface. DragonEgg is a gcc plugin that causes the LLVM optimizers to be run
instead of the gcc optimizers, and the LLVM code generators instead of the gcc
code generators, just like llvm-gcc. To use it, you add
"-fplugin=path/dragonegg.so" to the gcc-4.5 command line, and gcc-4.5 magically
becomes llvm-gcc-4.5!
DragonEgg is still a work in progress. Currently C works very well, while C++,
Ada and Fortran work fairly well. All other languages either don't work at all,
or only work poorly. For the moment only the x86-32 and x86-64 targets are
supported, and only on Linux.
DragonEgg has not yet been released. Once gcc-4.5 has been released, dragonegg
will probably be released as part of the following LLVM release.
The LLVM Machine Code (MC) Toolkit project is ...
An exciting aspect of LLVM is that it is used as an enabling technology for
a lot of other language and tools projects. This section lists some of the
projects that have already been updated to work with LLVM 2.7.
Need update.
Pure
is an algebraic/functional programming language based on term rewriting.
Programs are collections of equations which are used to evaluate expressions in
a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation,
lexical closures, a hygienic macro system (also based on term rewriting),
built-in list and matrix support (including list and matrix comprehensions) and
an easy-to-use C interface. The interpreter uses LLVM as a backend to
JIT-compile Pure programs to fast native code.
Pure versions 0.43 and later have been tested and are known to work with
LLVM 2.7 (and continue to work with older LLVM releases >= 2.5).
Roadsend PHP (rphp) is an open
source implementation of the PHP programming
language that uses LLVM for its optimizer, JIT and static compiler. This is a
reimplementation of an earlier project that is now based on LLVM.
Unladen Swallow is a
branch of Python intended to be fully
compatible and significantly faster. It uses LLVM's optimization passes and JIT
compiler.
TCE is a toolset for designing
application-specific processors (ASP) based on the Transport triggered
architecture (TTA). The toolset provides a complete co-design flow from C/C++
programs down to synthesizable VHDL and parallel program binaries. Processor
customization points include the register files, function units, supported
operations, and the interconnection network.
TCE uses llvm-gcc/Clang and LLVM for C/C++ language support, target
independent optimizations and also for parts of code generation. It generates
new LLVM-based code generators "on the fly" for the designed TTA processors and
loads them into the compiler backend as runtime libraries to avoid per-target
recompilation of larger parts of the compiler chain.
SAFECode is a memory safe C
compiler built using LLVM. It takes standard, unannotated C code, analyzes the
code to ensure that memory accesses and array indexing operations are safe, and
instruments the code with run-time checks when safety cannot be proven
statically.
This release includes a huge number of bug fixes, performance tweaks and
minor improvements. Some of the major improvements and new features are listed
in this section.
LLVM 2.7 includes several major new capabilities:
Extensible metadata is now solid.
Debug info improvements: using metadata instead of llvm.dbg global variables.
This brings several enhancements including improved compile times.
New instruction selector.
GHC Haskell ABI/ calling conv support.
Pre-Alpha support for unions in IR.
New InlineHint and StackAlignment function attributes
Code generator MC'ized except for debug info and EH.
New SCEV AA pass: -scev-aa
Inliner reuses array allocas when inlining multiple callers to reduce stack usage.
MC encoding and disassembler APIs.
Optimal Edge Profiling?
Instcombine is now a library, has its own IRBuilder to simplify itself.
New llvm/Support/Regex.h API. FileCheck now supports regular expressions.
Many subtle pointer invalidation bugs in Callgraph have been fixed and it now uses asserting value handles.
MC Disassembler (with blog post), MCInstPrinter. Many X86 backend and AsmPrinter simplifications
Various tools like llc and opt now read either .ll or .bc files as input.
The malloc and free instructions have been removed, along with the LowerAllocations pass.
compiler-rt support for ARM.
Complete llvm-gcc NEON support.
Can transcode from GAS to Intel syntax with "llvm-mc foo.s -output-asm-variant=1"
JIT debug information with GDB 7.0
New CodeGen Level CSE
CMake can now run tests, what other improvements?
ARM/Thumb using reg scavenging for stack object address materialization (PEI).
New SSAUpdater and MachineSSAUpdater classes for unstructured SSA updating;
jump threading, GVN, etc. were changed to use them, which simplified them and
sped them up.
Combiner-AA improvements, why not on by default?
Pre-regalloc tail duplication
x86 sibcall / tailcall optimization in CCC mode.
New LSR with "full strength reduction" mode. Description?
Codegen level OptimizeExtsPass pass, takes advantage of x86 subregs.
Better code size analysis in loop unswitch, inliner code split out to a new
CodeMetrics class for reuse.
The ARM backend now has good support for ARMv4 (tested on StrongARM
hardware); previously it only supported ARMv4T and newer.
Half-float support in APFloat
Indirect branch + address of label (blog post), particularly useful for interpreters.
Many changes to the pass ordering for improved optimization effectiveness.
BasicAA improved to be less dependent on "type safe" pointers, it can now look
through bitcasts more aggressively.
GVN PHI Translation improvements. blog post: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html
llvm.objectsize.
MachineSSAUpdater.h
PostRA scheduler for X86?
llvm.dbg.value, not being used by default though, more in 2.8. Many improvements to debug info
Support for the GCC option -fno-schedule-insns
non-temporal load/store
libllvm2.7.so?? configure with --enable-shared
dbgs() and -debug-buffer-size=N
New MicroBlaze backend. http://en.wikipedia.org/wiki/MicroBlaze
XMM subreg modeling for extraction of the low element.
Opt now works conservatively if no target data is set (is this fully working?)
Target data now has notion of 'native' integer data types which optimizations can use.
ARM backend generates instructions in unified assembly syntax.
New Analysis/InstructionSimplify.h interface for simplifying instructions that don't exist.
Jump threading is now much more aggressive at simplifying correlated
conditionals and threading blocks with otherwise complex logic. CondProp pass
removed (functionality merged into jump threading).
X86 and XCore support returning arbitrary return values; returning too many values is
supported by returning through a hidden pointer.
verbose-asm now produces information about spill slots and loop nests
LLVM now defaults to building with RTTI off (smaller code size!); packagers should build with make REQUIRE_RTTI=1.
AndersAA got removed
PredSimplify, LoopVR, GVNPRE, RSProfiling (random sampling profiling) got removed.
LLVM command line tools now overwrite their output; previously they would only do this with -f.
DOUT removed, use DEBUG(errs() << ...) instead.
Much stuff converted to use raw_ostream instead of std::ostream.
TargetAsmInfo renamed to MCAsmInfo
llvm/ADT/iterator.h gone.
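The indirect branch and address-of-label support listed above corresponds to the GNU "labels as values" extension (&&label and goto *ptr), which GCC and Clang accept in C and C++. A minimal sketch of an interpreter dispatch loop built on it follows. The opcode set and the run function are hypothetical and exist only for illustration; the code is not portable to compilers without this extension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bytecode for a tiny accumulator machine.
enum Op { OP_INC = 0, OP_DBL = 1, OP_HALT = 2 };

// Runs the program and returns the final accumulator value.
// Relies on the GNU "labels as values" extension (GCC and Clang).
int run(const std::vector<unsigned char> &code) {
    // One label address per opcode, indexed by opcode value.
    static void *const dispatch[] = { &&do_inc, &&do_dbl, &&do_halt };
    int acc = 0;
    std::size_t pc = 0;

    // Fetch the next opcode and jump straight to its handler.
#define NEXT()                            \
    do {                                  \
        assert(pc < code.size());         \
        assert(code[pc] <= OP_HALT);      \
        goto *dispatch[code[pc++]];       \
    } while (0)

    NEXT();
do_inc:
    acc += 1;
    NEXT();
do_dbl:
    acc *= 2;
    NEXT();
do_halt:
    return acc;
#undef NEXT
}
```

Running the program INC, INC, DBL, HALT returns 4. Each handler ends with its own indirect jump rather than returning to a central switch, which is the property that makes this style attractive for interpreters.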
LLVM IR has several new features for better support of new targets and that
expose new optimization opportunities:
In addition to a large array of minor performance tweaks and bug fixes, this
release includes a few major enhancements and additions to the optimizers:
Also, -anders-aa was removed
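To illustrate what "correlated conditionals" means for jump threading, here is a small sketch; the function classify is hypothetical. The second test of flag is fully determined by the path taken through the first test, so an optimizer that threads jumps can branch from each arm of the first conditional directly to the matching arm of the second. Whether a particular compiler version performs exactly this transformation is an assumption, not something this example verifies.

```cpp
// Hypothetical example of correlated conditionals.
int classify(int x) {
    bool flag;
    if (x > 10)
        flag = true;
    else
        flag = false;

    // On every path reaching this point the value of 'flag' is already
    // known, so this branch can be threaded: the 'x > 10' path can jump
    // directly to the first return, and the other path to the second.
    if (flag)
        return x * 2;
    return x + 1;
}
```

For example, classify(11) returns 22 and classify(10) returns 11, regardless of whether the branches are threaded.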
We have put a significant amount of work into the code generator
infrastructure, which allows us to implement more aggressive algorithms and make
it run faster:
New features of the X86 target include:
New features of the PIC16 target include:
Things not yet supported:
- Variable arguments.
- Interrupts/programs.
New features of the ARM target include:
New features of other targets include:
This release includes a number of new APIs that are used internally, which
may also be useful for external clients.
Other miscellaneous features include:
If you're already an LLVM user or developer with out-of-tree changes based
on LLVM 2.6, this section lists some "gotchas" that you may run into upgrading
from the previous release.
- The LLVM interpreter now defaults to not using libffi even
if you have it installed. This makes it more likely that an LLVM built on one
system will work when copied to a similar system. To use libffi,
configure with --enable-libffi.
In addition, many APIs have changed in this release. Some of the major LLVM
API changes are:
- ModuleProvider has been removed
and its methods moved to Module and GlobalValue.
Most clients can remove uses of ExistingModuleProvider,
replace getBitcodeModuleProvider with
getLazyBitcodeModule, and pass their Module to
functions that used to accept ModuleProvider. Clients who
wrote their own ModuleProviders will need to derive from
GVMaterializer instead and use
Module::setMaterializer to attach it to a
Module.
- GhostLinkage has given up the ghost.
GlobalValues that have not yet been read from their backing
storage have the same linkage they will have after being read in.
Clients must replace calls to
GlobalValue::hasNotBeenReadFromBitcode with
GlobalValue::isMaterializable.
- FIXME: Debug info has been totally redone. Add pointers to new APIs. Substantial caveats about compatibility of .ll and .bc files.
- The llvm/Support/DataTypes.h header has moved
to llvm/System/DataTypes.h.
- The isInteger, isIntOrIntVector, isFloatingPoint
and isFPOrFPVector methods have been renamed
isIntegerTy, isIntOrIntVectorTy, isFloatingPointTy
and isFPOrFPVectorTy respectively.
LLVM is known to work on the following platforms:
- Intel and AMD machines (IA32, X86-64, AMD64, EMT-64) running Red Hat
Linux, Fedora Core, FreeBSD and AuroraUX (and probably other unix-like
systems).
- PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit
and 64-bit modes.
- Intel and AMD machines running on Win32 using MinGW libraries (native).
- Intel and AMD machines running on Win32 with the Cygwin libraries (limited
support is available for native builds with Visual C++).
- Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.
- Alpha-based machines running Debian GNU/Linux.
The core LLVM infrastructure uses GNU autoconf to adapt itself
to the machine and operating system on which it is built. However, minor
porting may be required to get LLVM to work on new platforms. We welcome your
portability patches and reports of successful builds or error messages.
This section contains significant known problems with the LLVM system,
listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.
- LLVM will not correctly compile on Solaris and/or OpenSolaris
using the stock GCC 3.x.x series out of the box.
See: Broken versions of GCC and other tools.
However, a modern GCC build
for x86/x86-64 has been made available from the third-party AuroraUX Project
that has been meticulously tested for bootstrapping LLVM & Clang.
The following components of this LLVM release are either untested, known to
be broken or unreliable, or are in early development. These components should
not be relied on, and bugs should not be filed against them, but they may be
useful to some people. In particular, if you would like to work on one of these
components, please contact us on the LLVMdev list.
- The MSIL, Alpha, SPU, MIPS, PIC16, Blackfin, MSP430, SystemZ and MicroBlaze
backends are experimental.
- The llc "-filetype=asm" (the default) is the only
supported value for this option. The MachO writer is experimental, and
works much better in mainline SVN.
- The X86 backend does not yet support
all inline assembly that uses the X86
floating point stack. It supports the 'f' and 't' constraints, but not
'u'.
- The X86 backend generates inefficient floating point code when configured
to generate code for systems that don't have SSE2.
- Win64 code generation has not been widely tested. Everything should work, but we
expect small issues to happen. Also, llvm-gcc cannot currently build the mingw64
runtime due to several bugs and due to lack of support for the
'u' inline assembly constraint and for X87 floating point inline assembly.
- The X86-64 backend does not yet support the LLVM IR instruction
va_arg. Currently, front-ends such as llvm-gcc support variadic
argument constructs on X86-64 by lowering them manually.
- The Linux PPC32/ABI support needs testing for the interpreter and static
compilation, and lacks support for debug information.
- Support for the Advanced SIMD (Neon) instruction set is still incomplete
and not well tested. Some features may not work at all, and the code quality
may be poor in some cases.
- Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6
processors, Thumb programs can crash or produce wrong
results (PR1388).
- Compilation for ARM Linux OABI (old ABI) is supported but not fully tested.
- The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not
support the 64-bit SPARC ABI (-m64).
- 64-bit MIPS targets are not supported yet.
- On 21164s, some rare FP arithmetic sequences which may trap do not have the
appropriate nops inserted to ensure restartability.
The only major language feature of GCC not supported by llvm-gcc is
the __builtin_apply family of builtins. However, some extensions
are only supported on some targets. For example, trampolines are only
supported on some targets (these are used when you take the address of a
nested function).
If you run into GCC extensions which are not supported, please let us know.
- Fortran support generally works, but there are still several unresolved bugs
in Bugzilla. Please see the
tools/gfortran component for details.
The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature
technology, and problems should be expected.
- The Ada front-end currently only builds on X86-32. This is mainly due
to lack of trampoline support (pointers to nested functions) on other platforms.
However, it also fails to build on X86-64
which does support trampolines.
- The Ada front-end fails to bootstrap.
This is due to lack of LLVM support for setjmp/longjmp style
exception handling, which is used internally by the compiler.
Workaround: configure with --disable-bootstrap.
- The c380004, c393010
and cxg2021 ACATS tests fail
(c380004 also fails with gcc-4.2 mainline).
If the compiler is built with checks disabled then c393010
causes the compiler to go into an infinite loop, using up all system memory.
- Some GCC specific Ada tests continue to crash the compiler.
- The -E binder option (exception backtraces)
does not work and will result in programs
crashing if an exception is raised. Workaround: do not use -E.
- Only discrete types are allowed to start
or finish at a non-byte offset in a record. Workaround: do not pack records
or use representation clauses that result in a field of a non-discrete type
starting or finishing in the middle of a byte.
- The lli interpreter considers
'main' as generated by the Ada binder to be invalid.
Workaround: hand edit the file to use pointers for argv and
envp rather than integers.
- The -fstack-check option is
ignored.
A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also
contains versions of the API documentation which are up to date with the
Subversion version of the source code.
You can access versions of these documents specific to this release by going
into the "llvm/doc/" directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact
us via the mailing
lists.