LLVM 2.4 Release Notes
- Introduction
- Sub-project Status Update
- What's New in LLVM?
- Installation Instructions
- Portability and Supported Platforms
- Known Problems
- Additional Information
This document contains the release notes for the LLVM Compiler
Infrastructure, release 2.4.  Here we describe the status of LLVM, including
major improvements from the previous release and significant known problems.
All LLVM releases may be downloaded from the LLVM releases web site.
For more information about LLVM, including information about the latest
release, please check out the main LLVM
web site.  If you have questions or comments, the LLVM Developer's Mailing
List is a good place to send them.
Note that if you are reading this file from a Subversion checkout or the
main LLVM web page, this document applies to the next release, not the
current one.  To see the release notes for a specific release, please see the
releases page.
 
 
The LLVM 2.4 distribution currently consists of code from the core LLVM
repository (which roughly includes the LLVM optimizers, code generators and
supporting tools) and the llvm-gcc repository.  In addition to this code, the
LLVM Project includes other sub-projects that are in development.  The two which
are the most actively developed are the Clang Project and
the VMKit Project.
 
The Clang project is an effort to build
a set of new 'LLVM native' front-end technologies for the LLVM optimizer
and code generator.  Clang is continuing to make major strides forward in all
areas.  Its C and Objective-C parsing support is very solid, and the code
generation support is far enough along to build many C applications.  While not
yet production quality, it is progressing very nicely.  In addition, C++
front-end work has started to make significant progress.
Clang, in conjunction with the ccc driver, is now usable as a
replacement for gcc for building some small- to medium-sized C applications.
Additionally, Clang now has code generation support for Objective-C on the
Mac OS X platform. Major highlights include:
- Clang/ccc pass almost all of the LLVM test suite on Mac OS X and Linux
on the 32-bit x86 architecture. This includes significant C
applications such as sqlite3, lua, and Clam AntiVirus.
- Clang can build the majority of Objective-C examples shipped with the
Mac OS X Developer Tools.
Clang code generation still needs considerable testing and development, however. 
Some areas under active development include:
- Improved support for C and Objective-C features, for example
variable-length arrays, va_arg, exception handling (Obj-C), and garbage
collection (Obj-C).
- ABI compatibility, especially for platforms other than 32-bit x86.
 
The Clang project also includes an early-stage static source code analysis
tool for automatically finding bugs in C and Objective-C programs. The tool
performs a growing set
of checks to find bugs that occur on a specific path within a program.  Examples
of bugs the tool finds include logic errors such as null dereferences,
violations of various API rules, dead code, and potential memory leaks in
Objective-C programs. Since its inception, public feedback on the tool has been
extremely positive, and conservative estimates put the number of real bugs it
has found in industrial-quality software on the order of thousands.
The tool also provides a simple web GUI to inspect potential bugs found by
the tool.  While still early in development, the GUI illustrates some of the key
features of Clang: accurate source location information, which is used by the
GUI to highlight specific code expressions that relate to a bug (including those
that span multiple lines) and built-in knowledge of macros, which is used to
perform inline expansion of macros within the GUI itself.
The set of checks performed by the static analyzer is gradually expanding, and
future plans for the tool include full source-level inter-procedural analysis
and deeper checks such as buffer overrun detection. There are many opportunities
to extend and enhance the static analyzer, and anyone interested in working on
this project is encouraged to get involved!
 
The VMKit project is an implementation of
JVM and CLI virtual machines (Microsoft .NET is an
implementation of the CLI) using the Just-In-Time compiler of LLVM.
Following LLVM 2.4, VMKit has its first release, 0.24, which you can find on its
webpage. The release includes
bug fixes, cleanup and new features. The major changes are:
-  Support for generics in the .NET virtual machine.
-  Initial support for the Mono class libraries.
-  Support for MacOSX/x86, following LLVM's support for exceptions in
JIT on MacOSX/x86.
-  A new vmkit driver: a program to run Java or .NET applications. The
driver supports LLVM command line arguments including the new "-fast" option.
-  A new memory allocation scheme in the JVM that makes unloading a
class loader very fast.
-  VMKit now follows the LLVM Makefile machinery.
 
This release includes a huge number of bug fixes, performance tweaks and
minor improvements.  Some of the major improvements and new features are listed
in this section.
 
LLVM 2.4 includes several major new capabilities:
- The most visible end-user change in LLVM 2.4 is that it includes many
optimizations and changes to make -O0 compile times much faster.  You should see
improvements on the order of 30% (or more) faster than LLVM 2.3.  There are many
pieces to this change, described in more detail below.  The speedups and new
components can also be used for JIT compilers that want fast compilation as
well. 
- The biggest change to the LLVM IR is that Multiple Return Values (which
were introduced in LLVM 2.3) have been generalized to full support for "First
Class Aggregate" values in LLVM 2.4.  This means that LLVM IR supports using
structs and arrays as values in a function.  This capability is mostly useful
for front-end authors, who prefer to treat things like complex numbers, simple
tuples, dope vectors, etc as Value*'s instead of as a tuple of Value*'s or as
memory values.  Bitcode files from LLVM 2.3 will automatically migrate to the
general representation. 
- LLVM 2.4 also includes an initial port for the PIC16 microprocessor. This
LLVM target only has support for 8-bit registers, along with a number of
other crazy constraints.  While the port is still in early development stages,
it shows some interesting things you can do with LLVM. 
 
LLVM fully supports the llvm-gcc 4.2 front-end, which marries the GCC
front-ends and driver with the LLVM optimizer and code generator.  It currently
includes support for the C, C++, Objective-C, Ada, and Fortran front-ends.
- LLVM 2.4 supports the full set of atomic __sync_* builtins.  LLVM
2.3 only supported those used by OpenMP, but 2.4 supports them all.  While
llvm-gcc supports all of these builtins, note that not all targets do.  X86
supports them all in both 32-bit and 64-bit mode, and PowerPC supports them all
except for the 64-bit operations when in 32-bit mode.
- llvm-gcc now supports an -flimited-precision option, which tells
the compiler that it is ok to use low-precision approximations of certain libm
functions (like tan, log, etc).  This allows you to get high performance if you
only need (say) 14 bits of precision.
- llvm-gcc now supports a C language extension known as "Blocks".
This feature is similar to nested functions and closures, but does not
require stack trampolines (with most ABIs) and supports returning closures 
from functions that define them.  Note that actually using Blocks
requires a small runtime that is not included with llvm-gcc.
- llvm-gcc now supports a new -flto option.  On systems that support
transparent Link Time Optimization (currently Darwin systems with Xcode 3.1 and
later) this allows the use of LTO with other optimization levels like -Os.
Previously, LTO could only be used with -O4, which implied optimizations in
-O3 that can increase code size.
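As a rough illustration of the atomic __sync_* builtins mentioned in the list above, here is a small C++ sketch. The builtins themselves are the GCC-style ones; the wrapper names (fetchAndAdd, compareAndSwap, SpinLock) are made up for this example, and whether a given builtin is lowered to native atomic instructions still depends on the target, as noted above.

```cpp
#include <cassert>

// Atomically adds `delta` to `*counter` and returns the value held before.
inline int fetchAndAdd(int *counter, int delta) {
  return __sync_fetch_and_add(counter, delta);
}

// Stores `desired` into `*slot` only if it currently holds `expected`.
// Returns true when the swap happened.
inline bool compareAndSwap(int *slot, int expected, int desired) {
  return __sync_bool_compare_and_swap(slot, expected, desired);
}

// A minimal spinlock built from the test-and-set / release pair.
// (Hypothetical helper type, not something shipped with llvm-gcc.)
struct SpinLock {
  volatile int flag;
  SpinLock() : flag(0) {}
  void lock() {
    // __sync_lock_test_and_set returns the previous contents of `flag`;
    // keep spinning while another thread already holds the lock.
    while (__sync_lock_test_and_set(&flag, 1)) {
    }
  }
  void unlock() { __sync_lock_release(&flag); }
};
```

Calling fetchAndAdd(&c, 3) on a counter holding 5 returns 5 and leaves 8 in the counter; a following compareAndSwap(&c, 8, 1) succeeds, while a second attempt with the stale expected value 8 fails.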
 
New features include:
- A major change to the Use class landed, which shrank it by 25%.  Since
this is a pervasive part of LLVM, it ended up reducing the memory use of
LLVM IR in general by 15% for most programs.
- Values with no names are now pretty printed by llvm-dis more
nicely.  They now print as "%3 = add i32 %A, 4" instead of
"add i32 %A, 4   ; <i32>:3", which makes it much easier to read.
- LLVM 2.4 includes some changes for better vector support.  First, the shift
operations (shl, ashr, lshr) now all support vectors
and do an element-by-element shift (shifts of the whole vector can be
accomplished by bitcasting the vector to <1 x i128> for example).  Second,
there is initial support in development for vector comparisons with the 
fcmp/icmp
instructions.  These instructions compare two vectors and return a vector of
i1's for each result.  Note that there is very little codegen support available
for any of these IR features though.
- A new DebugInfoBuilder class is available, which makes it much
easier for front-ends to create debug info descriptors, similar to the way that
IRBuilder makes it easier to create LLVM IR.
- The IRBuilder class is now parameterized by a class responsible
for constant folding.  The default ConstantFolder class does target independent
constant folding.  The NoFolder class does no constant folding at all, which is
useful when learning how LLVM works.  The TargetFolder class folds the most,
doing target dependent constant folding.
- LLVM now supports "function attributes", which allow us to separate return
value attributes from function attributes.  LLVM now supports attributes on a
function itself, a return value, and its parameters.  New supported function
attributes include noinline/alwaysinline and the "opt-size" flag which says the
function should be optimized for code size.
- LLVM IR now directly represents "common" linkage, instead of
    representing it as a form of weak linkage.
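To make the idea of a builder "parameterized by a class responsible for constant folding" more concrete, here is a self-contained toy model in C++. It does not use the LLVM headers, and every name in it (ToyValue, ToyBuilder, ToyConstantFolder, ToyNoFolder) is hypothetical; it only sketches the policy-class pattern, under the assumption that the folder is consulted before an instruction is created.

```cpp
#include <cassert>
#include <string>

// A toy value: either a known constant or an opaque instruction kept as text.
struct ToyValue {
  bool isConstant;
  long constant;     // meaningful only when isConstant is true
  std::string text;  // printable form of the value
};

inline ToyValue makeConstant(long v) {
  ToyValue r = {true, v, std::to_string(v)};
  return r;
}

inline ToyValue makeOpaque(const std::string &text) {
  ToyValue r = {false, 0, text};
  return r;
}

// Folding policy: an add of two constants becomes a single constant.
struct ToyConstantFolder {
  static bool foldAdd(const ToyValue &a, const ToyValue &b, ToyValue &out) {
    if (!a.isConstant || !b.isConstant)
      return false;
    out = makeConstant(a.constant + b.constant);
    return true;
  }
};

// Folding policy: never fold, so every requested instruction is emitted.
struct ToyNoFolder {
  static bool foldAdd(const ToyValue &, const ToyValue &, ToyValue &) {
    return false;
  }
};

// The builder asks its folder first and only "emits" an add if folding fails.
template <typename Folder = ToyConstantFolder>
class ToyBuilder {
public:
  ToyValue createAdd(const ToyValue &a, const ToyValue &b) const {
    ToyValue folded;
    if (Folder::foldAdd(a, b, folded))
      return folded;
    return makeOpaque("add " + a.text + ", " + b.text);
  }
};
```

With the default policy, adding the constants 2 and 3 yields the constant 5; with ToyNoFolder the same request yields the unfolded text "add 2, 3", which is the behavior that makes a no-folding builder handy when learning what a front-end actually asks for.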
 
In addition to a huge array of bug fixes and minor performance tweaks, this
release includes a few major enhancements and additions to the optimizers:
- The Global Value Numbering (GVN) pass now does local Partial Redundancy
Elimination (PRE) to eliminate some partially redundant expressions in cases
where doing so won't grow code size.
- LLVM 2.4 includes a new loop deletion pass (which removes output-free
provably-finite loops) and a rewritten Aggressive Dead Code Elimination (ADCE)
pass that no longer uses control dependence information.  These changes speed up
the optimizer and also prevent it from deleting output-free infinite
loops.
- The new AddReadAttrs pass works out which functions are read-only or
read-none (these correspond to 'pure' and 'const' in GCC) and marks them
with the appropriate attribute.
- LLVM 2.4 now includes a new SparsePropagation framework, which makes it
trivial to build lattice-based dataflow solvers that operate over LLVM IR. Using
this interface means that you just define objects to represent your lattice
values and the transfer functions that operate on them.  It handles the
mechanics of worklist processing, liveness tracking, handling PHI nodes,
etc.
- The Loop Strength Reduction and induction variable optimization passes have
several improvements to avoid inserting MAX expressions, to optimize simple
floating point induction variables and to analyze trip counts of more
loops.
- Various helper functions (ComputeMaskedBits, ComputeNumSignBits, etc) were
pulled out of the Instruction Combining pass and put into a new 
ValueTracking.h header, where they can be reused by other passes.
- The tail duplication pass has been removed from the standard optimizer
sequence used by llvm-gcc.  This pass still exists, but the benefits it once
provided are now achieved by other passes.
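The division of labor described for the SparsePropagation framework (the client supplies lattice values and transfer functions, the framework runs the worklist) can be sketched with a small standalone solver. This is not the SparsePropagation API: the types LatticeVal and Node, the functions meet, transfer and solve, and the tiny constant-propagation lattice are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Three-level lattice: Undefined (nothing known yet), Constant, Overdefined.
struct LatticeVal {
  enum Kind { Undefined, Constant, Overdefined };
  Kind kind;
  long value;  // meaningful only when kind == Constant
};

inline bool sameValue(const LatticeVal &a, const LatticeVal &b) {
  return a.kind == b.kind &&
         (a.kind != LatticeVal::Constant || a.value == b.value);
}

// Merge of two lattice values, used for PHI-like nodes.
inline LatticeVal meet(const LatticeVal &a, const LatticeVal &b) {
  if (a.kind == LatticeVal::Undefined) return b;
  if (b.kind == LatticeVal::Undefined) return a;
  if (a.kind == LatticeVal::Constant && b.kind == LatticeVal::Constant &&
      a.value == b.value)
    return a;
  LatticeVal over = {LatticeVal::Overdefined, 0};
  return over;
}

// A toy IR node; operands are indices of other nodes in the same graph.
struct Node {
  enum Op { Const, Unknown, Add, Phi };
  Op op;
  long imm;  // used by Const only
  std::vector<std::size_t> operands;
};

// Transfer function: computes a node's lattice value from its operands.
inline LatticeVal transfer(const Node &n,
                           const std::vector<LatticeVal> &state) {
  LatticeVal result = {LatticeVal::Undefined, 0};
  switch (n.op) {
  case Node::Const:
    result.kind = LatticeVal::Constant;
    result.value = n.imm;
    break;
  case Node::Unknown:  // e.g. a function argument
    result.kind = LatticeVal::Overdefined;
    break;
  case Node::Add: {
    long sum = 0;
    bool anyUndefined = false;
    for (std::size_t op : n.operands) {
      const LatticeVal &v = state[op];
      if (v.kind == LatticeVal::Overdefined) {
        result.kind = LatticeVal::Overdefined;
        return result;
      }
      if (v.kind == LatticeVal::Undefined)
        anyUndefined = true;
      else
        sum += v.value;
    }
    if (!anyUndefined) {
      result.kind = LatticeVal::Constant;
      result.value = sum;
    }
    break;
  }
  case Node::Phi:
    for (std::size_t op : n.operands)
      result = meet(result, state[op]);
    break;
  }
  return result;
}

// Generic worklist driver: re-evaluates a node's users whenever it changes.
inline std::vector<LatticeVal> solve(const std::vector<Node> &graph) {
  LatticeVal undefined = {LatticeVal::Undefined, 0};
  std::vector<LatticeVal> state(graph.size(), undefined);
  std::vector<std::vector<std::size_t> > users(graph.size());
  std::deque<std::size_t> worklist;
  for (std::size_t i = 0; i < graph.size(); ++i) {
    for (std::size_t op : graph[i].operands) {
      assert(op < graph.size());
      users[op].push_back(i);
    }
    worklist.push_back(i);
  }
  while (!worklist.empty()) {
    std::size_t n = worklist.front();
    worklist.pop_front();
    LatticeVal updated = transfer(graph[n], state);
    if (sameValue(updated, state[n]))
      continue;  // nothing new learned, users need no revisit
    state[n] = updated;
    for (std::size_t u : users[n])
      worklist.push_back(u);
  }
  return state;
}
```

In this sketch only meet and transfer know anything about constants; solve is the reusable part. For a graph where node 2 adds the constants 2 and 3, a PHI of node 2 with itself resolves to the constant 5, while a PHI of node 2 with an Unknown node becomes Overdefined.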
 
We have put a significant amount of work into the code generator infrastructure,
which allows us to implement more aggressive algorithms and make it run
faster:
- The target-independent code generator supports (and the X86 backend
    currently implements) a new interface for "fast" instruction selection. This
    interface is optimized to produce code as quickly as possible, sacrificing
    code quality to do it.  This is used by default at -O0 or when using
    "llc -fast" on X86.  It is straight-forward to add support for
    other targets if faster -O0 compilation is desired.
- In addition to the new 'fast' instruction selection path, many existing
    pieces of the code generator have been optimized in significant ways.
    SelectionDAG's are now pool allocated and use better algorithms in many
    places, the ".s" file printers now use raw_ostream to emit text much faster,
    etc.  The end result of these improvements is that the compiler also takes
    substantially less time to generate code that is just as good as (and often
    better than) before.
- Each target has been split to separate the ".s" file printing logic from the
    rest of the target.  This enables JIT compilers that don't link in the
    (somewhat large) code and data tables used for printing a ".s" file.
- The code generator now includes a "stack slot coloring" pass, which packs
    together individual spilled values into common stack slots.  This reduces
    the size of stack frames with many spills, which tends to increase L1 cache
    effectiveness.
- Various pieces of the register allocator (e.g. the coalescer and two-address
    operation elimination pass) now know how to rematerialize trivial operations
    to avoid copies and include several other optimizations.
- The graphs produced by
    the llc -view-*-dags options are now significantly prettier and
    easier to read.
- LLVM 2.4 includes a new register allocator based on Partitioned Boolean
    Quadratic Programming (PBQP).  This register allocator is still in
    development, but is very simple and clean.
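The idea behind the "stack slot coloring" pass in the list above, packing spilled values whose lifetimes do not overlap into the same slot, can be sketched as a greedy interval assignment. This is only an illustration and not the algorithm used by the LLVM pass: SpillRange, colorStackSlots and slotCount are hypothetical names, and the sketch assumes every spilled value has the same size and alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open live range [start, end) of one spilled value.
struct SpillRange {
  int start;
  int end;
};

// Returns, for each spilled value, the index of the stack slot assigned to it.
// Values are visited in order of increasing start point, and each one reuses
// the first slot that is already free at that point.
inline std::vector<int> colorStackSlots(const std::vector<SpillRange> &ranges) {
  std::vector<std::size_t> order(ranges.size());
  for (std::size_t i = 0; i < order.size(); ++i)
    order[i] = i;
  std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return ranges[a].start < ranges[b].start;
  });

  std::vector<int> slotFreeAt;  // per slot: point at which it becomes free
  std::vector<int> assignment(ranges.size(), -1);
  for (std::size_t idx : order) {
    const SpillRange &r = ranges[idx];
    assert(r.start <= r.end);
    int chosen = -1;
    for (std::size_t s = 0; s < slotFreeAt.size(); ++s) {
      if (slotFreeAt[s] <= r.start) {
        chosen = static_cast<int>(s);
        break;
      }
    }
    if (chosen < 0) {
      // No existing slot is free, so the frame grows by one slot.
      chosen = static_cast<int>(slotFreeAt.size());
      slotFreeAt.push_back(r.end);
    } else {
      slotFreeAt[chosen] = r.end;
    }
    assignment[idx] = chosen;
  }
  return assignment;
}

// Number of distinct slots used by an assignment.
inline int slotCount(const std::vector<int> &assignment) {
  int maxSlot = -1;
  for (int s : assignment)
    maxSlot = std::max(maxSlot, s);
  return maxSlot + 1;
}
```

For the ranges [0,5), [5,10) and [2,7), the first two values can share a slot because the first one is dead by the time the second starts, while the third overlaps both and needs its own slot, so two slots are used instead of three.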
 
New target-specific features include:
- Exception handling is supported by default on Linux/x86-64.
- Position Independent Code (PIC) is now supported on Linux/x86-64.
- @llvm.frameaddress now supports getting the frame address of stack frames
    > 0 on x86/x86-64.
- MIPS floating point support? [BRUNO]
- The PowerPC backend now supports trampolines.
 
New features include:
- llvmc2 (the generic compiler driver) gained plugin
    support. It is now easier to experiment with llvmc2 and
    build your own tools based on it.
- LLVM 2.4 includes a number of new generic algorithms and data structures,
    including a scoped hash table, 'immutable' data structures, a simple
    free-list manager, and a raw_ostream class.
    The raw_ostream class and
    format allow for efficient file output, and various pieces of LLVM
    have switched over to use it.   The eventual goal is to eliminate
    std::ostream in favor of it.
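For readers unfamiliar with the term, a scoped hash table is a map whose insertions are undone when the enclosing scope ends, which suits symbol tables and similar nested lookups. The following standalone C++ sketch shows the behavior only; ToyScopedTable and its member names are hypothetical and are not the interface of the LLVM class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A map whose bindings disappear, and whose shadowed bindings reappear,
// when the scope that created them is popped.
template <typename K, typename V>
class ToyScopedTable {
  std::map<K, std::vector<V> > table_;   // stack of bindings per key
  std::vector<std::vector<K> > scopes_;  // keys inserted in each open scope

public:
  void pushScope() { scopes_.push_back(std::vector<K>()); }

  void popScope() {
    assert(!scopes_.empty());
    // Undo every insertion made in the innermost scope.
    for (const K &k : scopes_.back()) {
      typename std::map<K, std::vector<V> >::iterator it = table_.find(k);
      it->second.pop_back();
      if (it->second.empty())
        table_.erase(it);
    }
    scopes_.pop_back();
  }

  void insert(const K &k, const V &v) {
    assert(!scopes_.empty());
    table_[k].push_back(v);
    scopes_.back().push_back(k);
  }

  // Returns the innermost binding for `k`, or a null pointer if there is none.
  const V *lookup(const K &k) const {
    typename std::map<K, std::vector<V> >::const_iterator it = table_.find(k);
    return it == table_.end() ? nullptr : &it->second.back();
  }
};

// Example instantiation: names bound to integer ids.
typedef ToyScopedTable<std::string, int> ToySymbolTable;
```

Binding "x" to 1 in an outer scope and to 2 in an inner scope makes lookups return 2 until the inner scope is popped, after which they return 1 again; once the outer scope is popped the lookup returns a null pointer.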
 
If you're already an LLVM user or developer with out-of-tree changes based
on LLVM 2.3, this section lists some "gotchas" that you may run into upgrading
from the previous release.
- The LLVM IR generated by llvm-gcc no longer names all instructions.  This
    makes it run faster, but may be more confusing to some people.  If you
    prefer to have names, the 'opt -instnamer' pass will add names to
    all instructions.
- The LoadVN and GCSE passes have been removed from the tree.  They are
    obsolete and have been replaced with the GVN and MemoryDependence passes.
    
In addition, many APIs have changed in this release.  Some of the major LLVM
API changes are:
- Function attributes and return value attributes are now managed
separately. The interface exported by the ParameterAttributes.h header is now
exported by the Attributes.h header. The new attributes interface changes are:
  - The getParamAttrs method is now replaced by the
  getParamAttributes, getRetAttributes and
  getFnAttributes methods.
  - Return value attributes are stored at index 0. Function attributes are
  stored at index ~0U. Parameter attributes are stored at the index that matches
  the parameter number.
  - The ParamAttr namespace is now renamed to Attribute.
  - The name of the class that manages the reference count of opaque
  attributes is changed from PAListPtr to AttrListPtr.
  - ParamAttrsWithIndex is now renamed to AttributeWithIndex.
 
- The DbgStopPointInst methods getDirectory and
getFileName now return Value* instead of strings. These can be
converted to strings using llvm::GetConstantStringInfo defined via
"llvm/Analysis/ValueTracking.h".
- The APIs to create various instructions have changed from lower case
   "create" methods to upper case "Create" methods (e.g.
   BinaryOperator::create is now BinaryOperator::Create).  LLVM 2.4 includes
   both cases, but the lower case ones are removed in mainline, so please migrate.
- Various header files like "llvm/ADT/iterator" were given a ".h" suffix.
    Change your code to #include "llvm/ADT/iterator.h" instead.
- In the code generator, many MachineOperand predicates were renamed to be
    shorter (e.g. isFrameIndex() -> isFI()),
    SDOperand was renamed to SDValue (and the "Val"
    member was changed to be the getNode() accessor), and the
    MVT::ValueType enum has been replaced with an "MVT"
    struct. The getSignExtended and getValue methods in the
    ConstantSDNode class were renamed to getSExtValue and
    getZExtValue respectively, to be more consistent with
    the ConstantInt class.
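The attribute indexing scheme described in the list above (return value at index 0, function at index ~0U, parameters at their own number) can be modeled with a few lines of standalone C++. This is a toy model, not the Attributes.h interface: ToyAttrList, the ToyAttribute flags and the two index constants are hypothetical, the accessor names merely echo the ones listed above, and the sketch assumes parameters are numbered from 1, since 0 is taken by the return value.

```cpp
#include <cassert>
#include <map>

// Index conventions taken from the text: 0 for the return value,
// ~0U for the function itself, and the parameter number otherwise.
const unsigned kReturnIndex = 0;
const unsigned kFunctionIndex = ~0U;

// Toy attribute flags, combined as a bit mask.
enum ToyAttribute : unsigned {
  ToyNone = 0,
  ToyZExt = 1u << 0,
  ToyNoAlias = 1u << 1,
  ToyNoInline = 1u << 2,
  ToyReadOnly = 1u << 3
};

class ToyAttrList {
  std::map<unsigned, unsigned> slots_;  // index -> attribute bit mask

  unsigned get(unsigned index) const {
    std::map<unsigned, unsigned>::const_iterator it = slots_.find(index);
    return it == slots_.end() ? static_cast<unsigned>(ToyNone) : it->second;
  }

public:
  void add(unsigned index, unsigned attrs) { slots_[index] |= attrs; }

  unsigned getRetAttributes() const { return get(kReturnIndex); }
  unsigned getFnAttributes() const { return get(kFunctionIndex); }
  unsigned getParamAttributes(unsigned paramNo) const {
    // Parameter numbers must not collide with the two reserved indices.
    assert(paramNo != kReturnIndex && paramNo != kFunctionIndex);
    return get(paramNo);
  }
};
```

Because the three kinds of attributes live at disjoint indices, a single list can hold, for instance, a zero-extension flag on the return value, a no-alias flag on the first parameter and no-inline plus read-only on the function itself, and each accessor sees only its own slot.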
 
LLVM is known to work on the following platforms:
- Intel and AMD machines (IA32) running Red Hat Linux, Fedora Core and FreeBSD
      (and probably other unix-like systems).
- PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit and
    64-bit modes.
- Intel and AMD machines running on Win32 using MinGW libraries (native).
- Intel and AMD machines running on Win32 with the Cygwin libraries (limited
    support is available for native builds with Visual C++).
- Sun UltraSPARC workstations running Solaris 10.
- Alpha-based machines running Debian GNU/Linux.
- Itanium-based (IA64) machines running Linux and HP-UX.
The core LLVM infrastructure uses GNU autoconf to adapt itself
to the machine and operating system on which it is built.  However, minor
porting may be required to get LLVM to work on new platforms.  We welcome your
portability patches and reports of successful builds or error messages.
 
This section contains all known problems with the LLVM system, listed by
component.  As new problems are discovered, they will be added to these
sections.  If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.
 
The following components of this LLVM release are either untested, known to
be broken or unreliable, or are in early development.  These components should
not be relied on, and bugs should not be filed against them, but they may be
useful to some people.  In particular, if you would like to work on one of these
components, please contact us on the LLVMdev list.
- The MSIL, IA64, Alpha, SPU, MIPS, and PIC16 backends are experimental.
- The llc "-filetype=asm" (the default) is the only supported
    value for this option.
 
- The X86 backend does not yet support
    all inline assembly that uses the X86
    floating point stack.  It supports the 'f' and 't' constraints, but not
    'u'.
- The X86 backend generates inefficient floating point code when configured
    to generate code for systems that don't have SSE2.
- Win64 code generation has not been widely tested. Everything should work, but
    we expect small issues to happen. Also, llvm-gcc cannot currently build the
    mingw64 runtime due to several bugs caused by the lack of support for the
    'u' inline assembly constraint and X87 floating point inline assembly.
- The X86-64 backend does not yet support the LLVM IR instruction
      va_arg. Currently, the llvm-gcc front-end supports variadic
      argument constructs on X86-64 by lowering them manually.
 
- The Linux PPC32/ABI support needs testing for the interpreter and static
compilation, and lacks support for debug information.
 
- Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6
processors, thumb programs can crash or produce wrong
results (PR1388).
- Compilation for ARM Linux OABI (old ABI) is supported, but not fully tested.
- There is a bug in QEMU-ARM (<= 0.9.0) which causes it to incorrectly execute
programs compiled with LLVM.  Please use more recent versions of QEMU.
 
- The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not
    support the 64-bit SPARC ABI (-m64).
 
- On 21164s, some rare FP arithmetic sequences which may trap do not have the
appropriate nops inserted to ensure restartability.
 
- The Itanium backend is highly experimental, and has a number of known
    issues.  We are looking for a maintainer for the Itanium backend.  If you
    are interested, please contact the LLVMdev mailing list.
 
llvm-gcc does not currently support Link-Time
Optimization on most platforms "out-of-the-box".  Please inquire on the
LLVMdev mailing list if you are interested.
The only major language feature of GCC not supported by llvm-gcc is
    the __builtin_apply family of builtins.   However, some extensions
    are only supported on some targets.  For example, trampolines are only
    supported on some targets (these are used when you take the address of a
    nested function).
If you run into GCC extensions which are not supported, please let us know.
 
The C++ front-end is considered to be fully
tested and works for a number of non-trivial programs, including LLVM
itself, Qt, Mozilla, etc.
- Exception handling works well on the X86 and PowerPC targets. Currently
  only Linux and Darwin targets are supported (both 32 and 64 bit).
 
The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature
technology and problems should be expected.
- The Ada front-end currently only builds on X86-32.  This is mainly due
to lack of trampoline support (pointers to nested functions) on other platforms;
however, it also fails to build on X86-64,
which does support trampolines.
- The Ada front-end fails to bootstrap.
Workaround: configure with --disable-bootstrap.
- The c380004, c393010
and cxg2021 ACATS tests fail
(c380004 also fails with gcc-4.2 mainline).
- Some gcc specific Ada tests continue to crash the compiler.
- The -E binder option (exception backtraces)
does not work and will result in programs
crashing if an exception is raised.  Workaround: do not use -E.
- Only discrete types are allowed to start
or finish at a non-byte offset in a record.  Workaround: do not pack records
or use representation clauses that result in a field of a non-discrete type
starting or finishing in the middle of a byte.
- The lli interpreter considers
'main' as generated by the Ada binder to be invalid.
Workaround: hand edit the file to use pointers for argv and
envp rather than integers.
- The -fstack-check option is
ignored.
A wide variety of additional information is available on the LLVM web page, in
particular in the documentation section.  The web page also
contains versions of the API documentation which are up-to-date with the
Subversion version of the source code.
You can access versions of these documents specific to this release by going
into the "llvm/doc/" directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact
us via the mailing lists.
 
   
   LLVM Compiler Infrastructure
  Last modified: $Date$