mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-07-24 22:24:54 +00:00
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_32@170526 91177308-0d34-0410-b5e6-96231b3b80d8
976 lines
37 KiB
HTML
976 lines
37 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
|
|
"http://www.w3.org/TR/html4/strict.dtd">
|
|
<html>
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
<link rel="stylesheet" href="_static/llvm.css" type="text/css">
|
|
<title>LLVM 3.2 Release Notes</title>
|
|
</head>
|
|
<body>
|
|
|
|
<h1>LLVM 3.2 Release Notes</h1>
|
|
|
|
<div>
|
|
<img style="float:right" src="http://llvm.org/img/DragonSmall.png"
|
|
width="136" height="136" alt="LLVM Dragon Logo">
|
|
</div>
|
|
|
|
<ol>
|
|
<li><a href="#intro">Introduction</a></li>
|
|
<li><a href="#subproj">Sub-project Status Update</a></li>
|
|
<li><a href="#externalproj">External Projects Using LLVM 3.2</a></li>
|
|
<li><a href="#whatsnew">What's New in LLVM?</a></li>
|
|
<li><a href="GettingStarted.html">Installation Instructions</a></li>
|
|
<li><a href="#knownproblems">Known Problems</a></li>
|
|
<li><a href="#additionalinfo">Additional Information</a></li>
|
|
</ol>
|
|
|
|
<div class="doc_author">
|
|
<p>Written by the <a href="http://llvm.org/">LLVM Team</a></p>
|
|
</div>
|
|
|
|
<!-- *********************************************************************** -->
|
|
<h2>
|
|
<a name="intro">Introduction</a>
|
|
</h2>
|
|
<!-- *********************************************************************** -->
|
|
|
|
<div>
|
|
|
|
<p>This document contains the release notes for the LLVM Compiler
|
|
Infrastructure, release 3.2. Here we describe the status of LLVM, including
|
|
major improvements from the previous release, improvements in various
|
|
sub-projects of LLVM, and some of the current users of the code. All LLVM
|
|
releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM
|
|
releases web site</a>.</p>
|
|
|
|
<p>For more information about LLVM, including information about the latest
|
|
release, please check out the <a href="http://llvm.org/">main LLVM web
|
|
site</a>. If you have questions or comments,
|
|
the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVM
|
|
Developer's Mailing List</a> is a good place to send them.</p>
|
|
|
|
<p>Note that if you are reading this file from a Subversion checkout or the main
|
|
LLVM web page, this document applies to the <i>next</i> release, not the
|
|
current one. To see the release notes for a specific release, please see the
|
|
<a href="http://llvm.org/releases/">releases page</a>.</p>
|
|
|
|
</div>
|
|
|
|
|
|
<!-- *********************************************************************** -->
|
|
<h2>
|
|
<a name="subproj">Sub-project Status Update</a>
|
|
</h2>
|
|
<!-- *********************************************************************** -->
|
|
|
|
<div>
|
|
|
|
<p>The LLVM 3.2 distribution currently consists of production-quality code
|
|
from the core LLVM repository, which roughly includes the LLVM optimizers,
|
|
code generators and supporting tools, as well as Clang, DragonEgg and
|
|
compiler-rt sub-project repositories. In addition to this code, the LLVM
|
|
Project includes other sub-projects that are in development. Here we
|
|
include updates on these sub-projects.</p>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="clang">Clang: C/C++/Objective-C Frontend Toolkit</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://clang.llvm.org/">Clang</a> is an LLVM front end for the C,
|
|
C++, and Objective-C languages. Clang aims to provide a better user
|
|
experience through expressive diagnostics, a high level of conformance to
|
|
language standards, fast compilation, and low memory use. Like LLVM, Clang
|
|
provides a modular, library-based architecture that makes it suitable for
|
|
creating or integrating with other development tools.</p>
|
|
|
|
<p>In the LLVM 3.2 time-frame, the Clang team has made many improvements.
|
|
Highlights include:</p>
|
|
<ul>
|
|
<li>Improvements to Clang's diagnostics</li>
|
|
<li>Support for tls_model attribute</li>
|
|
<li>Type safety attributes</li>
|
|
</ul>
|
|
|
|
<p>For more details about the changes to Clang since the 3.1 release, see the
|
|
<a href="http://llvm.org/releases/3.2/tools/clang/docs/ReleaseNotes.html">Clang 3.2 release
|
|
notes.</a></p>
|
|
|
|
<p>If Clang rejects your code but another compiler accepts it, please take a
|
|
look at the <a href="http://clang.llvm.org/compatibility.html">language
|
|
compatibility</a> guide to make sure this is not intentional or a known
|
|
issue.</p>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="dragonegg">DragonEgg: GCC front-ends, LLVM back-end</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://dragonegg.llvm.org/">DragonEgg</a> is a
|
|
<a href="http://gcc.gnu.org/wiki/plugins">gcc plugin</a> that replaces GCC's
|
|
optimizers and code generators with LLVM's. It works with gcc-4.5 and gcc-4.6
|
|
(and partially with gcc-4.7), can target the x86-32/x86-64 and ARM processor
|
|
families, and has been successfully used on the Darwin, FreeBSD, KFreeBSD,
|
|
Linux and OpenBSD platforms. It fully supports Ada, C, C++ and Fortran. It
|
|
has partial support for Go, Java, Obj-C and Obj-C++.</p>
|
|
|
|
<p>The 3.2 release has the following notable changes:</p>
|
|
|
|
<ul>
|
|
<li>Able to load LLVM plugins such as Polly.</li>
|
|
<li>Supports thread-local storage models.</li>
|
|
<li>Passes knowledge of variable lifetimes to the LLVM optimizers.</li>
|
|
<li>No longer requires GCC to be built with LTO support.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="compiler-rt">compiler-rt: Compiler Runtime Library</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
|
|
<p>The LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>
|
|
is a simple library that provides an implementation of the low-level
|
|
target-specific hooks required by code generation and other runtime
|
|
components. For example, when compiling for a 32-bit target, converting a
|
|
double to a 64-bit unsigned integer is compiled into a runtime call to the
|
|
<code>__fixunsdfdi</code> function. The compiler-rt library provides highly
|
|
optimized implementations of this and other low-level routines (some are 3x
|
|
faster than the equivalent libgcc routines).</p>
|
|
|
|
<p>The 3.2 release has the following notable changes:</p>
|
|
|
|
<ul>
|
|
<li><a href="http://llvm.org/releases/3.2/tools/clang/docs/ThreadSanitizer.html">ThreadSanitizer (TSan)</a> - data race detector run-time library for C/C++ has been added.</li>
|
|
<li>Improvements to <a href="http://llvm.org/releases/3.2/tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a> including: better portability
|
|
(OSX, Android NDK), support for cmake based builds, enhanced error reporting and lots of bug fixes.</li>
|
|
<li>Added support for A6 'Swift' CPU.</li>
|
|
<li><code>divsi3</code> function has been enhanced to take advantage of a hardware unsigned divide when it is available.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="lldb">LLDB: Low Level Debugger</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://lldb.llvm.org">LLDB</a> is a ground-up implementation of a
|
|
command line debugger, as well as a debugger API that can be used from other
|
|
applications. LLDB makes use of the Clang parser to provide high-fidelity
|
|
expression parsing (particularly for C++) and uses the LLVM JIT for target
|
|
support.</p>
|
|
|
|
<p>The 3.2 release has the following notable changes:</p>
|
|
|
|
<ul>
|
|
<li>Linux build fixes for clang (see <a href="http://lldb.llvm.org/build.html">Building LLDB</a>)</li>
|
|
<li>Some Linux stability and usability improvements</li>
|
|
<li>Switch expression evaluation to use MCJIT (from legacy JIT) on Linux</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="libc++">libc++: C++ Standard Library</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>Like compiler_rt, libc++ is now <a href="DeveloperPolicy.html#license">dual
|
|
licensed</a> under the MIT and UIUC license, allowing it to be used more
|
|
permissively.</p>
|
|
|
|
<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>
|
|
|
|
<ul>
|
|
<li> C++11 shared_ptr atomic access API (20.7.2.5) has been implemented.</li>
|
|
<li>Applied noexcept and constexpr throughout library.</li>
|
|
<li>Improved C++11 conformance in associative container emplace.</li>
|
|
<li>Performance improvements in: std::rotate algorithm and I/O.</li>
|
|
<li>Operator new/delete and type_infos for exception types moved from libc++ to libc++abi.</li>
|
|
<li>Bug fixes in: <code><atomic></code>; vector<code><bool></code> algorithms,
|
|
<code><future></code>,<code><tuple></code>,
|
|
<code><type_traits></code>,<code><fstream></code>,<code><istream></code>,
|
|
<code><iterator></code>, <code><condition_variable></code>,<code><complex></code> as well as visibility fixes.
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="vmkit">VMKit</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>The <a href="http://vmkit.llvm.org/">VMKit project</a> is an implementation
|
|
of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
|
|
just-in-time compilation.</p>
|
|
|
|
<p>The 3.2 release has the following notable changes:</p>
|
|
|
|
<ul>
|
|
<li>Bug fixes only, no functional changes.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="Polly">Polly: Polyhedral Optimizer</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em>
|
|
optimizer for data locality and parallelism. It currently provides high-level
|
|
loop optimizations and automatic parallelization (using the OpenMP run time).
|
|
Work in the area of automatic SIMD and accelerator code generation was
|
|
started.</p>
|
|
|
|
<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>
|
|
|
|
<ul>
|
|
<li>isl, the integer set library used by Polly, was relicensed under the MIT license.</li>
|
|
<li>isl based code generation.</li>
|
|
<li>MIT licensed replacement for CLooG (LGPLv2).</li>
|
|
<li>Fine grained option handling (separation of core and border computations, control overhead vs. code size).</li>
|
|
<li>Support for FORTRAN and Dragonegg.</li>
|
|
<li>OpenMP code generation fixes.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="StaticAnalyzer">Clang Static Analyzer</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>The <a href="http://clang-analyzer.llvm.org/">Clang Static Analyzer</a>
|
|
is an advanced source code analysis tool integrated into Clang that performs
|
|
a deep analysis of code to find potential bugs.</p>
|
|
|
|
<p>In the LLVM 3.2 release, the static analyzer has made significant improvements
|
|
in many areas, with notable highlights such as:</p>
|
|
|
|
<ul>
|
|
<li>Improved interprocedural analysis within a translation unit (see details below), which greatly amplified the analyzer's ability to find bugs.</li>
|
|
<li>New infrastructure to model "well-known" APIs, allowing the analyzer to do a much better job when modeling calls to such functions.</li>
|
|
<li>Significant improvements to the APIs to write static analyzer checkers, with a more unified way of representing function/method calls in the checker API. Details can be found in the <a href="http://llvm.org/devmtg/2012-11#talk13">Building a Checker in 24 hours</a> talk.
|
|
</ul>
|
|
|
|
<p>The release specifically includes notable improvements for Objective-C analysis, including:</p>
|
|
|
|
<ul>
|
|
<li>Interprocedural analysis for Objective-C methods.</li>
|
|
<li>Interprocedural analysis of calls to "blocks".</li>
|
|
<li>Precise modeling of GCD APIs such as <tt>dispatch_once</tt> and friends.</li>
|
|
<li>Improved support for recently added Objective-C constructs such as array and dictionary literals.</li>
|
|
</ul>
|
|
|
|
<p>The release specifically includes notable improvements for C++ analysis, including:</p>
|
|
|
|
<ul>
|
|
<li>Interprocedural analysis for C++ methods (within a translation unit).</li>
|
|
<li>More precise modeling of C++ initializers and destructors.</li>
|
|
</ul>
|
|
|
|
<p>Finally, this release includes many small improvements to <tt>scan-build</tt>, which can be used to drive the analyzer from the command line or a continuous integration system. This includes a directory-traversal issue, which could cause potential security problems in some cases. We would like to acknowledge Tim Brown of Portcullis Computer Security Ltd for reporting this issue.</p>
|
|
|
|
</div>
|
|
|
|
</div>
|
|
|
|
<!-- *********************************************************************** -->
|
|
<h2>
|
|
<a name="externalproj">External Open Source Projects Using LLVM 3.2</a>
|
|
</h2>
|
|
<!-- *********************************************************************** -->
|
|
|
|
<div>
|
|
|
|
<p>An exciting aspect of LLVM is that it is used as an enabling technology for
|
|
a lot of other language and tools projects. This section lists some of the
|
|
projects that have already been updated to work with LLVM 3.2.</p>
|
|
|
|
<h3>Crack</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://code.google.com/p/crack-language/">Crack</a> aims to provide
|
|
the ease of development of a scripting language with the performance of a
|
|
compiled language. The language derives concepts from C++, Java and Python,
|
|
incorporating object-oriented programming, operator overloading and strong
|
|
typing.</p>
|
|
|
|
</div>
|
|
|
|
<h3>EmbToolkit</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://www.embtoolkit.org/">EmbToolkit</a> provides Linux cross-compiler
|
|
toolchain/SDK (GCC/binutils/C library (uclibc,eglibc,musl)), a build system for
|
|
package cross-compilation and optionally various root file systems.
|
|
It supports ARM and MIPS. There is an ongoing effort to provide a clang+llvm
|
|
environment for the 3.2 releases,
|
|
</p>
|
|
|
|
</div>
|
|
|
|
<h3>FAUST</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://faust.grame.fr/">FAUST</a> is a compiled language for
|
|
real-time audio signal processing. The name FAUST stands for Functional
|
|
AUdio STream. Its programming model combines two approaches: functional
|
|
programming and block diagram composition. In addition with the C, C++, Java,
|
|
JavaScript output formats, the Faust compiler can generate LLVM bitcode, and
|
|
works with LLVM 2.7-3.2.</p>
|
|
|
|
</div>
|
|
|
|
<h3>Glasgow Haskell Compiler (GHC)</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://www.haskell.org/ghc/">GHC</a> is an open source compiler and
|
|
programming suite for Haskell, a lazy functional programming language. It
|
|
includes an optimizing static compiler generating good code for a variety of
|
|
platforms, together with an interactive system for convenient, quick
|
|
development.</p>
|
|
|
|
<p>GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and
|
|
later.</p>
|
|
|
|
</div>
|
|
|
|
<h3>Julia</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="https://github.com/JuliaLang/julia">Julia</a> is a high-level,
|
|
high-performance dynamic language for technical computing. It provides a
|
|
sophisticated compiler, distributed parallel execution, numerical accuracy,
|
|
and an extensive mathematical function library. The compiler uses type
|
|
inference to generate fast code without any type declarations, and uses
|
|
LLVM's optimization passes and JIT compiler. The
|
|
<a href="http://julialang.org/"> Julia Language</a> is designed
|
|
around multiple dispatch, giving programs a large degree of flexibility. It
|
|
is ready for use on many kinds of problems.</p>
|
|
|
|
</div>
|
|
|
|
<h3>LLVM D Compiler</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="https://github.com/ldc-developers/ldc">LLVM D Compiler</a> (LDC) is
|
|
a compiler for the D programming Language. It is based on the DMD frontend
|
|
and uses LLVM as backend.</p>
|
|
|
|
</div>
|
|
|
|
<h3>Open Shading Language</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="https://github.com/imageworks/OpenShadingLanguage/">Open Shading
|
|
Language (OSL)</a> is a small but rich language for programmable shading in
|
|
advanced global illumination renderers and other applications, ideal for
|
|
describing materials, lights, displacement, and pattern generation. It uses
|
|
LLVM to JIT complex shader networks to x86 code at runtime.</p>
|
|
|
|
<p>OSL was developed by Sony Pictures Imageworks for use in its in-house
|
|
renderer used for feature film animation and visual effects, and is
|
|
distributed as open source software with the "New BSD" license.
|
|
It has been used for all the shading on such films as The Amazing Spider-Man,
|
|
Men in Black III, Hotel Transylvania, and may other films in-progress,
|
|
and also has been incorporated into several commercial and open source
|
|
rendering products such as Blender, VRay, and Autodesk Beast.</p>
|
|
|
|
</div>
|
|
|
|
<h3>Portable OpenCL (pocl)</h3>
|
|
|
|
<div>
|
|
|
|
<p>In addition to producing an easily portable open source OpenCL
|
|
implementation, another major goal of <a href="http://pocl.sourceforge.net/">
|
|
pocl</a> is improving performance portability of OpenCL programs with
|
|
compiler optimizations, reducing the need for target-dependent manual
|
|
optimizations. An important part of pocl is a set of LLVM passes used to
|
|
statically parallelize multiple work-items with the kernel compiler, even in
|
|
the presence of work-group barriers. This enables static parallelization of
|
|
the fine-grained static concurrency in the work groups in multiple ways
|
|
(SIMD, VLIW, superscalar,...).</p>
|
|
|
|
</div>
|
|
|
|
<h3>Pure</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://pure-lang.googlecode.com/">Pure</a> is an
|
|
algebraic/functional programming language based on term rewriting. Programs
|
|
are collections of equations which are used to evaluate expressions in a
|
|
symbolic fashion. The interpreter uses LLVM as a backend to JIT-compile Pure
|
|
programs to fast native code. Pure offers dynamic typing, eager and lazy
|
|
evaluation, lexical closures, a hygienic macro system (also based on term
|
|
rewriting), built-in list and matrix support (including list and matrix
|
|
comprehensions) and an easy-to-use interface to C and other programming
|
|
languages (including the ability to load LLVM bitcode modules, and inline C,
|
|
C++, Fortran and Faust code in Pure programs if the corresponding
|
|
LLVM-enabled compilers are installed).</p>
|
|
|
|
<p>Pure version 0.56 has been tested and is known to work with LLVM 3.2 (and
|
|
continues to work with older LLVM releases >= 2.5).</p>
|
|
|
|
</div>
|
|
|
|
<h3>TTA-based Co-design Environment (TCE)</h3>
|
|
|
|
<div>
|
|
|
|
<p><a href="http://tce.cs.tut.fi/">TCE</a> is a toolset for designing
|
|
application-specific processors (ASP) based on the Transport triggered
|
|
architecture (TTA). The toolset provides a complete co-design flow from C/C++
|
|
programs down to synthesizable VHDL/Verilog and parallel program binaries.
|
|
Processor customization points include the register files, function units,
|
|
supported operations, and the interconnection network.</p>
|
|
|
|
<p>TCE uses Clang and LLVM for C/C++ language support, target independent
|
|
optimizations and also for parts of code generation. It generates new
|
|
LLVM-based code generators "on the fly" for the designed TTA processors and
|
|
loads them in to the compiler backend as runtime libraries to avoid
|
|
per-target recompilation of larger parts of the compiler chain.</p>
|
|
|
|
</div>
|
|
|
|
</div>
|
|
|
|
<!-- *********************************************************************** -->
|
|
<h2>
|
|
<a name="whatsnew">What's New in LLVM 3.2?</a>
|
|
</h2>
|
|
<!-- *********************************************************************** -->
|
|
|
|
<div>
|
|
|
|
<p>This release includes a huge number of bug fixes, performance tweaks and
|
|
minor improvements. Some of the major improvements and new features are
|
|
listed in this section.</p>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="majorfeatures">Major New Features</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<!-- Features that need text if they're finished for 3.2:
|
|
ARM EHABI
|
|
combiner-aa?
|
|
strong phi elim
|
|
loop dependence analysis
|
|
CorrelatedValuePropagation
|
|
lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.2.
|
|
Integrated assembler on by default for arm/thumb?
|
|
|
|
-->
|
|
|
|
<!-- Near dead:
|
|
Analysis/RegionInfo.h + Dom Frontiers
|
|
SparseBitVector: used in LiveVar.
|
|
llvm/lib/Archive - replace with lib object?
|
|
-->
|
|
|
|
<p>LLVM 3.2 includes several major changes and big features:</p>
|
|
|
|
<ul>
|
|
<li>Loop Vectorizer.</li>
|
|
<li>New implementation of SROA.</li>
|
|
<li>New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="coreimprovements">LLVM IR and Core Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>LLVM IR has several new features for better support of new targets and that
|
|
expose new optimization opportunities:</p>
|
|
|
|
<ul>
|
|
<li>Thread local variables may have a specified TLS model. See the
|
|
<a href="LangRef.html#globalvars">Language Reference Manual</a>.</li>
|
|
<li>'TYPE_CODE_FUNCTION_OLD' type code and autoupgrade code for old function attributes format has been removed.</li>
|
|
<li>Internal representation of the Attributes class has been converted into a pointer to an
|
|
opaque object that's uniqued by and stored in the LLVMContext object.
|
|
The Attributes class then becomes a thin wrapper around this opaque object.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="optimizer">Optimizer Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>In addition to many minor performance tweaks and bug fixes, this release
|
|
includes a few major enhancements and additions to the optimizers:</p>
|
|
|
|
<p> Loop Vectorizer - We've added a loop vectorizer and we are now able to
|
|
vectorize small loops. The loop vectorizer is disabled by default and
|
|
can be enabled using the <b>-mllvm -vectorize-loops</b> flag.
|
|
The SIMD vector width can be specified using the flag
|
|
<b>-mllvm -force-vector-width=4</b>.
|
|
The default value is <b>0</b> which means auto-select.
|
|
<br/>
|
|
We can now vectorize this function:
|
|
|
|
<pre class="doc_code">
|
|
unsigned sum_arrays(int *A, int *B, int start, int end) {
|
|
unsigned sum = 0;
|
|
for (int i = start; i < end; ++i)
|
|
sum += A[i] + B[i] + i;
|
|
|
|
return sum;
|
|
}
|
|
</pre>
|
|
|
|
We vectorize under the following loops:
|
|
<ul>
|
|
<li>The inner most loops must have a single basic block.</li>
|
|
<li>The number of iterations are known before the loop starts to execute.</li>
|
|
<li>The loop counter needs to be incremented by one.</li>
|
|
<li>The loop trip count <b>can</b> be a variable.</li>
|
|
<li>Loops do <b>not</b> need to start at zero.</li>
|
|
<li>The induction variable can be used inside the loop.</li>
|
|
<li>Loop reductions are supported.</li>
|
|
<li>Arrays with affine access pattern do <b>not</b> need to be marked as 'noalias' and are checked at runtime.</li>
|
|
</ul>
|
|
|
|
</p>
|
|
|
|
<p>SROA - We’ve re-written SROA to be significantly more powerful and generate
|
|
code which is much more friendly to the rest of the optimization pipeline.
|
|
Previously this pass had scaling problems that required it to only operate on
|
|
relatively small aggregates, and at times it would mistakenly replace a large
|
|
aggregate with a single very large integer in order to make it a scalar SSA
|
|
value. The result was a large number of i1024 and i2048 values representing any
|
|
small stack buffer. These in turn slowed down many subsequent optimization
|
|
paths.</p>
|
|
<p>The new SROA pass uses a different algorithm that allows it to only promote to
|
|
scalars the pieces of the aggregate actively in use. Because of this it doesn’t
|
|
require any thresholds. It also always deduces the scalar values from the uses
|
|
of the aggregate rather than the specific LLVM type of the aggregate. These
|
|
features combine to both optimize more code with the pass but to improve the
|
|
compile time of many functions dramatically.</p>
|
|
|
|
<ul>
|
|
<li>Branch weight metadata is preserved through more of the optimizer.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="mc">MC Level Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>The LLVM Machine Code (aka MC) subsystem was created to solve a number of
|
|
problems in the realm of assembly, disassembly, object file format handling,
|
|
and a number of other related areas that CPU instruction-set level tools work
|
|
in. For more information, please see the
|
|
<a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro
|
|
to the LLVM MC Project Blog Post</a>.</p>
|
|
|
|
<ul>
|
|
<li> Added support for following assembler directives: <code>.ifb</code>, <code>.ifnb</code>, <code>.ifc</code>,
|
|
<code>.ifnc</code>, <code>.purgem</code>, <code>.rept</code> and <code>.version</code> (ELF) as well as Darwin specific
|
|
<code>.pushsection</code>, <code>.popsection</code> and <code>.previous</code> .</li>
|
|
<li>Enhanced handling of <code>.lcomm directive</code>.</li>
|
|
<li>MS style inline assembler: added implementation of the offset and TYPE operators.</li>
|
|
<li>Targets can specify minimum supported NOP size for NOP padding.</li>
|
|
<li>ELF improvements: added support for generating ELF objects on Windows.</li>
|
|
<li>MachO improvements: symbol-difference variables are marked as N_ABS, added direct-to-object attribute for data-in-code markers.</li>
|
|
<li>Added support for annotated disassembly output for x86 and arm targets.</li>
|
|
<li>Arm support has been improved by adding support for ARM TARGET2 relocation
|
|
and fixing hadling of ARM-style "$d.*" labels.</li>
|
|
<li>Implemented local-exec TLS on PowerPC.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="codegen">Target Independent Code Generator Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>Stack Coloring - We have implemented a new optimization pass
|
|
to merge stack objects which are used in disjoin areas of the code.
|
|
This optimization reduces the required stack space significantly, in cases
|
|
where it is clear to the optimizer that the stack slot is not shared.
|
|
We use the lifetime markers to tell the codegen that a certain alloca
|
|
is used within a region.</p>
|
|
|
|
<p> We now merge consecutive loads and stores. </p>
|
|
|
|
<p>We have put a significant amount of work into the code generator
|
|
infrastructure, which allows us to implement more aggressive algorithms and
|
|
make it run faster:</p>
|
|
|
|
<p> We added new TableGen infrastructure to support bundling for
|
|
Very Long Instruction Word (VLIW) architectures. TableGen can now
|
|
automatically generate a deterministic finite automaton from a VLIW
|
|
target's schedule description which can be queried to determine
|
|
legal groupings of instructions in a bundle.</p>
|
|
|
|
<p> We have added a new target independent VLIW packetizer based on the
|
|
DFA infrastructure to group machine instructions into bundles.</p>
|
|
|
|
<p> We have added new TableGen infrastructure to support relationship maps
|
|
between instructions. This feature enables TableGen to automatically
|
|
construct a set of relation tables and query functions that can be used
|
|
to switch between various forms of instructions. For more information,
|
|
please refer to <a href="http://llvm.org/docs/HowToUseInstrMappings.html">
|
|
How To Use Instruction Mappings</a>.</p>
|
|
|
|
</div>
|
|
|
|
<h4>
|
|
<a name="blockplacement">Basic Block Placement</a>
|
|
</h4>
|
|
|
|
<div>
|
|
|
|
<p>A probability based block placement and code layout algorithm was added to
|
|
LLVM's code generator. This layout pass supports probabilities derived from
|
|
static heuristics as well as source code annotations such as
|
|
<code>__builtin_expect</code>.</p>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="x86">X86-32 and X86-64 Target Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>New features and major changes in the X86 target include:</p>
|
|
|
|
<ul>
|
|
<li>Small codegen optimizations, especially for AVX2.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="ARM">ARM Target Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>New features of the ARM target include:</p>
|
|
|
|
<ul>
|
|
<li>Support and performance tuning for the A6 'Swift' CPU.</li>
|
|
</ul>
|
|
|
|
<!--_________________________________________________________________________-->
|
|
|
|
<h4>
|
|
<a name="armintegratedassembler">ARM Integrated Assembler</a>
|
|
</h4>
|
|
|
|
<div>
|
|
|
|
<p>The ARM target now includes a full featured macro assembler, including
|
|
direct-to-object module support for clang. The assembler is currently enabled
|
|
by default for Darwin only pending testing and any additional necessary
|
|
platform specific support for Linux.</p>
|
|
|
|
<p>Full support is included for Thumb1, Thumb2 and ARM modes, along with
|
|
sub-target and CPU specific extensions for VFP2, VFP3 and NEON.</p>
|
|
|
|
<p>The assembler is Unified Syntax only (see ARM Architecural Reference Manual
|
|
for details). While there is some, and growing, support for pre-unfied
|
|
(divided) syntax, there are still significant gaps in that support.</p>
|
|
|
|
</div>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="MIPS">MIPS Target Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>New features and major changes in the MIPS target include:</p>
|
|
|
|
<ul>
|
|
<li>Integrated assembler support:
|
|
MIPS32 works for both PIC and static, known limitation is the PR14456 where
|
|
R_MIPS_GPREL16 relocation is generated with the wrong addend.
|
|
MIPS64 support is incomplete, for example exception handling is not working.</li>
|
|
<li>Support for fast calling convention has been added.</li>
|
|
<li>Support for Android MIPS toolchain has been added to clang driver.</li>
|
|
<li>Added clang driver support for MIPS N32 ABI through "-mabi=n32" option.</li>
|
|
<li>MIPS32 and MIPS64 disassembler has been implemented.</li>
|
|
<li>Support for compiling programs with large GOTs (exceeding 64kB in size) has been added
|
|
through llc option "-mxgot".</li>
|
|
<li>Added experimental support for MIPS32 DSP intrinsics.</li>
|
|
<li>Experimental support for MIPS16 with following limitations: only soft float is supported,
|
|
C++ exceptions are not supported, large stack frames (> 32000 bytes) are not supported,
|
|
direct object code emission is not supported only .s .</li>
|
|
<li>Standalone assembler (llvm-mc): implementation is in progress and considered experimental.</li>
|
|
<li>All classic JIT and MCJIT tests pass on Little and Big Endian MIPS32 platforms.</li>
|
|
<li>Inline asm support: all common constraints and operand modifiers have been implemented.</li>
|
|
<li>Added tail call optimization support, use llc option "-enable-mips-tail-calls"
|
|
or clang options "-mllvm -enable-mips-tail-calls"to enable it.</li>
|
|
<li>Improved register allocation by removing registers $fp, $gp, $ra and $at from the list of reserved registers.</li>
|
|
<li>Long branch expansion pass has been implemented, which expands branch
|
|
instructions with offsets that do not fit in the 16-bit field.</li>
|
|
<li>Cavium Octeon II board is used for testing builds (llvm-mips-linux builder).</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="PowerPC">PowerPC Target Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>Many fixes and changes across LLVM (and Clang) for better compliance with
|
|
the 64-bit PowerPC ELF Application Binary Interface, interoperability with
|
|
GCC, and overall 64-bit PowerPC support. Some highlights include:</p>
|
|
<ul>
|
|
<li> MCJIT support added.</li>
|
|
<li> PPC64 relocation support and (small code model) TOC handling
|
|
added.</li>
|
|
<li> Parameter passing and return value fixes (alignment issues,
|
|
padding, varargs support, proper register usage, odd-sized
|
|
structure support, float support, extension of return values
|
|
for i32 return values).</li>
|
|
<li> Fixes in spill and reload code for vector registers.</li>
|
|
<li> C++ exception handling enabled.</li>
|
|
<li> Changes to remediate double-rounding compatibility issues with
|
|
respect to GCC behavior.</li>
|
|
<li> Refactoring to disentangle ppc64-elf-linux ABI from Darwin
|
|
ppc64 ABI support.</li>
|
|
<li> Assorted new test cases and test case fixes (endian and word
|
|
size issues).</li>
|
|
<li> Fixes for big-endian codegen bugs, instruction encodings, and
|
|
instruction constraints.</li>
|
|
<li> Implemented -integrated-as support.</li>
|
|
<li> Additional support for Altivec compare operations.</li>
|
|
<li> IBM long double support.</li>
|
|
</ul>
|
|
<p>There have also been code generation improvements for both 32- and 64-bit
|
|
code. Instruction scheduling support for the Freescale e500mc and e5500
|
|
cores has been added.</p>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="NVPTX">PTX/NVPTX Target Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>The PTX back-end has been replaced by the NVPTX back-end, which is based on
|
|
the LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler.
|
|
Some highlights include:</p>
|
|
<ul>
|
|
<li>Compatibility with PTX 3.1 and SM 3.5</li>
|
|
<li>Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK</li>
|
|
<li>Full compatibility with old PTX back-end, with much greater coverage of
|
|
LLVM IR</li>
|
|
</ul>
|
|
|
|
<p>Please submit any back-end bugs to the LLVM Bugzilla site.</p>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="OtherTS">Other Target Specific Improvements</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<ul>
|
|
<li>Added support for custom names for library functions in TargetLibraryInfo.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="changes">Major Changes and Removed Features</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>If you're already an LLVM user or developer with out-of-tree changes based on
|
|
LLVM 3.2, this section lists some "gotchas" that you may run into upgrading
|
|
from the previous release.</p>
|
|
|
|
<ul>
|
|
<li>llvm-ld and llvm-stub have been removed, llvm-ld functionality can be partially replaced by
|
|
llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. </li>
|
|
<li>MCJIT: added support for inline assembly (requires asm parser), added faux remote target execution to lli option '-remote-mcjit'.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="api_changes">Internal API Changes</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>In addition, many APIs have changed in this release. Some of the major
|
|
LLVM API changes are:</p>
|
|
|
|
<p> We've added a new interface for allowing IR-level passes to access
|
|
target-specific information. A new IR-level pass, called
|
|
"TargetTransformInfo" provides a number of low-level interfaces.
|
|
LSR and LowerInvoke already use the new interface. </p>
|
|
|
|
<p> The TargetData structure has been renamed to DataLayout and moved to VMCore
|
|
to remove a dependency on Target. </p>
|
|
|
|
</div>
|
|
|
|
<!--=========================================================================-->
|
|
<h3>
|
|
<a name="tools_changes">Tools Changes</a>
|
|
</h3>
|
|
|
|
<div>
|
|
|
|
<p>In addition, some tools have changed in this release. Some of the changes are:</p>
|
|
|
|
<ul>
|
|
<li>opt: added support for '-mtriple' option.</li>
|
|
<li>llvm-mc : - added '-disassemble' support for '-show-inst' and '-show-encoding' options, added '-edis' option to produce annotated
|
|
disassembly output for X86 and ARM targets.</li>
|
|
<li>libprofile: allows the profile data file name to be specified by the LLVMPROF_OUTPUT environment variable.</li>
|
|
<li>llvm-objdump: has been changed to display available targets, '-arch' option accepts x86 and x86-64 as valid arch names.</li>
|
|
<li>llc and opt: added FMA formation from pairs of FADD + FMUL or FSUB + FMUL enabled by option '-enable-excess-fp-precision' or option '-enable-unsafe-fp-math',
|
|
option '-fp-contract' controls the creation by optimizations of fused FP by selecting Fast, Standard, or Strict mode.</li>
|
|
<li>llc: object file output from llc is no longer considered experimental.</li>
|
|
<li>gold plugin: handles Position Independent Executables.</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
|
|
<!-- *********************************************************************** -->
|
|
<h2>
|
|
<a name="knownproblems">Known Problems</a>
|
|
</h2>
|
|
<!-- *********************************************************************** -->
|
|
|
|
<div>
|
|
|
|
<p>LLVM is generally a production quality compiler, and is used by a broad range
|
|
of applications and shipping in many products. That said, not every
|
|
subsystem is as mature as the aggregate, particularly the more obscure
|
|
targets. If you run into a problem, please check
|
|
the <a href="http://llvm.org/bugs/">LLVM bug database</a> and submit a bug if
|
|
there isn't already one or ask on
|
|
the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev
|
|
list</a>.</p>
|
|
|
|
<p>Known problem areas include:</p>
|
|
|
|
<ul>
|
|
<li>The CellSPU, MSP430, and XCore backends are experimental, and the CellSPU backend will be removed in LLVM 3.3.</li>
|
|
|
|
<li>The integrated assembler, disassembler, and JIT is not supported by
|
|
several targets. If an integrated assembler is not supported, then a
|
|
system assembler is required. For more details, see the <a
|
|
href="CodeGenerator.html#targetfeatures">Target Features Matrix</a>.
|
|
</li>
|
|
</ul>
|
|
|
|
</div>
|
|
|
|
<!-- *********************************************************************** -->
|
|
<h2>
|
|
<a name="additionalinfo">Additional Information</a>
|
|
</h2>
|
|
<!-- *********************************************************************** -->
|
|
|
|
<div>
|
|
|
|
<p>A wide variety of additional information is available on
|
|
the <a href="http://llvm.org/">LLVM web page</a>, in particular in
|
|
the <a href="http://llvm.org/docs/">documentation</a> section. The web page
|
|
also contains versions of the API documentation which is up-to-date with the
|
|
Subversion version of the source code. You can access versions of these
|
|
documents specific to this release by going into the "<tt>llvm/doc/</tt>"
|
|
directory in the LLVM tree.</p>
|
|
|
|
<p>If you have any questions or comments about LLVM, please feel free to contact
|
|
us via the <a href="http://llvm.org/docs/#maillist"> mailing lists</a>.</p>
|
|
|
|
</div>
|
|
|
|
<!-- *********************************************************************** -->
|
|
|
|
<hr>
|
|
<address>
|
|
<a href="http://jigsaw.w3.org/css-validator/check/referer"><img
|
|
src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
|
|
<a href="http://validator.w3.org/check/referer"><img
|
|
src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
|
|
|
|
<a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
|
|
Last modified: $Date$
|
|
</address>
|
|
|
|
</body>
|
|
</html>
|