checkpoint, don't expect this to read right yet. :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115426 91177308-0d34-0410-b5e6-96231b3b80d8
2025-01-30 04:35:00 +00:00 · 2010-10-02 21:59:30 +00:00 · 2010-10-02 21:59:30 +00:00 · 7d9b6b439a
commit 7d9b6b439a
parent d476914607
1 changed files with 115 additions and 96 deletions
--- a/docs/ReleaseNotes.html
+++ b/docs/ReleaseNotes.html
@@ -67,7 +67,6 @@ current one.  To see the release notes for a specific release, please see the
 Almost dead code.
  include/llvm/Analysis/LiveValues.h => Dan
  lib/Transforms/IPO/MergeFunctions.cpp => consider for 2.8.
  llvm/Analysis/PointerTracking.h => Edwin wants this, consider for 2.8.
  GEPSplitterPass
 -->
@@ -82,79 +81,6 @@ Almost dead code.
 <!-- Announcement, lldb, libc++ -->
 <!-- to write:
  MachineCSE tuned and on by default.
  llvm.dbg.value: variable debug info for optimized code
  MC Assembler backend is now real, does relaxation and is bitwise identical
    with darwin assembler in huge majority of all cases.
  new GHC calling convention
  New half float intrinsics LangRef.html#int_fp16
  Rewrote tblgen's type inference for backends to be more consistent and
     diagnose more target bugs.  This also allows limited support for writing
     patterns for instructions that return multiple results, e.g. a virtual
     register and a flag result.  Stuff that used 'parallel' before should use
     this.
  New ARM/Thumb disassembler support in MC.
  New SSEDomainFix pass: 
    On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a
    register in a different domain than where it was defined. Some instructions
    have equivalents for different domains, like por/orps/orpd.  The
    SSEDomainFix pass tries to minimize the number of domain crossings by
    changing between equivalent opcodes where possible.
  Support for the Intel AES instructions in the assembler.
  memcpy, memmove, and memset now take address space qualified pointers + volatile.
  per-instruction debug info metadata is much faster and uses less space (new DebugLoc class).
  -ffunction-sections and -fdata-sections are supported on ELF targets.
  Now iterate function passes when a cgsccpassmanager detects a devirtualization
  -momit-leaf-frame-pointer now supported.
  New -regalloc=fast,  =local got removed
  New -regalloc=default option that chooses a register allocator based on the -O optimization level.
  New "trap values" concept: http://llvm.org/docs/LangRef.html#trapvalues
  Improved trip count analysis for <= and >= loops, and uses sign overflow info.
  REMOVED: SCCVN pass.
  X86 backend attempts to promote 16-bit integer operations to 32-bits to avoid
     0x66 prefixes, which are slow on some microarchitectures and bloat the code
     on others.
  X87 fp stackifier is global!
  LTO debug info support?
  NEON: Better performance for QQQQ (4-consecutive Q register) instructions.  New reg sequence abstraction?
  New support for X86 "thiscall" calling convention (x86_thiscallcc in IR).
  ARM: Better scheduling (list-hybrid, hybrid?)
  New SubRegIndex tblgen class for targets -> jakob
  ARM: Tail call support.
  AVX support in the MC assembler.  Full compiler support not done yet.
  Atomics now get legalized when not natively supported (jim g)
  ARM: General performance work and tuning.
  Bottom up fast isel.  Simple Load reuse.  No more machinedce.  Load folding at -O0?
  New linker_private_weak and linker_private_weak_def_auto linkage types
  compiler_rt softfloat support.
  X86 ABI:  <2 x float> in IR no longer maps onto MMX, it turns into <4 x float>
  IR ABI: <3 x float> is passed as <4 x float> instead of 3 floats.
  renamed "Release" -> "Release+Asserts"; "Release-Asserts" -> "Release", etc.
  New COPY instruction. copyRegToReg -> copyPhysReg, isMoveInstr is gone.
  JumpThreading much more aggressive about implied value relations.
  New RegionInfo pass  "opt -regions analyze" or "opt -view-regions".
  mc assembler supports macros.
  RenderMachineFunction: -rendermf
  SplitKit?
  Evan: Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
  Evan: Add an ILP scheduler.  On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%.
  RegisterPass<> -> INITIALIZE_PASS()
  llvm-diff?
  Preliminary work on TBAA but not usable in 2.8.
  Atomic lowering patch: -loweratomic (see Passes.html#loweratomic)
  compiler_rt now includes a fairly extensive testsuite for the blocks language feature and the blocks runtime.
  New OptimizeExts+OptimizeCmps -> PeepholeOptimizer pass
  Triples are now stored in normalized form.  Triple::normalize.
  New LocalStackSlotAllocation.cpp pass (jimg)
  New llvm.x86.int intrinsic (for int $42 and int3)
  New CorrelatedValuePropagation pass, not on by default in 2.8 yet.
  Verbose assembly decodes X86 shuffle instructions, e.g.:
  	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
 	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
 	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]
 -->
 <!-- *********************************************************************** -->
 <div class="doc_section">
@@ -253,10 +179,10 @@ libgcc routines).</p>
 <p>
 All of the code in the compiler-rt project is available under the standard LLVM
-License, a "BSD-style" license.  New in LLVM 2.8: 
+License, a "BSD-style" license.  New in LLVM 2.8, compiler_rt now supports 
-
+soft floating point (for targets that don't have a real floating point unit),
-Soft float support
+and includes an extensive testsuite for the "blocks" language feature and the
-</p>
+blocks runtime included in compiler_rt.</p>
 </div>
@@ -526,10 +452,6 @@ organization changes have happened:
 <p>LLVM 2.8 includes several major new capabilities:</p>
 <ul>
 <li>atomic lowering pass.</li>
 <li>RegionInfo pass: "opt -regions analyze" or "opt -view-regions".
 <!-- Tobias Grosser --></li>
 <li>ARMGlobalMerge: <!-- Anton --> </li>
 <li>llvm-diff</li>
 </ul>
@@ -546,6 +468,13 @@ expose new optimization opportunities:</p>
 <ul>
  memcpy, memmove, and memset now take address space qualified pointers + volatile.
  per-instruction debug info metadata is much faster and uses less space (new DebugLoc class).
  New "trap values" concept: http://llvm.org/docs/LangRef.html#trapvalues
  New linker_private_weak and linker_private_weak_def_auto linkage types
  Triples are now stored in normalized form.  Triple::normalize.
 <li>LLVM 2.8 changes the internal order of operands in <a
  href="http://llvm.org/doxygen/classllvm_1_1InvokeInst.html"><tt>InvokeInst</tt></a>
  and <a href="http://llvm.org/doxygen/classllvm_1_1CallInst.html"><tt>CallInst</tt></a>.
@@ -612,6 +541,14 @@ release includes a few major enhancements and additions to the optimizers:</p>
 <ul>
 <li></li>
  Preliminary work on TBAA but not usable in 2.8.
  New CorrelatedValuePropagation pass, not on by default in 2.8 yet.
  JumpThreading much more aggressive about implied value relations.
  New RegionInfo pass  "opt -regions analyze" or "opt -view-regions".
  Improved trip count analysis for <= and >= loops, and uses sign overflow info.
  llvm.dbg.value: variable debug info for optimized code
  Now iterate function passes when a cgsccpassmanager detects a devirtualization
  Atomic lowering patch: -loweratomic (see Passes.html#loweratomic)
 </ul>
@@ -639,22 +576,38 @@ release includes a few major enhancements and additions to the optimizers:</p>
 <div class="doc_text">
 <p>
-FIXME: Rewrite.
+The LLVM Machine Code (aka MC) subsystem was created to solve a number
 The LLVM Machine Code (aka MC) sub-project of LLVM was created to solve a number
 of problems in the realm of assembly, disassembly, object file format handling,
 and a number of other related areas that CPU instruction-set level tools work
-in. It is a sub-project of LLVM which provides it with a number of advantages
+in.</p>
-over other compilers that do not have tightly integrated assembly-level tools.
+
-For a gentle introduction, please see the <a
+<p>The MC subproject has made great leaps in LLVM 2.8.  For example, support for
   directly writing .o files from LLC (and clang) now works reliably for
   darwin/x86[-64] (including inline assembly support) and the integrated
   assembler is turned on by default in Clang for these targets.  This provides
   improved compile times among other things.</p>
 <ul>
 <li>The entire compiler has converted over to using the MCStreamer assembler API
    instead of writing out a .s file textually.</li>
 <li>The "assembler parser" is far more mature than in 2.7: it supports a full
    complement of directives, assembler macros, etc.</li>
 <li>The "assembler backend" has been completed, including support for relaxation,
    relocation processing and all the other things that an assembler does.</li>
 <li>The MachO file format support is now fully functional.</li>
 <li>The MC disassembler now fully supports ARM and Thumb.  ARM assembler support
    is still in early development though.</li>
 <li>The X86 MC assembler now supports the X86 AES and AVX instruction sets.</li>
 <li>Work on ELF and COFF support is well underway, but isn't useful yet in LLVM
    2.8.  Please contact the llvmdev mailing list if you're interested in
    this.</li>
 </ul>
 <p>For more information, please see the <a
 href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro to the
 LLVM MC Project Blog Post</a>.
 </p>
 <p>2.8 status here.  Basic correctness, some obscure missing instructions on
   mainline, on by default in clang.
   Entire compiler backend converted to use mcstreamer.
   </p>
 </div>	
@@ -671,7 +624,36 @@ infrastructure, which allows us to implement more aggressive algorithms and make
 it run faster:</p>
 <ul>
-<li>MachO writer works.</li>
+<li></li>
  MachineCSE tuned and on by default.
  Rewrote tblgen's type inference for backends to be more consistent and
     diagnose more target bugs.  This also allows limited support for writing
     patterns for instructions that return multiple results, e.g. a virtual
     register and a flag result.  Stuff that used 'parallel' before should use
     this.
  New -regalloc=fast,  =local got removed
  New -regalloc=default option that chooses a register allocator based on the -O optimization level.
  New SubRegIndex tblgen class for targets -> jakob
  Bottom up fast isel.  Simple Load reuse.  No more machinedce.
  IR ABI: <3 x float> is passed as <4 x float> instead of 3 floats.
  New COPY instruction. copyRegToReg -> copyPhysReg, isMoveInstr is gone.
  RenderMachineFunction: -rendermf
  SplitKit?
  Evan: Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
  Evan: Add an ILP scheduler.  On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%.
  New OptimizeExts+OptimizeCmps -> PeepholeOptimizer pass
  New LocalStackSlotAllocation.cpp pass (jimg)
  Atomics now get legalized when not natively supported (jim g)
  -ffunction-sections and -fdata-sections are supported on ELF targets.
  -momit-leaf-frame-pointer now supported.
 </ul>
 </div>
@@ -689,6 +671,30 @@ it run faster:</p>
    in registers across basic blocks, dramatically improving performance of code
    that uses long double, and when targeting CPUs that don't support SSE.</li>
  New SSEDomainFix pass: 
    On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a
    register in a different domain than where it was defined. Some instructions
    have equivalents for different domains, like por/orps/orpd.  The
    SSEDomainFix pass tries to minimize the number of domain crossings by
    changing between equivalent opcodes where possible.
  X86 backend attempts to promote 16-bit integer operations to 32-bits to avoid
     0x66 prefixes, which are slow on some microarchitectures and bloat the code
     on others.
  New support for X86 "thiscall" calling convention (x86_thiscallcc in IR) for windows.
  New llvm.x86.int intrinsic (for int $42 and int3)
  Verbose assembly decodes X86 shuffle instructions, e.g.:
  	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
 	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
 	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]
  X86 ABI:  <2 x float> in IR no longer maps onto MMX, it turns into <4 x float>
  new GHC calling convention
 </ul>
 </div>
@@ -704,6 +710,14 @@ it run faster:</p>
 <ul>
  NEON: Better performance for QQQQ (4-consecutive Q register) instructions.  New reg sequence abstraction?
  ARM: Better scheduling (list-hybrid, hybrid?)
  ARM: Tail call support.
  ARM: General performance work and tuning.
  ARM: Half float support through intrinsics LangRef.html#int_fp16
 <li>ARMGlobalMerge: <!-- Anton --> </li>
 <li>
  All of the NEON load and store intrinsics (llvm.arm.neon.vld* and
  llvm.arm.neon.vst*) take an extra parameter to specify the alignment in bytes
@@ -795,17 +809,22 @@ it run faster:</p>
 on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading
 from the previous release.</p>
  renamed "Release" -> "Release+Asserts"; "Release-Asserts" -> "Release", etc.
  RegisterPass<> -> INITIALIZE_PASS()
 <ul>
 <li>.ll files no longer contain #uses comments; to get them, run a .bc file
   through "llvm-dis --show-annotations".</li>
 <li>MSIL Backend removed.</li>
 <li>ABCD and SSI passes removed.</li>
 <li>'Union' LLVM IR feature removed.</li>
 <li>SCCVN pass removed.</li>
 </ul>
 <p>In addition, many APIs have changed in this release.  Some of the major LLVM
 API changes are:</p>
 <ul>
 </ul>
@@ -844,8 +863,8 @@ href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev list</a>.</p>
 <ul>
 <li>The Alpha, SPU, MIPS, PIC16, Blackfin, MSP430, SystemZ and MicroBlaze
    backends are experimental.</li>
-<li><tt>llc</tt> "<tt>-filetype=asm</tt>" (the default) is the only
+<li><tt>llc</tt> "<tt>-filetype=obj</tt>" is experimental on all targets
-    supported value for this option.  XXX Update me</li>
+    other than darwin-i386 and darwin-x86_64.</li>
 </ul>
 </div>
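
Reader's note on the SSEDomainFix entry in the X86 hunk above: the following is a minimal C++ sketch of the idea it describes, not the pass's actual implementation. It assumes a single straight-line dependency chain, and the names `Domain`, `Inst`, `countCrossings` and `fixDomains` are hypothetical. The greedy rule used here (an instruction with equivalents in other domains adopts its predecessor's domain) is a simplification chosen for illustration and is not claimed to be optimal.

```cpp
#include <cstddef>
#include <vector>

// Execution domains an SSE instruction can belong to.
enum class Domain { Int, Single, Double };

// Hypothetical, simplified instruction record: its current domain, and
// whether an equivalent opcode exists in the other domains
// (as with por/orps/orpd).
struct Inst {
    Domain domain;
    bool flexible;
};

// Count adjacent pairs whose domains differ, treating the sequence as one
// dependency chain where each instruction consumes its predecessor's result.
std::size_t countCrossings(const std::vector<Inst>& chain) {
    std::size_t crossings = 0;
    for (std::size_t i = 1; i < chain.size(); ++i)
        if (chain[i].domain != chain[i - 1].domain)
            ++crossings;
    return crossings;
}

// Greedy forward pass: a flexible instruction switches to the domain of the
// instruction that feeds it. Fixed instructions are left untouched.
void fixDomains(std::vector<Inst>& chain) {
    for (std::size_t i = 1; i < chain.size(); ++i)
        if (chain[i].flexible)
            chain[i].domain = chain[i - 1].domain;
}
```

For a chain made of a single-precision operation, a `por`, and another single-precision operation, this pass would move the `por` into the single-precision domain (its `orps` form), removing both domain crossings.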