mirror of https://github.com/c64scene-ar/llvm-6502.git (synced 2025-01-19 04:32:19 +00:00)
checkpoint
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115494 91177308-0d34-0410-b5e6-96231b3b80d8
parent 11b66112e0
commit 3bdcda1a8b
@@ -645,21 +645,6 @@ release includes a few major enhancements and additions to the optimizers:</p>

</div>


<!--=========================================================================-->
<div class="doc_subsection">
<a name="executionengine">Interpreter and JIT Improvements</a>
</div>

<div class="doc_text">

<ul>
<li></li>

</ul>

</div>

<!--=========================================================================-->
<div class="doc_subsection">
<a name="mc">MC Level Improvements</a>
@@ -689,9 +674,9 @@ in.</p>
<li>The MC disassembler now fully supports ARM and Thumb. ARM assembler support
is still in early development though.</li>
<li>The X86 MC assembler now supports the X86 AES and AVX instruction sets.</li>
<li>Work on ELF and COFF support is well underway, but isn't useful yet in LLVM
2.8. Please contact the llvmdev mailing list if you're interested in
this.</li>
<li>Work on ELF and COFF object files and ARM target support is well underway,
but isn't useful yet in LLVM 2.8. Please contact the llvmdev mailing list
if you're interested in this.</li>
</ul>

<p>For more information, please see the <a
@@ -702,7 +687,6 @@ LLVM MC Project Blog Post</a>.
</div>



<!--=========================================================================-->
<div class="doc_subsection">
<a name="codegen">Target Independent Code Generator Improvements</a>
@@ -715,35 +699,57 @@ infrastructure, which allows us to implement more aggressive algorithms and make
it run faster:</p>

<ul>
<li></li>
<li>The clang/gcc -momit-leaf-frame-pointer argument is now supported.</li>
<li>The clang/gcc -ffunction-sections and -fdata-sections arguments are now
supported on ELF targets (like GCC).</li>
<li>The MachineCSE pass is now tuned and on by default. It eliminates common
subexpressions that are exposed when lowering to machine instructions.</li>
<li>The "local" register allocator was replaced by a new "fast" register
allocator. This new allocator (which is often used at -O0) is substantially
faster and produces better code than the old local register allocator.</li>
<li>A new LLC "-regalloc=default" option is available, which automatically
chooses a register allocator based on the -O optimization level.</li>
<li>The common code generator code was modified to promote illegal argument and
return value vectors to wider ones when possible instead of scalarizing
them. For example, &lt;3 x float&gt; is now passed in one SSE register
instead of 3 on X86. This generates substantially better code since the
rest of the code generator was already expecting this.</li>
<li>The code generator uses a new "COPY" machine instruction. This speeds up
the code generator and eliminates the need for targets to implement the
isMoveInstr hook. Also, the copyRegToReg hook was renamed to copyPhysReg
and simplified.</li>
<li>The code generator now has a "LocalStackSlotPass", which optimizes stack
slot access for targets (like ARM) that have limited stack displacement
addressing.</li>
<li>A new "PeepholeOptimizer" is available, which eliminates sign and zero
extends, and optimizes away compare instructions when the condition result
is available from a previous instruction.</li>
<li>Atomic operations now get legalized into simpler atomic operations if not
natively supported, easing the implementation burden on targets.</li>
<li>The bottom-up pre-allocation scheduler is now register pressure aware,
allowing it to avoid overscheduling in high pressure situations while still
aggressively scheduling when registers are available.</li>
<li>A new instruction-level-parallelism pre-allocation scheduler is available,
which is also register pressure aware. This scheduler has shown substantial
wins on X86-64 and is on by default.</li>
<li>The tblgen type inference algorithm was rewritten to be more consistent and
diagnose more target bugs. If you have an out-of-tree backend, you may
find that it exposes bugs in your target description. This rewrite also
allows limited support for writing patterns for instructions that return
multiple results (e.g. a virtual register and a flag result). The
'parallel' modifier in tblgen was removed; you should use the new support
for multiple results instead.</li>
<li>A new (experimental) "-rendermf" pass is available which renders a
MachineFunction into HTML, showing live ranges and other useful
details.</li>

MachineCSE tuned and on by default.
<!--New SubRegIndex tblgen class for targets -> jakob -->
<!-- SplitKit -->

Rewrote tblgen's type inference for backends to be more consistent and
diagnose more target bugs. This also allows limited support for writing
patterns for instructions that return multiple results, e.g. a virtual
register and a flag result. Stuff that used 'parallel' before should use
this.

New -regalloc=fast, =local got removed
New -regalloc=default option that chooses a register allocator based on the -O optimization level.
New SubRegIndex tblgen class for targets -> jakob

Bottom up fast isel. Simple Load reuse. No more machinedce.
IR ABI: <3 x float> is passed as <4 x float> instead of 3 floats.

New COPY instruction. copyRegToReg -> copyPhysReg, isMoveInstr is gone.
RenderMachineFunction: -rendermf
SplitKit?
Evan: Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
Evan: Add an ILP scheduler. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%.

New OptimizeExts+OptimizeCmps -> PeepholeOptimizer pass
New LocalStackSlotAllocation.cpp pass (jimg)
Atomics now get legalized when not natively supported (jim g)

-ffunction-sections and -fdata-sections are supported on ELF targets.
-momit-leaf-frame-pointer now supported.
<li>The -fast-isel instruction selection path (used at -O0 on X86) was rewritten
to work bottom-up on basic blocks instead of top-down. This makes it
slightly faster (because the MachineDCE pass is no longer needed) and
allows it to generate better code in some cases.</li>

</ul>
</div>
@@ -860,24 +866,6 @@ it run faster:</p>
</ul>
</div>

<!--=========================================================================-->
<div class="doc_subsection">
<a name="newapis">New Useful APIs</a>
</div>

<div class="doc_text">

<p>This release includes a number of new APIs that are used internally, which
may also be useful for external clients.
</p>

<ul>
<li></li>
</ul>


</div>

<!--=========================================================================-->
<div class="doc_subsection">
<a name="otherimprovements">Other Improvements and New Features</a>