llvm-6502/lib
Chandler Carruth ab1f4ef9a2 Add some optional passes around the vectorizer to both better prepare
the IR going into it and to clean up the IR produced by the vectorizers.

Note that these are *off by default* right now while folks collect data
on whether the performance tradeoff is reasonable.

In a build of the 'opt' binary, I see about 2% compile time regression
due to this change on average. This is in my mind essentially the worst
expected case: very little of the opt binary is going to *benefit* from
these extra passes.

I've seen several benchmarks improve in performance by small amounts due
to running these passes, and there are certain (rare) cases where these
passes make a huge difference by either enabling the vectorizer at all
or by hoisting runtime checks out of the outer loop. My primary
motivation is to prevent people from seeing runtime check overhead in
benchmarks where the existing passes and optimizers would be able to
eliminate that.

I've chosen the sequence of passes based on the kinds of things that
seem likely to be relevant for the code at each stage: rotating loops for
the vectorizer, finding correlated values, loop invariants, and
unswitching opportunities from any runtime checks, and cleaning up
commonalities exposed by the SLP vectorizer.

I'll be pinging existing threads where some of these issues have come up
and will start new threads to get folks to benchmark and collect data on
whether this is the right tradeoff or we should do something else.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 00:31:29 +00:00
..
Analysis [modules] Stop excluding Support/Debug.h from the Support module. This header 2014-10-13 00:41:03 +00:00
AsmParser Make CallingConv::ID an alias of "unsigned". 2014-09-10 18:00:17 +00:00
Bitcode Introduce LLVMWriteBitcodeToMemoryBuffer C API function. 2014-10-14 00:30:59 +00:00
CodeGen Migrate another set of getSubtargetImpl away. 2014-10-13 21:57:44 +00:00
DebugInfo Add couple of missing 'override' keyword. NFC. 2014-10-10 17:34:30 +00:00
ExecutionEngine [MCJIT] Replace memcpy with readBytesUnaligned in RuntimeDyldMachOI386. 2014-10-10 23:07:09 +00:00
IR Return undef on FP <-> Int conversions that overflow (PR21330). 2014-10-10 23:00:21 +00:00
IRReader Pass a && to getLazyBitcodeModule. 2014-09-03 17:31:46 +00:00
LineEditor
Linker Merge alignment of common GlobalValue. 2014-09-09 17:48:18 +00:00
LTO LTO: Document the Boolean argument from r218784 2014-10-02 21:11:04 +00:00
MC MC: Shrink MCSymbolRefExpr by only storing the bits we need. 2014-10-11 17:57:27 +00:00
Object Object, COFF: Move the VirtualSize/SizeOfRawData logic to getSectionSize 2014-10-09 08:42:31 +00:00
Option Add an overload of getLastArgNoClaim taking two OptSpecifiers. 2014-09-12 19:42:53 +00:00
ProfileData Reduce double set lookups. NFC. 2014-10-10 15:32:50 +00:00
Support Removing the static destructor from ManagedStatic.cpp by controlling the allocation and de-allocation of the mutex. 2014-10-13 22:37:25 +00:00
TableGen Eliminate some deep std::vector copies. NFC. 2014-10-03 18:33:16 +00:00
Target Make first of several changes to bring up to AArch64 fast-isel style 2014-10-13 21:46:41 +00:00
Transforms Add some optional passes around the vectorizer to both better prepare 2014-10-14 00:31:29 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile