Commit Graph

5640 Commits

Author SHA1 Message Date
Nick Lewycky
34b96d1576 dbgs() << Instruction doesn't print a newline on the end any more. Update these
debug statements to add a missing newline. Also canonicalize to '\n' instead of
"\n"; the latter calls a function with a loop the former does not.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184897 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-26 00:30:18 +00:00
Bob Wilson
a1fe2948ed Fix SROA to avoid unnecessary scalar conversions for 1-element vectors.
When a 1-element vector alloca is promoted, a store instruction can often be
rewritten without converting the value to a scalar and using an insertelement
instruction to stuff it into the new alloca.  This patch just adds a check
to skip that conversion when it is unnecessary.  This turns out to be really
important for some ARM Neon operations where <1 x i64> is used to get around
the fact that i64 is not a legal type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184870 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-25 19:09:50 +00:00
Meador Inge
be87bce32b Remove the simplify-libcalls pass (finally)
This commit completely removes what is left of the simplify-libcalls
pass.  All of the functionality has now been migrated to the instcombine
and functionattrs passes.  The following C API functions are now NOPs:

  1. LLVMAddSimplifyLibCallsPass
  2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184459 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-20 19:48:07 +00:00
Bill Wendling
f9fd58a44b Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184352 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-19 21:07:11 +00:00
Matt Arsenault
ad966ea7a8 Move StructurizeCFG out of R600 to generic Transforms.
Register it with PassManager

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184343 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-19 20:18:24 +00:00
Quentin Colombet
5a2fb058d3 LSR: Fix the parameters used to compute the scaling factor cost.
Prior to this change, the considered addressing modes may be invalid since the
maximum and minimum offsets were not taking into account.
This was causing an assertion failure.

The added test case exercices that behavior.

<rdar://problem/14199725> Assertion failed: (CurScaleCost >= 0 && "Legal
addressing mode has an illegal cost!")


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184341 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-19 19:59:41 +00:00
Jakub Staszak
515971fdd7 Use 0 instead of NULL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184044 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-15 12:20:44 +00:00
Shuxin Yang
9792b646c6 Fix a potential bug in r183584.
r183584 tries to derive some info from the code *AFTER* a call and apply
these derived info to the code *BEFORE* the call, which is not always safe
as the call in question may never return, and in this case, the derived
info is invalid.
  
  Thank Duncan for pointing out this potential bug.

rdar://14073661 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183606 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-08 04:56:05 +00:00
Shuxin Yang
1c2b03aae9 Fix an assertion in MemCpyOpt pass.
The MemCpyOpt pass is capable of optimizing:
      callee(&S); copy N bytes from S to D.
    into:
      callee(&D);
subject to some legality constraints. 

  Assertion is triggered when the compiler tries to evalute "sizeof(typeof(D))",
while D is an opaque-typed, 'sret' formal argument of function being compiled.
i.e. the signature of the func being compiled is something like this:
  T caller(...,%opaque* noalias nocapture sret %D, ...)

  The fix is that when come across such situation, instead of calling some
utility functions to get the size of D's type (which will crash), we simply
assume D has at least N bytes as implified by the copy-instruction.

rdar://14073661 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183584 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-07 22:45:21 +00:00
David Majnemer
5a57dbef33 IndVarSimplify: check if loop invariant expansion can trap
IndVarSimplify is willing to move divide instructions outside of their
loop bodies if they are invariant of the loop.  However, it may not be
safe to expand them if we do not know if they can trap.

Instead, check to see if it is not safe to expand the instruction and
skip the expansion.

This fixes PR16041.

Testcase by Rafael Ávila de Espíndola.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183239 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-04 17:51:58 +00:00
Quentin Colombet
06f5ebc5a1 Loop Strength Reduce: Scaling factor cost.
Account for the cost of scaling factor in Loop Strength Reduce when rating the
formulae. This uses a target hook.

The default implementation of the hook is: if the addressing mode is legal, the
scaling factor is free.

<rdar://problem/13806271>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183045 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-31 21:29:03 +00:00
Quentin Colombet
5b00f4edcb Modify how the formulae are rated in Loop Strength Reduce.
Namely, check if the target allows to fold more that one register in the
addressing mode and if yes, adjust the cost accordingly.

Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred
to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2
needs a temporary register for the computation, whereas it was correctly
estimated for reg1 + scale * reg2.

<rdar://problem/13973908>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183021 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-31 17:20:29 +00:00
Michael J. Spencer
c6af2432c8 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182680 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 22:23:49 +00:00
Shuxin Yang
4b7b3a7c19 [GVN] Split critical-edge on the fly, instead of postpone edge-splitting to next
iteration.
  
  This on step toward non-iterative GVN. My local hack suggests that getting rid
of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++).
I cannot explain why not 2x or more at this moment.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181532 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-09 18:34:27 +00:00
Nick Lewycky
ae9f07e0b8 Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst
by switching to a ValueMap. Patch by Andrea DiBiagio!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181397 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-08 09:00:10 +00:00
Andrew Trick
fcf79528da Rotate multi-exit loops even if the latch was simplified.
Test case by Michele Scandale!

Fixes PR10293: Load not hoisted out of loop with multiple exits.

There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.

This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.

I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.

(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:

- We need a strong guarantee that we won't endlessly rotate, so the
  analysis would need to be precise in order to avoid the
  SimplifiedLoopLatch precondition.

- Analysis like this are usually based on SCEV, which we don't want to
  rely on.

(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181230 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-06 17:58:18 +00:00
Shuxin Yang
968d689ec3 Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No function change.
This function consists of following steps:
   1. Collect dependent memory accesses.
   2. Analyze availability.
   3. Perform fully redundancy elimination, or 
   4. Perform PRE, depending on the availability

 Step 2, 3 and 4 are now moved to three helper routines.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181047 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03 19:17:26 +00:00
Shuxin Yang
556dd3a9a9 [GV] Remove dead code which is really difficult to decipher.
Actually it took me couple of hours trying to make sense of them and
only to find they are dead code.  I guess the original author used
"allSingleSucc" to indicate if there are any critial edge emanating
from some blocks, and tried to perform code motion (actually speculation)
in the presence of these critical edges; but later on he/she changed mind
and decided to perform edge-splitting first.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180951 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-02 21:14:31 +00:00
Filip Pizlo
40be1e8566 This patch breaks up Wrap.h so that it does not have to include all of
the things, and renames it to CBindingWrapping.h.  I also moved 
CBindingWrapping.h into Support/.

This new file just contains the macros for defining different wrap/unwrap 
methods.

The calls to those macros, as well as any custom wrap/unwrap definitions 
(like for array of Values for example), are put into corresponding C++ 
headers.

Doing this required some #include surgery, since some .cpp files relied 
on the fact that including Wrap.h implicitly caused the inclusion of a 
bunch of other things.

This also now means that the C++ headers will include their corresponding 
C API headers; for example Value.h must include llvm-c/Core.h.  I think 
this is harmless, since the C API headers contain just external function 
declarations and some C types, so I don't believe there should be any 
nasty dependency issues here.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180881 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 20:59:00 +00:00
Nadav Rotem
fee6969463 SROA: Generate selects instead of shuffles when blending values because this is the cannonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180875 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 19:53:30 +00:00
Shuxin Yang
4d4c54d29f Fix a XOR reassociation bug.
When Reassociator optimize "(x | C1)" ^ "(X & C2)", it may swap the two
subexpressions, however, it forgot to swap cached constants (of C1 and C2)
accordingly.

rdar://13739160


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180676 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-27 18:02:12 +00:00
Eric Christopher
3e39731e88 Move C++ code out of the C headers and into either C++ headers
or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180063 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 22:47:22 +00:00
Rafael Espindola
cde25b435a Clarify that llvm.used can contain aliases.
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180019 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 14:58:02 +00:00
Benjamin Kramer
d81a0dee5b SROA: Don't crash on a select with two identical operands.
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179982 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 17:48:39 +00:00
Chris Lattner
77327fd652 Fix a comment, PR15777.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179775 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 17:42:14 +00:00
Jim Grosbach
467116a1c8 Fix a typo in comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179542 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 17:40:48 +00:00
Shuxin Yang
4fd00c55d0 Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate.
I brazenly think this change is slightly simpler than r178793 because: 
  - no "state" in functor
  - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" 

  While I can reproduce the probelm in Valgrind, it is rather difficult to come up
a standalone testing case. The reason is that when an iterator is invalidated,
the stale invalidated elements are not yet clobbered by nonsense data, so the
optimizer can still proceed successfully. 

  Thank Benjamin for fixing this bug and generously providing the test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179062 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-08 22:00:43 +00:00
Chandler Carruth
05c7e7f99d Fix PR15674 (and PR15603): a SROA think-o.
The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.

All of the existing tests continue to pass, and I've added a test from
the PR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178974 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 11:47:54 +00:00
Shuxin Yang
2da70d1792 Disable the optimization about promoting vector-element-access with symbolic index.
This optimization is unstable at this moment; it 
  1) block us on a very important application
  2) PR15200
  3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
     (the CHECK command compare the output against wrong result)

   I personally believe this optimization should not have any impact on the
autovectorized code, as auto-vectorizer is supposed to put gather/scatter
in a "right" way.  Although in theory downstream optimizaters might reveal 
some gather/scatter optimization opportunities, the chance is quite slim.

   For the hand-crafted vectorizing code, in term of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result,
but in much simpler way. On the other hand, these optimizers are able to 
improve the code in a incremental way; in contrast, SROA is sort of all-or-none
approach. However, SROA might slighly win in stack size, as it tries to figure 
out a stretch of memory tightenly cover the area accessed by the dynamic index.

 rdar://13174884
 PR15200



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178912 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 21:07:08 +00:00
Benjamin Kramer
ad2e252865 Reassociate: Avoid iterator invalidation.
OpndPtrs stored pointers into the Opnd vector that became invalid when the
vector grows. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178793 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 21:15:42 +00:00
Shuxin Yang
ad26993e1a Correct assertion condition
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178484 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 18:13:05 +00:00
Shuxin Yang
2d10010649 Implement XOR reassociation. It is based on following rules:
rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2),
     only useful when c1=c2
  rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
  rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
  rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2

 It reduces an application's size (in terms of # of instructions) by 8.9%.
 Reviwed by Pete Cooper. Thanks a lot!

 rdar://13212115  


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178409 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30 02:15:01 +00:00
Jakub Staszak
6f7becfe23 Minor cleanups. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177837 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-24 09:56:28 +00:00
Jakub Staszak
65a47ff554 Use dyn_cast instead of isa && cast.
No functionality change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177836 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-24 09:25:47 +00:00
Chandler Carruth
d647ec5239 [SROA] Prefix names using a custom IRBuilder inserter.
The key part of this is ensuring that name prefixes remain in a Twine
form until we get to a point where we can nuke them under NDEBUG. This
is tricky using the old APIs as they played fast and loose with Twine,
which is prone to serious error. The inserter is much cleaner as it is
actually in the call stack leading to the setName call, and so has
a good opportunity to prepend the prefix.

This matters more than you might imagine because most runs over an
alloca find a single partition, and rewrite 3 or 4 instructions
referring to it. As a consequence doing this lazily and exclusively with
Twine allows the optimizer to delete more of it and shaves another 2% to
3% off of the release build's SROA run time for PR15412. I also think
the APIs are cleaner, and the use of Twine is more reliable, so
I consider it a win-win despite the churn required to reach this state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177631 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21 09:52:18 +00:00
Meador Inge
a2c6256b2a simplify-libcalls: Removed unused variable
The 'Modified' variable should have been removed from SimplifyLibCalls
in r177619, but was missed.  This commit removes it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177622 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21 02:44:07 +00:00
Meador Inge
cf47ce616c Move library call prototype attribute inference to functionattrs
The simplify-libcalls pass implemented a doInitialization hook to infer
function prototype attributes for well-known functions.  Given that the
simplify-libcalls pass is going away *and* that the functionattrs pass
is already in place to deduce function attributes, I am moving this logic
to the functionattrs pass.  This approach was discussed during patch
review:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177619 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21 00:55:59 +00:00
Chandler Carruth
fd060a94a4 Fix a silly search-and-replace goof with r177495 that only broke
non-release builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177498 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 07:40:56 +00:00
Chandler Carruth
05c6d0b16f [SROA] Don't preserve the IR names in release builds.
This is espcially important because the new SROA pass goes to great
lengths to provide helpful names for debugging, and as a consequence
they can become very slow to render.

Good for between 5% and 15% of the SROA runtime on some slow test cases
such as the one in PR15412.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177495 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 07:30:36 +00:00
Chandler Carruth
30ee9c2093 Move the endif to the correct line so we don't have warnings about
unused statistics variables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177494 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 06:47:00 +00:00
Chandler Carruth
f2b649da0f Introduce some new statistics to help track the exact behavior of the
new SROA pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177493 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 06:30:46 +00:00
Quentin Colombet
9deb91722c Update global merge pass according to Duncan's advices:
- Remove useless includes
- Change misleading comments
- Move code into doFinalization


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177445 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 21:46:49 +00:00
Arnaud A. de Grandmaison
eb9a42e8ab IndVarSimplify: do not recompute an IV value outside of the loop if :
- it is trivially known to be used inside the loop in a way that can not be optimized away
- there is no use outside of the loop which can take advantage of the computation hoisting

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177432 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 20:00:22 +00:00
Andrew Trick
d37c8568e6 Revert "Cleanup some SCEV logic a bit."
This reverts commit 82cd8f7382.

Just add a comment instead!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177377 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 05:10:27 +00:00
Andrew Trick
82cd8f7382 Cleanup some SCEV logic a bit.
Make the code more obvious to scan-build and humans.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177375 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 04:14:59 +00:00
Andrew Trick
4b02729558 Tighten up an internal LSR API that should check for NULL.
No test case, but should fix a scan_build warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177374 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 04:14:57 +00:00
Jakub Staszak
89c4dc6339 Make method private. Keep coding standard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177348 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-18 23:31:30 +00:00
Quentin Colombet
e572809aa1 Extend global merge pass to optionally consider global constant variables.
Also add some checks to not merge globals used within landing pad instructions or marked as "used".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177331 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-18 22:30:07 +00:00
Chandler Carruth
5e8da1773c Mark internal classes as POD-like to get better behavior out of
SmallVector and DenseMap.

This speeds up SROA by 25% on PR15412.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177259 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-18 08:36:46 +00:00
Chandler Carruth
41b55f5556 PR14972: SROA vs. GVN exposed a really bad bug in SROA.
The fundamental problem is that SROA didn't allow for overly wide loads
where the bits past the end of the alloca were masked away and the load
was sufficiently aligned to ensure there is no risk of page fault, or
other trapping behavior. With such widened loads, SROA would delete the
load entirely rather than clamping it to the size of the alloca in order
to allow mem2reg to fire. This was exposed by a test case that neatly
arranged for GVN to run first, widening certain loads, followed by an
inline step, and then SROA which miscompiles the code. However, I see no
reason why this hasn't been plaguing us in other contexts. It seems
deeply broken.

Diagnosing all of the above took all of 10 minutes of debugging. The
really annoying aspect is that fixing this completely breaks the pass.
;] There was an implicit reliance on the fact that no loads or stores
extended past the alloca once we decided to rewrite them in the final
stage of SROA. This was used to encode information about whether the
loads and stores had been split across multiple partitions of the
original alloca. That required threading explicit tracking of whether
a *use* of a partition is split across multiple partitions.

Once that was done, another problem arose: we allowed splitting of
integer loads and stores iff they were loads and stores to the entire
alloca. This is a really arbitrary limitation, and splitting at least
some integer loads and stores is crucial to maximize promotion
opportunities. My first attempt was to start removing the restriction
entirely, but currently that does Very Bad Things by causing *many*
common alloca patterns to be fully decomposed into i8 operations and
lots of or-ing together to produce larger integers on demand. The code
bloat is terrifying. That is still the right end-goal, but substantial
work must be done to either merge partitions or ensure that small i8
values are eagerly merged in some other pass. Sadly, figuring all this
out took essentially all the time and effort here.

So the end result is that we allow splitting only when the load or store
at least covers the alloca. That ensures widened loads and stores don't
hurt SROA, and that we don't rampantly decompose operations more than we
have previously.

All of this was already fairly well tested, and so I've just updated the
tests to cover the wide load behavior. I can add a test that crafts the
pass ordering magic which caused the original PR, but that seems really
brittle and to provide little benefit. The fundamental problem is that
widened loads should Just Work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177055 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-14 11:32:24 +00:00