Commit Graph

6459 Commits

Author SHA1 Message Date
Olivier Sallenave
5f235de01b Check interleaving without relying on debug output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229027 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 02:13:57 +00:00
Michael Zolotukhin
cd35fcecc2 Testcase for r228988.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 00:35:45 +00:00
NAKAMURA Takumi
37fc1833a8 llvm/test/Transforms/LoopVectorize/PowerPC/small-loop-rdx.ll REQUIRES +Asserts due to -debug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228989 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 00:21:34 +00:00
Olivier Sallenave
90e069dc29 Change max interleave factor to 12 for POWER7 and POWER8.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228973 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 22:57:58 +00:00
Bjorn Steinbrink
53a7b568b2 Fix a crash in the assumption cache when inlining indirect function calls
Summary:
Instances of the AssumptionCache are per function, so we can't re-use
the same AssumptionCache instance when recursing in the CallAnalyzer to
analyze a different function. Instead we have to pass the
AssumptionCacheTracker to the CallAnalyzer so it can get the right
AssumptionCache on demand.

Reviewers: hfinkel

Subscribers: llvm-commits, hans

Differential Revision: http://reviews.llvm.org/D7533

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228957 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 21:04:22 +00:00
Benjamin Kramer
82f9916923 Update test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228956 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 20:40:19 +00:00
Benjamin Kramer
d038a7fe67 InstCombine: Allow folding of xor into icmp by changing the predicate for vectors
The loop vectorizer can create this pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228954 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 20:26:46 +00:00
Michael Zolotukhin
5513166316 Add a testcase for r228432.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228951 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 19:57:24 +00:00
James Molloy
28a123abaf [LoopRerolling] Be more forgiving with instruction order.
We can't solve the full subgraph isomorphism problem. But we can
allow obvious cases, where for example two instructions of different
types are out of order. Due to them having different types/opcodes,
there is no ambiguity.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228931 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 15:54:14 +00:00
Andrea Di Biagio
44926033f6 [TTI] Teach the cost heuristic how to query TLI to check if a zext/trunc is 'free' for the target.
Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl
how to query TLI in order to get a more accurate cost for truncates and
zero-extends.

Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase
would have conservatively returned a 'default' TCC_Basic for all zero-extends,
and TCC_Free for truncates on native types.

This patch improves the heuristic so that we query TLI (if available) to get
more accurate answers. If TLI is available, then methods 'isZExtFree' and
'isTruncateFree' can be used to check if a zext/trunc is free for the target.

Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll.
With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz
immediately followed by a free zext/trunc.

Differential Revision: http://reviews.llvm.org/D7585


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228923 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 14:17:24 +00:00
Chandler Carruth
ffc97a6ae1 [slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228899 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 02:30:56 +00:00
Tim Northover
7f75841a73 DeadArgElim: aggregate Return assessment properly.
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228885 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 23:13:11 +00:00
Mehdi Amini
23af697ae6 Reassociate: cannot negate a INT_MIN value
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7286

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228872 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 19:54:44 +00:00
Andrea Di Biagio
f033db57e9 [TTI] Improved cost heuristic for cttz/ctlz calls.
This patch is a follow-up of r228826 (see code-review: D7506).

Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we 
have to fix the cost heuristic for intrinsic calls to cttz/ctlz.

This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl
queries TLI to check if a call to cttz/ctlz is cheap for the target.

Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86,
SimplifyCFG only speculates a call to cttz/ctlz if it is cheap.

Differential Revision: http://reviews.llvm.org/D7554


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228829 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 14:22:18 +00:00
James Molloy
4de471dd0a [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228826 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 12:15:41 +00:00
James Molloy
caba7561ae [LoopReroll] Introduce the concept of DAGRootSets.
A DAGRootSet models an induction variable being used in a rerollable
loop. For example:

   x[i*3+0] = y1
   x[i*3+1] = y2
   x[i*3+2] = y3

   Base instruction -> i*3
                    +---+----+
                   /    |     \
               ST[y1]  +1     +2  <-- Roots
                        |      |
                      ST[y2] ST[y3]

There may be multiple DAGRootSets, for example:

   x[i*2+0] = ...   (1)
   x[i*2+1] = ...   (1)
   x[i*2+4] = ...   (2)
   x[i*2+5] = ...   (2)
   x[(i+1234)*2+5678] = ... (3)
   x[(i+1234)*2+5679] = ... (3)

This concept is similar to the "Scale" member used previously, but allows
multiple independent sets of roots based off the same induction variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228821 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 09:19:47 +00:00
Reid Kleckner
7c5e0c9851 Fix invalid LLVM IR in PruneEH tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 02:06:47 +00:00
Reid Kleckner
690248bf52 Don't promote asynch EH invokes of nounwind functions to calls
If the landingpad of the invoke is using a personality function that
catches asynch exceptions, then it can catch a trap.

Also add some landingpads to invalid LLVM IR test cases that lack them.

Over-the-shoulder reviewed by David Majnemer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228782 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 01:23:16 +00:00
David Majnemer
78d0638594 EarlyCSE: Add check lines for test added in r228760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228761 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 23:11:02 +00:00
David Majnemer
0f8bd667a1 EarlyCSE: It isn't safe to CSE across synchronization boundaries
This fixes PR22514.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228760 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 23:09:43 +00:00
Tim Northover
b613f8842e DeadArgElim: arguments affect all returned sub-values by default.
Unless we meet an insertvalue on a path from some value to a return, that value
will be live if *any* of the return's components are live, so all of those
components must be added to the MaybeLiveUses.

Previously we were deleting arguments if sub-value 0 turned out to be dead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228731 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 19:49:18 +00:00
Michael Zolotukhin
261a3a361b Add a test case for new unrolling heuristics.
THe heuristics were added in r228265 and r228434.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228713 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 17:54:54 +00:00
Chandler Carruth
3e77df419d Revert r228556: InstCombine: propagate nonNull through assume
This commit isn't using the correct context, and is transfoming calls
that are operands to loads rather than calls that are operands to an
icmp feeding into an assume. I've replied on the original review thread
with a very reduced test case and some thoughts on how to rework this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 08:07:32 +00:00
Ramkumar Ramachandra
69a5c89128 PlaceSafepoints: modernize gc.result.* -> gc.result
Differential Revision: http://reviews.llvm.org/D7516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228625 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 23:00:40 +00:00
Philip Reames
d3f3d5f0d7 Introduce more tests for PlaceSafepoints
These tests the two optimizations for backedge insertion currently implemented and the split backedge flag which is currently off by default.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228617 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 22:10:15 +00:00
Philip Reames
2eace6ebc5 Minor test cleanup
a) add gc attribute
b) remove unused param



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228612 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 21:50:31 +00:00
Philip Reames
c016fa420f Add basic tests for PlaceSafepoints
This is just adding really simple tests which should have been part of the original submission.  When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission.  I fixed two such bugs in this submission.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 21:48:05 +00:00
Tim Northover
968ed6a5f0 DeadArgElim: fix mismatch in accounting of array return types.
Some parts of DeadArgElim were only considering the individual fields
of StructTypes separately, but others (where insertvalue &
extractvalue instructions occur) also looked into ArrayTypes.

This one is an actual bug; the mismatch can lead to an argument being
considered used by a return sub-value that isn't being tracked (and
hence is dead by default). It then gets incorrectly eliminated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228559 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 01:21:00 +00:00
Tim Northover
c4af8c9467 DeadArgElim: assess uses of entire return value aggregate.
Previously, a non-extractvalue use of an aggregate return value meant
the entire return was considered live (the algorithm gave up
entirely). This was correct, but conservative. It's better to actually
look at that Use, making the analysis results apply to all sub-values
under consideration.

E.g.

  %val = call { i32, i32 } @whatever()
  [...]
  ret { i32, i32 } %val

The return is using the entire aggregate (sub-values 0 and 1). We can
still simplify @whatever if we can prove that this return is itself
unused.

Also unifies the logic slightly between aggregate and non-aggregate
cases..

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228558 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 01:20:53 +00:00
Ramkumar Ramachandra
5439a80dca InstCombine: propagate nonNull through assume
Make assume (load (call|invoke) != null) set nonNull return attribute
for the call and invoke. Also include tests.

Differential Revision: http://reviews.llvm.org/D7107

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228556 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 01:13:13 +00:00
Bjorn Steinbrink
61a16d2a16 Correctly combine alias.scope metadata by a union instead of intersecting
Summary:
The alias.scope metadata represents sets of things an instruction might
alias with. When generically combining the metadata from two
instructions the result must be the union of the original sets, because
the new instruction might alias with anything any of the original
instructions aliased with.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228525 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-08 17:07:14 +00:00
Benjamin Kramer
a54b82a9fe ValueTracking: Make isBytewiseValue simpler and more powerful at the same time.
Turns out there is a simpler way of checking that all bytes in a word are equal
than binary decomposition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228503 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-07 19:29:02 +00:00
Bjorn Steinbrink
2dd5f23a1d Properly update AA metadata when performing call slot optimization
Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-07 17:54:36 +00:00
Matthias Braun
2f2dec87fb InstCombine: Combine select sequences into a single select
Normalize
select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b)
select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b)

This normal form may enable further combines on the And/Or and shortens
paths for the values. Many targets prefer the other but can go back
easily in CodeGen.

Differential Revision: http://reviews.llvm.org/D7399

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228409 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-06 17:49:36 +00:00
Michael Kuperstein
acd7b00be2 Teach isDereferenceablePointer() to look through bitcast constant expressions.
This fixes a LICM regression due to the new load+store pair canonicalization.

Differential Revision: http://reviews.llvm.org/D7411

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228284 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-05 09:15:37 +00:00
Cameron Esfahani
d02540a1d7 Value soft float calls as more expensive in the inliner.
Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation.  By default, it's not an expensive operation.  This keeps the default behavior the same as before.  The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point.

Reviewers: chandlerc, echristo

Reviewed By: echristo

Subscribers: t.p.northover, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228263 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-05 02:09:33 +00:00
Tom Stellard
7c038bc15f StructurizeCFG: Use a reverse post-order traversal
We were previously doing a post-order traversal and operating on the
list in reverse, however this would occasionaly cause backedges for
loops to be visited before some of the other blocks in the loop.

We know use a reverse post-order traversal, which avoids this issue.

The reverse post-order traversal is not completely ideal, so we need
to manually fixup the list to ensure that inner loop backedges are
visited before outer loop backedges.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228186 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 20:49:44 +00:00
Renato Golin
ff01f89466 Reverting VLD1/VST1 base-updating/post-incrementing combining
This reverts patches 223862, 224198, 224203, and 224754, which were all
related to the vector load/store combining and were reverted/reaplied
a few times due to the same alignment problems we're seeing now.

Further tests, mainly self-hosting Clang, will be needed to reapply this
patch in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 10:11:59 +00:00
Daniel Berlin
403050abcc Allow PRE to insert no-cost phi nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-03 20:37:08 +00:00
Jingyue Wu
2918efd551 Add straight-line strength reduction to LLVM
Summary:
Straight-line strength reduction (SLSR) is implemented in GCC but not yet in
LLVM. It has proven to effectively simplify statements derived from an unrolled
loop, and can potentially benefit many other cases too. For example,

LLVM unrolls

  #pragma unroll
  foo (int i = 0; i < 3; ++i) {
    sum += foo((b + i) * s);
  }

into

  sum += foo(b * s);
  sum += foo((b + 1) * s);
  sum += foo((b + 2) * s);

However, no optimizations yet reduce the internal redundancy of the three
expressions:

  b * s
  (b + 1) * s
  (b + 2) * s

With SLSR, LLVM can optimize these three expressions into:

  t1 = b * s
  t2 = t1 + s
  t3 = t2 + s

This commit is only an initial step towards implementing a series of such
optimizations. I will implement more (see TODO in the file commentary) in the
near future. This optimization is enabled for the NVPTX backend for now.
However, I am more than happy to push it to the standard optimization pipeline
after more thorough performance tests.

Test Plan: test/StraightLineStrengthReduce/slsr.ll

Reviewers: eliben, HaoLiu, meheff, hfinkel, jholewinski, atrick

Reviewed By: jholewinski, atrick

Subscribers: karthikthecool, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228016 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-03 19:37:06 +00:00
Erik Eckstein
40d542097a Fix: SLPVectorizer crashes with assertion when vectorizing a cmp instruction.
The commit r225977 uncovered this bug. The problem was that the vectorizer tried to
read the second operand of an already deleted instruction.
The bug didn't show up before r225977 because the freed memory still contained a non-null pointer.
With r225977 deletion of instructions is delayed and the read operand pointer is always null.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227800 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-02 12:45:34 +00:00
Chandler Carruth
9a941b2028 [PM] Port SimplifyCFG to the new pass manager.
This should be sufficient to replace the initial (minor) function pass
pipeline in Clang with the new pass manager. I'll probably add an (off
by default) flag to do that just to ensure we can get extra testing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227726 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 11:34:21 +00:00
Chandler Carruth
80c55f265d [PM] Port EarlyCSE to the new pass manager.
I've added RUN lines both to the basic test for EarlyCSE and the
target-specific test, as this serves as a nice test that the TTI layer
in the new pass manager is in fact working well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227725 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 10:51:23 +00:00
Adrian Prantl
88deac4007 Inliner: Use replaceDbgDeclareForAlloca() instead of splicing the
instruction and generalize it to optionally dereference the variable.
Follow-up to r227544.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227604 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 19:37:48 +00:00
Hao Liu
2f45a3c252 Move the target specific test case arbitrary-induction-step.ll to test/Transforms/LoopVectorize/AArch64 folder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227561 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 07:33:31 +00:00
Hao Liu
e7769db118 [LoopVectorize] Induction variables: support arbitrary constant step.
Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support
arbitrary constant steps.
Initial patch by Alexey Volkov. Some bug fixes are added in the following version.

Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227557 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 05:02:21 +00:00
Adrian Prantl
e413c8afe0 Fix PR22386. The inliner moves static allocas to the entry basic block
so we need to move the dbg.declare intrinsics that describe them, too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 01:55:25 +00:00
Sanjay Patel
26c81cc870 [GVN] don't propagate equality comparisons of FP zero (PR22376)
In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities
to allow some simple value range optimizations. But that introduced a bug
when comparing to -0.0 or 0.0: these compare equal even though they are not
bitwise identical.

This patch disallows propagating zero constants in equality comparisons. 
Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376

Differential Revision: http://reviews.llvm.org/D7257



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227491 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-29 20:51:49 +00:00
Philip Reames
61a76b2d4a Teach SplitBlockPredecessors how to handle landingpad blocks.
Patch by: Igor Laevsky <igor@azulsystems.com>

"Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among it's users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors."

Differential Revision: http://reviews.llvm.org/D7157



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227390 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 23:06:47 +00:00
Michael Kuperstein
0906c8fc1c [X86] Reduce some 32-bit imuls into lea + shl
Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well.

Differential Revision: http://reviews.llvm.org/D7196

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227308 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 14:08:22 +00:00
Elena Demikhovsky
b5c82c079a Fold fcmp in cases where value is provably non-negative. By Arch Robison.
This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where:
 - the predicate is olt or uge
 - the first operand is provably not less than zero
 - the second operand is zero
The motivation for handling these cases optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y)..

http://reviews.llvm.org/D6972



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227298 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 08:03:58 +00:00
Reid Kleckner
0935e7a79b Move EH personality type classification to Analysis/LibCallSemantics.h
Summary:
Also add enum types for __C_specific_handler and _CxxFrameHandler3 for
which we know a few things.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7214

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227284 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 01:17:38 +00:00
Ahmed Bougacha
37be0d7c43 [SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).

The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.

The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification.  This was introduced in a
refactoring (r225640) to match the original behavior.

However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.

For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW.  When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
  stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
  and simplifies the first stpcpy to a memcpy.  We now have
  two memcpys.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-27 21:52:16 +00:00
Sanjoy Das
fdefc694cd Teach IRCE to look at branch weights when recognizing range checks
Splitting a loop to make range checks redundant is profitable only if
the range check "never" fails. Make this fact a part of recognizing a
range check -- a branch is a range check only if it is expected to
pass (via branch_weights metadata).

Differential Revision: http://reviews.llvm.org/D7192



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-27 21:38:12 +00:00
Andrea Di Biagio
944d86558e [InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag.
This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by
a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.

Added test InstCombine/select-cmp-cttz-ctlz.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227197 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-27 15:58:14 +00:00
David Majnemer
90c42ddc62 LoopRotate: Don't walk the uses of a Constant
LoopRotate wanted to avoid live range interference by looking at the
uses of a Value in the loop latch and seeing if any lied outside of the
loop.  We would wrongly perform this operation on Constants.

This fixes PR22337.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227171 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-27 06:21:43 +00:00
Chad Rosier
13faabb6c5 Commoning of target specific load/store intrinsics in Early CSE.
Phabricator revision: http://reviews.llvm.org/D7121
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227149 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 22:51:15 +00:00
Philip Reames
c1beae0d42 Add test cases for PRE w/volatile loads
These tests check that the combination of 227110 (cross block query inst) and 227112 (volatile load semantics) work together properly to allow PRE in cases where a loop contains a volatile access.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227146 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 22:40:44 +00:00
Hans Wennborg
325385a37f SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
The range check would get optimized away later, but we might as well not emit
them in the first place.

http://reviews.llvm.org/D6471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 19:52:34 +00:00
Hans Wennborg
d5c2318adc SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.

Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.

Differential Revision: http://reviews.llvm.org/D6471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227125 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 19:52:32 +00:00
Philip Reames
cce3c83917 Refine memory dependence's notion of volatile semantics
According to my reading of the LangRef, volatiles are only ordered with respect to other volatiles. It is entirely legal and profitable to forward unrelated loads over the volatile load. This patch implements this for GVN by refining the transition rules MemoryDependenceAnalysis uses when encountering a volatile.

The added test cases show where the extra flexibility is profitable for local dependence optimizations. I have a related change (227110) which will extend this to non-local dependence (i.e. PRE), but that's essentially orthogonal to the semantic change in this patch. I have tested the two together and can confirm that PRE works over a volatile load with both changes.  I will be submitting a PRE w/volatiles test case seperately in the near future.

Differential Revision: http://reviews.llvm.org/D6901



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227112 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 18:54:27 +00:00
Philip Reames
8a5ad05c13 Pass QueryInst down through non-local dependency calculation
This change is mostly motivated by exposing information about the original query instruction to the actual scanning work in getPointerDependencyFrom when used by GVN PRE. In a follow up change, I will use this to be more precise with regards to the semantics of volatile instructions encountered in the scan of a basic block.

Worth noting, is that this change (despite appearing quite simple) is not semantically preserving. By providing more information to the helper routine, we allow some optimizations to kick in that weren't previously able to (when called from this code path.) In particular, we see that treatment of !invariant.load becomes more precise. In theory, we might see a difference with an ordered/atomic instruction as well, but I'm having a hard time actually finding a test case which shows that.

Test wise, I've included new tests for !invariant.load which illustrate this difference. I've also included some updated TBAA tests which highlight that this change isn't needed for that optimization to kick in - it's handled inside alias analysis itself. 

Eventually, it would be nice to factor the !invariant.load handling inside alias analysis as well.

Differential Revision: http://reviews.llvm.org/D6895



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227110 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 18:39:52 +00:00
Erik Eckstein
8f6e8cb4f6 SLPVectorizer: fix wrong scheduling of atomic load/stores.
This fixes PR22306.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227077 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-26 09:07:04 +00:00
Chandler Carruth
d4f6d111c1 [PM] Port LowerExpectIntrinsic to the new pass manager.
This just lifts the logic into a static helper function, sinks the
legacy pass to be a trivial wrapper of that helper fuction, and adds
a trivial wrapper for the new PM as well. Not much to see here.

I switched a test case to run in both modes, but we have to strip the
dead prototypes separately as that pass isn't in the new pass manager
(yet).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226999 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-24 11:13:02 +00:00
Chandler Carruth
7a98df7f74 [PM] Port instcombine to the new pass manager!
This is exciting as this is a much more involved port. This is
a complex, existing transformation pass. All of the core logic is shared
between both old and new pass managers. Only the access to the analyses
is separate because the actual techniques are separate. This also uses
a bunch of different and interesting analyses and is the first time
where we need to use an analysis across an IR layer.

This also paves the way to expose instcombine utility functions. I've
got a static function that implements the core pass logic over
a function which might be mildly interesting, but more interesting is
likely exposing a routine which just uses instructions *already in* the
worklist and combines until empty.

I've switched one of my favorite instcombine tests to run with both as
well to make sure this keeps working.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226987 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-24 04:19:17 +00:00
Hans Wennborg
01e223e92e LowerSwitch: replace unreachable default with popular case destination
SimplifyCFG currently does this transformation, but I'm planning to remove that
to allow other passes, such as this one, to exploit the unreachable default.

This patch takes care to keep track of what case values are unreachable even
after the transformation, allowing for more efficient lowering.

Differential Revision: http://reviews.llvm.org/D6697

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226934 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-23 20:43:51 +00:00
Reid Kleckner
7ed0364cee Revert "Don't remove a landing pad if the invoke requires a table entry."
This reverts commit r176827.

Björn Steinbrink pointed out that this didn't actually fix the bug
(PR15555) it was attempting to fix.

With this reverted, we can now remove landingpad cleanups that
immediately resume unwinding, converting the invoke to a call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226850 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-22 19:29:46 +00:00
Sanjoy Das
49afc9109e Fix crashes in IRCE caused by mismatched types
There are places where the inductive range check elimination pass
depends on two llvm::Values or llvm::SCEVs to be of the same
llvm::Type when they do not need to be. This patch relaxes those
restrictions (by bailing out of the optimization if the types
mismatch), and adds test cases to trigger those paths.

These issues were found by bootstrapping clang with IRCE running in
the -O3 pass ordering.

Differential Revision: http://reviews.llvm.org/D7082



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226793 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-22 08:29:18 +00:00
Elena Demikhovsky
05e7ae1a7b Fixed a bug in masked load/store in reversed loop.
Added a test.

The bug was submitted to bugzilla:
http://llvm.org/bugs/show_bug.cgi?id=22225



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226791 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-22 08:20:06 +00:00
Chandler Carruth
b778cbc0c8 [canonicalize] Teach InstCombine to canonicalize loads which are only
ever stored to always use a legal integer type if one is available.

Regardless of whether this particular type is good or bad, it ensures we
don't get weird differences in generated code (and resulting
performance) from "equivalent" patterns that happen to end up using
a slightly different type.

After some discussion on llvmdev it seems everyone generally likes this
canonicalization. However, there may be some parts of LLVM that handle
it poorly and need to be fixed. I have at least verified that this
doesn't impede GVN and instcombine's store-to-load forwarding powers in
any obvious cases. Subtle cases are exactly what we need te flush out if
they remain.

Also note that this IR pattern should already be hitting LLVM from Clang
at least because it is exactly the IR which would be produced if you
used memcpy to copy a pointer or floating point between memory instead
of a variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226781 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-22 05:08:12 +00:00
David Blaikie
f93662d3d5 DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined calls being coalesced
When two calls from the same MDLocation are inlined they currently get
treated as one inlined function call (creating difficulty debugging,
duplicate variables, etc).

Clang worked around this by including column information on inline calls
which doesn't address LTO inlining or calls to the same function from
the same line and column (such as through a macro). It also didn't
address ctor and member function calls.

By making the inlinedAt locations distinct, every call site has an
explicitly distinct location that cannot be coalesced with any other
call.

This can produce linearly (2x in the worst case where every call is
inlined and the call instruction has a non-call instruction at the same
location) more debug locations. Any increase beyond that are in cases
where the Clang workaround was insufficient and the new scheme is
creating necessary distinct nodes that were being erroneously coalesced
previously.

After this change to LLVM the incomplete workarounds in Clang. That
should reduce the number of debug locations (in a build without column
info, the default on Darwin, not the default on Linux) by not creating
pseudo-distinct locations for every call to an inline function.

(oh, and I made the inlined-at chain rebuilding iterative instead of
recursive because I was having trouble wrapping my head around it the
way it was - open to discussion on the right design for that function
(including going back to a recursive solution))

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226736 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-21 22:57:29 +00:00
David Majnemer
c070e4e528 InstCombine: Don't strip bitcasts off of callsites marked 'thunk'
The return type of a thunk is meaningless, we just want the arguments
and return value to be forwarded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226708 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-21 22:32:04 +00:00
Alexander Potapenko
506c6ec22a Use a smaller pragma unroll threshold to reduce test execution time.
When opt is compiled with AddressSanitizer it takes more than 30 seconds
to unroll the loop in unroll_1M().


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-21 13:52:02 +00:00
Karthik Bhat
7e9f120130 Fix Operandreorder logic in SLPVectorizer to generate longer vectorizable chain.
This patch fixes 2 issues in reorderInputsAccordingToOpcode
1) AllSameOpcodeLeft and AllSameOpcodeRight was being calculated incorrectly resulting in code not being vectorized in few cases.
2) Adds logic to reorder operands if we get longer chain of consecutive loads enabling vectorization. Handled the same for cases were we have AltOpcode.
Thanks Michael for inputs and review.
Review: http://reviews.llvm.org/D6677



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226547 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-20 06:11:00 +00:00
Mehdi Amini
525f296ef1 Fix Reassociate handling of constant in presence of undef float
http://reviews.llvm.org/D6993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226245 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-16 03:00:58 +00:00
Sanjoy Das
148e8c9b8b Add a new pass "inductive range check elimination"
IRCE eliminates range checks of the form

  0 <= A * I + B < Length

by splitting a loop's iteration space into three segments in a way
that the check is completely redundant in the middle segment.  As an
example, IRCE will convert

  len = < known positive >
  for (i = 0; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }

to

  len = < known positive >
  limit = smin(n, len)
  // no first segment
  for (i = 0; i < limit; i++) {
    if (0 <= i && i < len) { // this check is fully redundant
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }
  for (i = limit; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }


IRCE can deal with multiple range checks in the same loop (it takes
the intersection of the ranges that will make each of them redundant
individually).

Currently IRCE does not do any profitability analysis.  That is a
TODO.

Please note that the status of this pass is *experimental*, and it is
not part of any default pass pipeline.  Having said that, I will love
to get feedback and general input from people interested in trying
this out.

This pass was originally r226201.  It was reverted because it used C++
features not supported by MSVC 2012.

Differential Revision: http://reviews.llvm.org/D6693



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226238 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-16 01:03:22 +00:00
Sanjoy Das
df1b4f601d Revert r226201 (Add a new pass "inductive range check elimination")
The change used C++11 features not supported by MSVC 2012.  I will fix
the change to use things supported MSVC 2012 and recommit shortly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226216 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 22:18:10 +00:00
Sanjoy Das
0170a308ec Add a new pass "inductive range check elimination"
IRCE eliminates range checks of the form

  0 <= A * I + B < Length

by splitting a loop's iteration space into three segments in a way
that the check is completely redundant in the middle segment.  As an
example, IRCE will convert

  len = < known positive >
  for (i = 0; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }

to

  len = < known positive >
  limit = smin(n, len)
  // no first segment
  for (i = 0; i < limit; i++) {
    if (0 <= i && i < len) { // this check is fully redundant
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }
  for (i = limit; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }


IRCE can deal with multiple range checks in the same loop (it takes
the intersection of the ranges that will make each of them redundant
individually).

Currently IRCE does not do any profitability analysis.  That is a
TODO.

Please note that the status of this pass is *experimental*, and it is
not part of any default pass pipeline.  Having said that, I will love
to get feedback and general input from people interested in trying
this out.

Differential Revision: http://reviews.llvm.org/D6693



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226201 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 20:45:46 +00:00
Sanjoy Das
7ec1829823 Fix PR22222
The bug was introduced in r225282. r225282 assumed that sub X, Y is
the same as add X, -Y. This is not correct if we are going to upgrade
the sub to sub nuw. This change fixes the issue by making the
optimization ignore sub instructions.

Differential Revision: http://reviews.llvm.org/D6979



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 01:46:09 +00:00
Richard Smith
ef7d38d35a For PR21145: recognise a builtin call to a known deallocation function even if
it's defined in the current module. Clang generates this situation for the
C++14 sized deallocation functions, because it generates a weak definition in
case one isn't provided by the C++ runtime library.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226069 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 01:00:33 +00:00
Ramkumar Ramachandra
fba4d82671 [GC] CodeGenPrep transform: simplify offsetable relocate
The transform is somewhat involved, but the basic idea is simple: find
derived pointers that have been offset from the base pointer using gep
and replace the relocate of the derived pointer with a gep to the
relocated base pointer (with the same offset).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226060 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-14 23:27:07 +00:00
Duncan P. N. Exon Smith
37ac8d3622 IR: Move MDLocation into place
This commit moves `MDLocation`, finishing off PR21433.  There's an
accompanying clang commit for frontend testcases.  I'll attach the
testcase upgrade script I used to PR21433 to help out-of-tree
frontends/backends.

This changes the schema for `DebugLoc` and `DILocation` from:

    !{i32 3, i32 7, !7, !8}

to:

    !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)

Note that empty fields (line/column: 0 and inlinedAt: null) don't get
printed by the assembly writer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226048 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-14 22:27:36 +00:00
David Majnemer
5e8cd99f55 InstCombine: Don't take A-B<0 into A<B if A-B has other uses
This fixes PR22226.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226023 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-14 19:26:56 +00:00
Ahmed Bougacha
61d6dc41fa [SimplifyLibCalls] Don't try to simplify indirect calls.
It turns out, all callsites of the simplifier are guarded by a check for
CallInst::getCalledFunction (i.e., to make sure the callee is direct).

This check wasn't done when trying to further optimize a simplified fortified
libcall, introduced by a refactoring in r225640.

Fix that, add a testcase, and document the requirement.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225895 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-14 00:55:05 +00:00
Sanjay Patel
2211d38267 GVN: propagate equalities for floating point compares
Allow optimizations based on FP comparison values in the same way
as integers. 

This resolves PR17713:
http://llvm.org/bugs/show_bug.cgi?id=17713

Differential Revision: http://reviews.llvm.org/D6911



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-12 19:29:48 +00:00
Hal Finkel
6829815d96 [PowerPC] Readjust the loop unrolling threshold
Now that the way that the partial unrolling threshold for small loops is used
to compute the unrolling factor as been corrected, a slightly smaller threshold
is preferable. This is expected; other targets may need to re-tune as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225566 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-10 00:31:10 +00:00
Hal Finkel
a14d6f1ea5 [LoopUnroll] Fix the partial unrolling threshold for small loop sizes
When we compute the size of a loop, we include the branch on the backedge and
the comparison feeding the conditional branch. Under normal circumstances,
these don't get replicated with the rest of the loop body when we unroll. This
led to the somewhat surprising behavior that really small loops would not get
unrolled enough -- they could be unrolled more and the resulting loop would be
below the threshold, because we were assuming they'd take
(LoopSize * UnrollingFactor) instructions after unrolling, instead of
(((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225565 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-10 00:30:55 +00:00
Hans Wennborg
ca71be6415 SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210)
The previous code assumed that such instructions could not have any uses
outside CaseDest, with the motivation that the instruction could not
dominate CommonDest because CommonDest has phi nodes in it. That simply
isn't true; e.g., CommonDest could have an edge back to itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225552 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 22:13:31 +00:00
Tim Northover
8cd39a2630 Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"

It's not really expected to stick around, last time it provoked a weird LTO
build failure that I can't reproduce now, and the bot logs are long gone. I'll
re-revert it if the failures recur.

Original description: Perform Scalar PRE on gep indices that feed loads before
doing Load PRE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225536 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 19:19:56 +00:00
Hal Finkel
139bfee84c [PowerPC] Enable late partial unrolling on the POWER7
The P7 benefits from not have really-small loops so that we either have
multiple dispatch groups in the loop and/or the ability to form more-full
dispatch groups during scheduling. Setting the partial unrolling threshold to
44 seems good, empirically, for the P7. Compared to using no late partial
unrolling, this yields the following test-suite speedups:

SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding
	-66.3253% +/- 24.1975%
SingleSource/Benchmarks/Misc-C++/oopack_v1p8
	-44.0169% +/- 29.4881%
SingleSource/Benchmarks/Misc/pi
	-27.8351% +/- 12.2712%
SingleSource/Benchmarks/Stanford/Bubblesort
	-30.9898% +/- 22.4647%

I've speculatively added a similar setting for the P8. Also, I've noticed that
the unroller does not quite calculate the unrolling factor correctly for really
tiny loops because it neglects to account for the fact that not every loop body
replicant contains an ending branch and counter increment. I'll fix that later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 15:51:16 +00:00
Duncan P. N. Exon Smith
f416d72973 IR: Add 'distinct' MDNodes to bitcode and assembly
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode).  This adds the `distinct` keyword to assembly.

Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:

  - self-references, which are never uniqued, and
  - nodes whose operands are replaced that hit a uniquing collision.

The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.

Part of PR22111.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225474 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 22:38:29 +00:00
Matt Arsenault
3b1f741856 Fix fcmp + fabs instcombines when using the intrinsic
This was only handling the libcall. This is another example
of why only the intrinsic should ever be used when it exists.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 20:09:34 +00:00
Matt Arsenault
374b57cec9 Fix using wrong intrinsic in test
This is a leftover from renaming the intrinsic.
It's surprising the unknown llvm. intrinsic wasn't rejected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225304 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:00:33 +00:00
Rafael Espindola
f907a26bc2 Change the .ll syntax for comdats and add a syntactic sugar.
In order to make comdats always explicit in the IR, we decided to make
the syntax a bit more compact for the case of a GlobalObject in a
comdat with the same name.

Just dropping the $name causes problems for

@foo = globabl i32 0, comdat
$bar = comdat ...

and

declare void @foo() comdat
$bar = comdat ...

So the syntax is changed to

@g1 = globabl i32 0, comdat($c1)
@g2 = globabl i32 0, comdat

and

declare void @foo() comdat($c1)
declare void @foo() comdat

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 22:55:16 +00:00
Sanjoy Das
31123d4529 This patch teaches IndVarSimplify to add nuw and nsw to certain kinds
of operations that provably don't overflow. For example, we can prove
%civ.inc below does not sign-overflow. With this change,
IndVarSimplify changes %civ.inc to an add nsw.

  define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) {
   entry:
    %length = load i32* %length_ptr, !range !0
    %len.sub.1 = sub i32 %length, 1
    %upper = icmp slt i32 %init, %len.sub.1
    br i1 %upper, label %loop, label %exit
  
   loop:
    %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ]
    %civ.inc = add i32 %civ, 1
    %cmp = icmp slt i32 %civ.inc, %length
    br i1 %cmp, label %latch, label %break
  
   latch:
    store i32 0, i32* %array
    %check = icmp slt i32 %civ.inc, %len.sub.1
    br i1 %check, label %loop, label %break
  
   break:
    ret i32 %civ.inc
  
   exit:
    ret i32 42
  }

Differential Revision: http://reviews.llvm.org/D6748



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 19:02:56 +00:00
Matt Arsenault
d883ca0ca7 Convert fcmp with 0.0 from casted integers to icmp
This is already handled in general when it is known the
conversion can't lose bits with smaller integer types
casted into wider floating point types.

This pattern happens somewhat often in GPU programs that cast
workitem intrinsics to float, which are often compared with 0.

Specifically handle the special case of compares with zero which
should also be known to not lose information. I had a more general
version of this which allows equality compares if the casted float is
exactly representable in the integer, but I'm not 100% confident that
is always correct.

Also fold cases that aren't integers to true / false.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225265 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 15:50:59 +00:00
David Majnemer
51e4a66417 InstCombine: Bitcast call arguments from/to pointer/integer type
Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing
arguments and return values when DataLayout says it is safe to do so.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225254 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 08:41:31 +00:00
Michael Kuperstein
25903ef9bc Fix broken test from r225159.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 12:34:01 +00:00
Jiangning Liu
614fe873ce Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm.
{code}
// loop body
   ... = a[i]          (1)
    ... = a[i+1]       (2)
 .......
a[i+1] = ....          (3)
   a[i] = ...          (4)
{code}

The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed.

For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized.

The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225159 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 10:08:58 +00:00
Chandler Carruth
4f9a7277d1 [SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an
assert out of the new pre-splitting in SROA.

This fix makes the code do what was originally intended -- when we have
a store of a load both dealing in the same alloca, we force them to both
be pre-split with identical offsets. This is really quite hard to do
because we can keep discovering problems as we go along. We have to
track every load over the current alloca which for any resaon becomes
invalid for pre-splitting, and go back to remove all stores of those
loads. I've included a couple of test cases derived from PR22093 that
cover the different ways this can happen. While that PR only really
triggered the first of these two, its the same fundamental issue.

The other challenge here is documented in a FIXME now. We end up being
quite a bit more aggressive for pre-splitting when loads and stores
don't refer to the same alloca. This aggressiveness comes at the cost of
introducing potentially redundant loads. It isn't clear that this is the
right balance. It might be considerably better to require that we only
do pre-splitting when we can presplit every load and store involved in
the entire operation. That would give more consistent if conservative
results. Unfortunately, it requires a non-trivial change to the actual
pre-splitting operation in order to correctly handle cases where we end
up pre-splitting stores out-of-order. And it isn't 100% clear that this
is the right direction, although I'm starting to suspect that it is.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225149 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 04:17:53 +00:00
David Majnemer
07d7dbae9e InstCombine: match can find ConstantExprs, don't assume we have a Value
We assumed the output of a match was a Value, this would cause us to
assert because we would fail a cast<>.  Instead, use a helper in the
Operator family to hide the distinction between Value and Constant.

This fixes PR22087.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 07:36:02 +00:00
David Majnemer
77e22b7836 ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes
PHI nodes can have zero operands in the middle of a transform.  It is
expected that utilities in Analysis don't freak out when this happens.

Note that it is considered invalid to allow these misshapen phi nodes to
make it to another pass.

This fixes PR22086.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 07:06:53 +00:00
David Majnemer
5e9c6212a8 InstCombine: Detect when llvm.umul.with.overflow always overflows
We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero
and LHSKnownOne * RHSKnownOne overflow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225077 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 07:29:47 +00:00
Chandler Carruth
ce7f347da2 [SROA] Teach SROA to be more aggressive in splitting now that we have
a pre-splitting pass over loads and stores.

Historically, splitting could cause enough problems that I hamstrung the
entire process with a requirement that splittable integer loads and
stores must cover the entire alloca. All smaller loads and stores were
unsplittable to prevent chaos from ensuing. With the new pre-splitting
logic that does load/store pair splitting I introduced in r225061, we
can now very nicely handle arbitrarily splittable loads and stores. In
order to fully benefit from these smarts, we need to mark all of the
integer loads and stores as splittable.

However, we don't actually want to rewrite partitions with all integer
loads and stores marked as splittable. This will fail to extract scalar
integers from aggregates, which is kind of the point of SROA. =] In
order to resolve this, what we really want to do is only do
pre-splitting on the alloca slices with integer loads and stores fully
splittable. This allows us to uncover all non-integer uses of the alloca
that would benefit from a split in an integer load or store (and where
introducing the split is safe because it is just memory transfer from
a load to a store). Once done, we make all the non-whole-alloca integer
loads and stores unsplittable just as they have historically been,
repartition and rewrite.

The result is that when there are integer loads and stores anywhere
within an alloca (such as from a memcpy of a sub-object of a larger
object), we can split them up if there are non-integer components to the
aggregate hiding beneath. I've added the challenging test cases to
demonstrate how this is able to promote to scalars even a case where we
have even *partially* overlapping loads and stores.

This restores the single-store behavior for small arrays of i8s which is
really nice. I've restored both the little endian testing and big endian
testing for these exactly as they were prior to r225061. It also forced
me to be more aggressive in an alignment test to actually defeat SROA.
=] Without the added volatiles there, we actually split up the weird i16
loads and produce nice double allocas with better alignment.

This also uncovered a number of bugs where we failed to handle
splittable load and store slices which didn't have a begininng offset of
zero. Those fixes are included, and without them the existing test cases
explode in glorious fireworks. =]

I've kept support for leaving whole-alloca integer loads and stores as
splittable even for the purpose of rewriting, but I think that's likely
no longer needed. With the new pre-splitting, we might be able to remove
all the splitting support for loads and stores from the rewriter. Not
doing that in this patch to try to isolate any performance regressions
that causes in an easy to find and revert chunk.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 03:55:54 +00:00
Chandler Carruth
40a8741994 [SROA] Add a test case for r225068 / PR22080.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225070 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 00:34:29 +00:00
Chandler Carruth
450b39e971 [SROA] Teach SROA how to much more intelligently handle split loads and
stores.

When there are accesses to an entire alloca with an integer
load or store as well as accesses to small pieces of the alloca, SROA
splits up the large integer accesses. In order to do that, it uses bit
math to merge the small accesses into large integers. While this is
effective, it produces insane IR that can cause significant problems in
the rest of the optimizer:

- It can cause load and store mismatches with GVN on the non-alloca side
  where we end up loading an i64 (or some such) rather than loading
  specific elements that are stored.
- We can't always get rid of the integer bit math, which is why we can't
  always fix the loads and stores to work well with GVN.
- This is especially bad when we have operations that mix poorly with
  integer bit math such as floating point operations.
- It will block things like the vectorizer which might be able to handle
  the scalar stores that underly the aggregate.

At the same time, we can't just directly split up these loads and stores
in all cases. If there is actual integer arithmetic involved on the
values, then using integer bit math is actually the perfect lowering
because we can often combine it heavily with the surrounding math.

The solution this patch provides is to find places where SROA is
partitioning aggregates into small elements, and look for splittable
loads and stores that it can split all the way to some other adjacent
load and store. These are uniformly the cases where failing to split the
loads and stores hurts the optimizer that I have seen, and I've looked
extensively at the code produced both from more and less aggressive
approaches to this problem.

However, it is quite tricky to actually do this in SROA. We may have
loads and stores to the same alloca, or other complex patterns that are
hard to handle. This complexity leads to the somewhat subtle algorithm
implemented here. We have to do this entire process as a separate pass
over the partitioning of the alloca, and split up all of the loads prior
to splitting the stores so that we can handle safely the cases of
overlapping, including partially overlapping, loads and stores to the
same alloca. We also have to reconstitute the post-split slice
configuration so we can avoid iterating again over all the alloca uses
(the slow part of SROA). But we also have to ensure that when we split
up loads and stores to *other* allocas, we *do* re-iterate over them in
SROA to adapt to the more refined partitioning now required.

With this, I actually think we can fix a long-standing TODO in SROA
where I avoided splitting as many loads and stores as probably should be
splittable. This limitation historically mitigated the fallout of all
the bad things mentioned above. Now that we have more intelligent
handling, I plan to remove the FIXME and more aggressively mark integer
loads and stores as splittable. I'll do that in a follow-up patch to
help with bisecting any fallout.

The net result of this change should be more fine-grained and accurate
scalars being formed out of aggregates. At the very least, Clang now
generates perfect code for this high-level test case using
std::complex<float>:

  #include <complex>

  void g1(std::complex<float> &x, float a, float b) {
    x += std::complex<float>(a, b);
  }
  void g2(std::complex<float> &x, float a, float b) {
    x -= std::complex<float>(a, b);
  }

  void foo(const std::complex<float> &x, float a, float b,
           std::complex<float> &x1, std::complex<float> &x2) {
    std::complex<float> l1 = x;
    g1(l1, a, b);
    std::complex<float> l2 = x;
    g2(l2, a, b);
    x1 = l1;
    x2 = l2;
  }

This code isn't just hypothetical either. It was reduced out of the hot
inner loops of essentially every part of the Eigen math library when
using std::complex<float>. Those loops would consistently and
pervasively hop between the floating point unit and the integer unit due
to bit math extraction and insertion of floating point values that were
"stored" in a 64-bit integer register around the loop backedge.

So far, this change has passed a bootstrap and I have done some other
testing and so far, no issues. That doesn't mean there won't be though,
so I'll be prepared to help with any fallout. If you performance swings
in particular, please let me know. I'm very curious what all the impact
of this change will be. Stay tuned for the follow-up to also split more
integer loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225061 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-01 11:54:38 +00:00
Sanjay Patel
28650b8ec2 InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X
Some day the backend may handle instruction-level fast math flags and make
this transform unnecessary, but it's still better practice to use the canonical
representation of fneg when possible (use a -0.0).

This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ).
See also http://reviews.llvm.org/D6723.

Differential Revision: http://reviews.llvm.org/D6731



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225050 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 22:14:05 +00:00
David Majnemer
0f77ccd6bb InstCombine: try to transform A-B < 0 into A < B
We are allowed to move the 'B' to the right hand side if we an prove
there is no signed overflow and if the comparison itself is signed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225034 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 04:21:41 +00:00
Philip Reames
91a083c57f Carry facts about nullness and undef across GC relocation
This change implements four basic optimizations:

    If a relocated value isn't used, it doesn't need to be relocated.
    If the value being relocated is null, relocation doesn't change that. (Technically, this might be collector specific. I don't know of one which it doesn't work for though.)
    If the value being relocated is undef, the relocation is meaningless.
    If the value being relocated was known nonnull, the relocated pointer also isn't null. (Since it points to the same source language object.)

I outlined other planned work in comments.

Differential Revision: http://reviews.llvm.org/D6600



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224968 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 23:27:30 +00:00
Philip Reames
1714ad67bd Refine the notion of MayThrow in LICM to include a header specific version
In LICM, we have a check for an instruction which is guaranteed to execute and thus can't introduce any new faults if moved to the preheader. To handle a function which might unconditionally throw when first called, we check for any potentially throwing call in the loop and give up.

This is unfortunate when the potentially throwing condition is down a rare path. It prevents essentially all LICM of potentially faulting instructions where the faulting condition is checked outside the loop. It also greatly diminishes the utility of loop unswitching since control dependent instructions - which are now likely in the loops header block - will not be lifted by subsequent LICM runs.

define void @nothrow_header(i64 %x, i64 %y, i1 %cond) {
; CHECK-LABEL: nothrow_header
; CHECK-LABEL: entry
; CHECK: %div = udiv i64 %x, %y
; CHECK-LABEL: loop
; CHECK: call void @use(i64 %div)
entry:
  br label %loop
loop: ; preds = %entry, %for.inc
  %div = udiv i64 %x, %y
  br i1 %cond, label %loop-if, label %exit
loop-if:
  call void @use(i64 %div)
  br label %loop
exit:
  ret void
}

The current patch really only helps with non-memory instructions (i.e. divs, etc..) since the maythrow call down the rare path will be considered to alias an otherwise hoistable load.  The one exception is that it does kick in for loads which are known to be invariant without regard to other possible stores, i.e. those marked with either !invarant.load metadata of tbaa 'is constant memory' metadata.

Differential Revision: http://reviews.llvm.org/D6725



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224965 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 23:00:57 +00:00
Philip Reames
456b7b602c Loading from null is valid outside of addrspace 0
This patches fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen.  This transform is perfectly legal in address space 0, but is not neccessarily legal in other address spaces.

We really should introduce a hook to control this property on a per target per address space basis.  We may be loosing valuable optimizations in some address spaces by being too conservative.

Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224961 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 22:46:21 +00:00
David Majnemer
7627d9c229 InstCombine: Infer nuw for multiplies
A multiply cannot unsigned wrap if there are bitwidth, or more, leading
zero bits between the two operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224849 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 09:50:35 +00:00
David Majnemer
998ae69abe InstCombe: Infer nsw for multiplies
We already utilize this logic for reducing overflow intrinsics, it makes
sense to reuse it for normal multiplies as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 09:10:14 +00:00
Michael Kuperstein
a098c770e1 [ValueTracking] Move GlobalAlias handling to be after the max depth check in computeKnownBits()
GlobalAlias handling used to be after GlobalValue handling, which meant it was, in practice, dead code. r220165 moved GlobalAlias handling to be before GlobalValue handling, but also moved it to be before the max depth check, causing an assert due to a recursion depth limit violation. 

This moves GlobalAlias handling forward to where it's safe, and changes the GlobalValue handling to only look at GlobalObjects.

Differential Revision: http://reviews.llvm.org/D6758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224765 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-23 11:33:41 +00:00
Michael Liao
b9e302f3ca [SimplifyCFG] Revise common code sinking
- Fix the case where more than 1 common instructions derived from the same
  operand cannot be sunk. When a pair of value has more than 1 derived values
  in both branches, only 1 derived value could be sunk.
- Replace BB1 -> (BB2, PN) map with joint value map, i.e.
  map of (BB1, BB2) -> PN, which is more accurate to track common ops.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224757 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-23 08:26:55 +00:00
Bruno Cardoso Lopes
a559a2317c [LCSSA] Handle PHI insertion in disjoint loops
Take two disjoint Loops L1 and L2.

LoopSimplify fails to simplify some loops (e.g. when indirect branches
are involved). In such situations, it can happen that an exit for L1 is
the header of L2. Thus, when we create PHIs in one of such exits we are
also inserting PHIs in L2 header.

This could break LCSSA form for L2 because these inserted PHIs can also
have uses in L2 exits, which are never handled in the current
implementation. Provide a fix for this corner case and test that we
don't assert/crash on that.

Differential Revision: http://reviews.llvm.org/D6624

rdar://problem/19166231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224740 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-22 22:35:46 +00:00
David Majnemer
6df827240e This should have been part of r224676.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-20 04:48:34 +00:00
David Majnemer
854a37649a InstCombine: Squash an icmp+select into bitwise arithmetic
(X & INT_MIN) == 0 ? X ^ INT_MIN : X  into  X | INT_MIN
(X & INT_MIN) != 0 ? X ^ INT_MIN : X  into  X & INT_MAX

This fixes PR21993.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224676 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-20 04:45:35 +00:00
David Majnemer
9cd99a0724 InstSimplify: Optimize away pointless comparisons
(X & INT_MIN) ? X & INT_MAX : X  into  X & INT_MAX
(X & INT_MIN) ? X : X & INT_MAX  into  X
(X & INT_MIN) ? X | INT_MIN : X  into  X
(X & INT_MIN) ? X : X | INT_MIN  into  X | INT_MIN

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-20 03:04:38 +00:00
Bruno Cardoso Lopes
06833ca7c1 Reapply: [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr
The visitSwitchInst generates SUB constant expressions to recompute the
switch condition. When truncating the condition to a smaller type, SUB
expressions should use the previous type (before trunc) for both
operands. Also, fix code to also return the modified switch when only
the truncation is performed.

This fixes an assertion crash.

Differential Revision: http://reviews.llvm.org/D6644

rdar://problem/19191835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 17:12:35 +00:00
Sanjay Patel
7c5fa50875 use -0.0 when creating an fneg instruction
Backends recognize (-0.0 - X) as the canonical form for fneg
and produce better code. Eg, ppc64 with 0.0:

   lis r2, ha16(LCPI0_0)
   lfs f0, lo16(LCPI0_0)(r2)
   fsubs f1, f0, f1
   blr

vs. -0.0:

   fneg f1, f1
   blr

Differential Revision: http://reviews.llvm.org/D6723



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224583 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 16:44:08 +00:00
Bruno Cardoso Lopes
01b07d541b Revert "[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr"
Reverts commit r224574 to appease buildbots:

The visitSwitchInst generates SUB constant expressions to recompute the
switch condition. When truncating the condition to a smaller type, SUB
expressions should use the previous type (before trunc) for both
operands. This fixes an assertion crash.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224576 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 14:36:24 +00:00
Bruno Cardoso Lopes
cba407d019 [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr
The visitSwitchInst generates SUB constant expressions to recompute the
switch condition. When truncating the condition to a smaller type, SUB
expressions should use the previous type (before trunc) for both
operands. This fixes an assertion crash.

Differential Revision: http://reviews.llvm.org/D6644

rdar://problem/19191835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224574 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 14:23:15 +00:00
David Majnemer
73059bd1f1 ConstantFold: Shifting undef by zero results in undef
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224553 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-18 23:54:43 +00:00
Suyog Sarda
4bfc4f2e8c Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads,
and vectorizes it." 

This was re-ordering floating point data types resulting in mismatch in output.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224424 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 10:34:27 +00:00
Elena Demikhovsky
982a8b3aeb Added 5 more tests related to sink store revision 224247
- by Ella Bolshinsky

http://reviews.llvm.org/D6420



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224418 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 08:12:59 +00:00
Erik Eckstein
96bd465d6c Strength reduce intrinsics with overflow into regular arithmetic operations if possible.
Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced.
This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow.
It completes the work on PR20194.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224417 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 07:29:19 +00:00
David Majnemer
891ec6d69f InstSimplify: shl nsw/nuw undef, %V -> undef
We can always choose an value for undef which might cause %V to shift
out an important bit except for one case, when %V is zero.

However, shl behaves like an identity function when the right hand side
is zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224405 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 01:54:33 +00:00
Elena Demikhovsky
14fb445715 Masked Load and Store Intrinsics in loop vectorizer.
The loop vectorizer optimizes loops containing conditional memory
accesses by generating masked load and store intrinsics.
This decision is target dependent.

http://reviews.llvm.org/D6527



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224334 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 11:50:42 +00:00
Sanjoy Das
574e01c32e Teach ScalarEvolution to exploit min and max expressions when proving
isKnownPredicate.

The motivation for this change is to optimize away checks in loops
like this:

    limit = min(t, len)
    for (i = 0 to limit)
      if (i >= len || i < 0) throw_array_of_of_bounds();
      a[i] = ...

Differential Revision: http://reviews.llvm.org/D6635



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224285 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 22:50:15 +00:00
Duncan P. N. Exon Smith
1ef70ff39b IR: Make metadata typeless in assembly
Now that `Metadata` is typeless, reflect that in the assembly.  These
are the matching assembly changes for the metadata/value split in
r223802.

  - Only use the `metadata` type when referencing metadata from a call
    intrinsic -- i.e., only when it's used as a `Value`.

  - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode`
    when referencing it from call intrinsics.

So, assembly like this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata !{i32 %v}, metadata !0)
      call void @llvm.foo(metadata !{i32 7}, metadata !0)
      call void @llvm.foo(metadata !1, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{metadata !3}, metadata !0)
      ret void, !bar !2
    }
    !0 = metadata !{metadata !2}
    !1 = metadata !{i32* @global}
    !2 = metadata !{metadata !3}
    !3 = metadata !{}

turns into this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata i32 %v, metadata !0)
      call void @llvm.foo(metadata i32 7, metadata !0)
      call void @llvm.foo(metadata i32* @global, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{!3}, metadata !0)
      ret void, !bar !2
    }
    !0 = !{!2}
    !1 = !{i32* @global}
    !2 = !{!3}
    !3 = !{}

I wrote an upgrade script that handled almost all of the tests in llvm
and many of the tests in cfe (even handling many `CHECK` lines).  I've
attached it (or will attach it in a moment if you're speedy) to PR21532
to help everyone update their out-of-tree testcases.

This is part of PR21532.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224257 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 19:07:53 +00:00
Elena Demikhovsky
a8a374135b Added a test related to 224247 revision
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 14:14:10 +00:00
Suyog Sarda
4dcffed444 Typo Correction in Test Case. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224244 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 12:19:46 +00:00
Ahmed Bougacha
780a093afb Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores."
r223862 tried to also combine base-updating load/stores.
r224198 reverted it, as "it created a regression on the test-suite
on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order
in which the words are shown."
Reapply, with a fix to ignore non-normal load/stores.
Truncstores are handled elsewhere (you can actually write a pattern for
those, whereas for postinc loads you can't, since they return two values),
but it should be possible to also combine extloads base updates, by checking
that the memory (rather than result) type is of the same size as the addend.

Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

Differential Revision: http://reviews.llvm.org/D6585


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224203 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-13 23:22:12 +00:00
Renato Golin
1e173b7139 Revert "[ARM] Combine base-updating/post-incrementing vector load/stores."
This reverts commit r223862, as it created a regression on the test-suite
on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order
in which the words are shown. We'll investigate the issue and re-apply
when safe.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224198 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-13 20:23:18 +00:00
David Majnemer
3b7e6d27d2 ValueTracking: Don't recurse too deeply in computeKnownBitsFromAssume
Respect the MaxDepth recursion limit, doing otherwise will trigger an
assert in computeKnownBits.

This fixes PR21891.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 23:59:29 +00:00
Suyog Sarda
1dea0dc279 This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads,
and vectorizes it. 
 
 Test case :
 
       float hadd(float* a) {
           return (a[0] + a[1]) + (a[2] + a[3]);
        }
 
 
 AArch64 assembly before patch :
 
        ldp	s0, s1, [x0]
 	ldp	s2, s3, [x0, #8]
 	fadd	s0, s0, s1
 	fadd	s1, s2, s3
 	fadd	s0, s0, s1
 	ret
 
 AArch64 assembly after patch :
 
        ldp	d0, d1, [x0]
 	fadd	v0.2s, v0.2s, v1.2s
 	faddp	s0, v0.2s
 	ret

Reviewed Link : http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248531.html



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 12:53:44 +00:00
Steven Wu
a511846bdf Fix another infinite loop in InstCombine
Summary:
InstCombine infinite-loops for the testcase added
It is because InstCombine is generating instructions that can be
optimized by itself. Fix by not optimizing frem if the optimized
type is the same as original type.
rdar://problem/19150820

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D6634

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224097 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 04:34:07 +00:00
Andrea Di Biagio
f27500040b [InstCombine][X86] Improved folding of calls to Intrinsic::x86_sse4a_insertqi.
This patch teaches the instruction combiner how to fold a call to 'insertqi' if
the 'length field' (3rd operand) is set to zero, and if the sum between
field 'length' and 'bit index' (4th operand) is bigger than 64.

From the AMD64 Architecture Programmer's Manual:
1. If the sum of the bit index + length field is greater than 64, then the
   results are undefined;
2. A value of zero in the field length is defined as a length of 64.

This patch improves the existing combining logic for intrinsic 'insertqi'
adding extra checks to address both point 1. and point 2.

Differential Revision: http://reviews.llvm.org/D6583


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224054 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 20:44:59 +00:00
David Majnemer
c57bee5399 InstSimplify: Remove usesless %a parameter from tests
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224016 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 12:56:17 +00:00
Michael Kuperstein
1696b35ff1 The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value.
Patch by Amjad Aboud

Differential Revision: http://reviews.llvm.org/D6525


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224015 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 12:41:10 +00:00
David Majnemer
72c6bdbf70 ConstantFold, InstSimplify: undef >>a x can be either -1 or 0, choose 0
Zero is usually a nicer constant to have than -1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223969 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 21:58:15 +00:00
David Majnemer
ea9bcfc707 ConstantFold: an undef shift amount results in undef
X shifted by undef results in undef because the undef value can
represent values greater than the width of the operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223968 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 21:38:05 +00:00
David Majnemer
895316336e ConstantFold: div undef, 0 should fold to undef, not zero
Dividing by zero yields an undefined value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223924 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 09:14:55 +00:00
David Majnemer
6578f1beb1 InstSimplify: [al]shr exact undef, %X -> undef
Exact shifts always keep the non-zero bits of their input.  This means
it keeps it's undef bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223923 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 09:14:52 +00:00
David Majnemer
1297775557 InstSimplify: div %X, 0 -> undef
We already optimized rem %X, 0 to undef, we should do the same for div.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223919 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 07:52:18 +00:00
Ahmed Bougacha
605c40341b [ARM] Combine base-updating/post-incrementing vector load/stores.
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

Differential Revision: http://reviews.llvm.org/D6585


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223862 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 00:07:37 +00:00
Chandler Carruth
3508e27903 Revert r223764 which taught instcombine about integer-based elment extraction
patterns.

This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely
also on ARM and PPC. I really don't know how, but reverting now that I've
confirmed this is actually the culprit. I have a reproduction as well and so
should be able to restore this shortly.

This reverts commit r223764.

Original commit log follows:
Teach instcombine to canonicalize "element extraction" from a load of an
integer and "element insertion" into a store of an integer into actual
element extraction, element insertion, and vector loads and stores.

Previously various parts of LLVM (including instcombine itself) would
introduce integer loads and stores into the code as a way of opaquely
loading and storing "bits". In some cases (such as a memcpy of
std::complex<float> object) we will eventually end up using those bits
in non-integer types. In order for SROA to effectively promote the
allocas involved, it splits these "store a bag of bits" integer loads
and stores up into the constituent parts. However, for non-alloca loads
and tsores which remain, it uses integer math to recombine the values
into a large integer to load or store.

All of this would be "fine", except that it forces LLVM to go through
integer math to combine and split up values. While this makes perfect
sense for integers (and in fact is critical for bitfields to end up
lowering efficiently) it is *terrible* for non-integer types, especially
floating point types. We have a much more canonical way of representing
the act of concatenating the bits of two SSA values in LLVM: a vector
and insertelement. This patch teaching InstCombine to use this
representation.

With this patch applied, LLVM will no longer introduce integer math into
the critical path of every loop over std::complex<float> operations such
as those that make up the hot path of ... oh, most HPC code, Eigen, and
any other heavy linear algebra library.

For the record, I looked *extensively* at fixing this in other parts of
the compiler, but it just doesn't work:
- We really do want to canonicalize memcpy and other bit-motion to
  integer loads and stores. SSA values are tremendously more powerful
  than "copy" intrinsics. Not doing this regresses massive amounts of
  LLVM's scalar optimizer.
- We really do need to split up integer loads and stores of this form in
  SROA or every memcpy of a trivially copyable struct will prevent SSA
  formation of the members of that struct. It essentially turns off
  SROA.
- The closest alternative is to actually split the loads and stores when
  partitioning with SROA, but this has all of the downsides historically
  discussed of splitting up loads and stores -- the wide-store
  information is fundamentally lost. We would also see performance
  regressions for bitfield-heavy code and other places where the
  integers aren't really intended to be split without seemingly
  arbitrary logic to treat integers totally differently.
- We *can* effectively fix this in instcombine, so it isn't that hard of
  a choice to make IMO.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223813 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 19:21:16 +00:00
Duncan P. N. Exon Smith
dad20b2ae2 IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532.  Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.

I have a follow-up patch prepared for `clang`.  If this breaks other
sub-projects, I apologize in advance :(.  Help me compile it on Darwin
I'll try to fix it.  FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.

This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.

Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`.  It is distinct from
    the `Value` class hierarchy.  It is typeless -- i.e., instances do
    *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
    replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph
    construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the
    result of `MDNode::getTemporary()` -- it maintains a side map of its
    uses and can RAUW itself.  Once the forward declarations are fully
    resolved RAUW support is dropped on the ground.  This means that
    uniquing collisions on changing operands cause nodes to become
    "distinct".  (This already happened fairly commonly, whenever an
    operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles,
    you need to call `MDNode::resolveCycles()` on each node (or on a
    top-level node that somehow references all of the nodes).  Also,
    don't do that.  Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known
    to be, e.g., `ConstantInt`, takes three steps: first, cast from
    `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
    third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
    metadata schema owners transition away from using `Constant`s when
    the type isn't important (and they don't care about referring to
    `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst`
    namespace that matches semantics with the old code, in order to
    avoid adding the error-prone three-step equivalent to every call
    site.  If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`.  This is a
    subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a
    `LocalAsMetadata`, which is a bridged form of non-`Constant` values
    like `Argument` and `Instruction`.  It can also refer to any other
    `Metadata` subclass.

(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223802 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 18:38:53 +00:00
Sonam Kumari
05a824843d Removal Of Duplicate Test Cases and Addition Of Missing Check Statements
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223768 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 10:46:38 +00:00
Ankur Garg
df35082f20 [test/Transforms/InstCombine/shift.ll] Removed duplicate test cases. NFC.
Removed some duplicate test cases from the file /test/Transforms/InstCombine/shift.ll.

test54 and test57 were duplicates of each other.
test55 and test58 were duplicates of each other.

(Removed test57 and test58)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223767 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 10:35:19 +00:00
Chandler Carruth
e78a87b633 Teach instcombine to canonicalize "element extraction" from a load of an
integer and "element insertion" into a store of an integer into actual
element extraction, element insertion, and vector loads and stores.

Previously various parts of LLVM (including instcombine itself) would
introduce integer loads and stores into the code as a way of opaquely
loading and storing "bits". In some cases (such as a memcpy of
std::complex<float> object) we will eventually end up using those bits
in non-integer types. In order for SROA to effectively promote the
allocas involved, it splits these "store a bag of bits" integer loads
and stores up into the constituent parts. However, for non-alloca loads
and tsores which remain, it uses integer math to recombine the values
into a large integer to load or store.

All of this would be "fine", except that it forces LLVM to go through
integer math to combine and split up values. While this makes perfect
sense for integers (and in fact is critical for bitfields to end up
lowering efficiently) it is *terrible* for non-integer types, especially
floating point types. We have a much more canonical way of representing
the act of concatenating the bits of two SSA values in LLVM: a vector
and insertelement. This patch teaching InstCombine to use this
representation.

With this patch applied, LLVM will no longer introduce integer math into
the critical path of every loop over std::complex<float> operations such
as those that make up the hot path of ... oh, most HPC code, Eigen, and
any other heavy linear algebra library.

For the record, I looked *extensively* at fixing this in other parts of
the compiler, but it just doesn't work:
- We really do want to canonicalize memcpy and other bit-motion to
  integer loads and stores. SSA values are tremendously more powerful
  than "copy" intrinsics. Not doing this regresses massive amounts of
  LLVM's scalar optimizer.
- We really do need to split up integer loads and stores of this form in
  SROA or every memcpy of a trivially copyable struct will prevent SSA
  formation of the members of that struct. It essentially turns off
  SROA.
- The closest alternative is to actually split the loads and stores when
  partitioning with SROA, but this has all of the downsides historically
  discussed of splitting up loads and stores -- the wide-store
  information is fundamentally lost. We would also see performance
  regressions for bitfield-heavy code and other places where the
  integers aren't really intended to be split without seemingly
  arbitrary logic to treat integers totally differently.
- We *can* effectively fix this in instcombine, so it isn't that hard of
  a choice to make IMO.

Differential Revision: http://reviews.llvm.org/D6548

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223764 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 08:55:32 +00:00
David Majnemer
fca9c7b21c InstSimplify: Try to bring back the rest of r223583
This reverts r223624 with a small tweak, hopefully this will make stage3
equivalent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223679 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-08 18:30:43 +00:00
Sonam Kumari
a585ec7f7f Removal Of Duplicate Test Case from shift.ll file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223648 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-08 09:40:43 +00:00
NAKAMURA Takumi
e4a5390406 Revert a part of r223583, for now. It seems causing different emission between stage2(gcc-clang) and stage3 clang. Investigating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223624 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-08 02:07:22 +00:00
David Majnemer
620e8763ec InstSimplify: Optimize away useless unsigned comparisons
Code like X < Y && Y == 0 should always be folded away to false.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223583 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 10:51:40 +00:00
Duncan P. N. Exon Smith
30596886ed IR: Disallow complicated function-local metadata
Disallow complex types of function-local metadata.  The only valid
function-local metadata is an `MDNode` whose sole argument is a
non-metadata function-local value.

Part of PR21532.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 01:26:49 +00:00
Hans Wennborg
76b1313e75 Add some tests for SimplifyCFG's TurnSwitchRangeIntoICmp(). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223396 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:19:28 +00:00
Hans Wennborg
8de6d3e510 Add some tests for SimplifyCFG's ConstantFoldTerminator(). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223395 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:19:25 +00:00
Philip Reames
ef7c2ff0f3 Add a test case for argument type coercion in an invoke of a vararg function
This would have caught the bug I fixed in 223370.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 19:13:45 +00:00
Hal Finkel
efbb95a1be Revert "r223364 - Revert r223347 which has caused crashes on bootstrap bots."
Reapply r223347, with a fix to not crash on uninserted instructions (or more
precisely, instructions in uninserted blocks). bugpoint was able to reduce the
test case somewhat, but it is still somewhat large (and relies on setting
things up to be simplified during inlining), so I've not included it here.
Nevertheless, it is clear what is going on and why.

Original commit message:

Restrict somewhat the memory-allocation pointer cmp opt from r223093

Based on review comments from Richard Smith, restrict this optimization from
applying to globals that might resolve lazily to other dynamically-loaded
modules, and also from dynamic allocas (which might be transformed into malloc
calls). In short, take extra care that the compared-to pointer is really
simultaneously live with the memory allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 17:45:19 +00:00
Alexander Potapenko
182d9aaccb Revert r223347 which has caused crashes on bootstrap bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 14:22:27 +00:00
Simon Pilgrim
94590ca4cf [InstCombine] Minor optimization for bswap with binary ops
Added instcombine optimizations for BSWAP with AND/OR/XOR ops:

OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) )
OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) )

Since its just a one liner, I've also added BSWAP to the DAGCombiner equivalent as well:

fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y))

Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone.

Differential Revision: http://reviews.llvm.org/D6407



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:44:01 +00:00
Hal Finkel
d70d5148a6 Restrict somewhat the memory-allocation pointer cmp opt from r223093
Based on review comments from Richard Smith, restrict this optimization from
applying to globals that might resolve lazily to other dynamically-loaded
modules, and also from dynamic allocas (which might be transformed into malloc
calls). In short, take extra care that the compared-to pointer is really
simultaneously live with the memory allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223347 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:22:28 +00:00
Matthias Braun
b0ec6c21b7 [SimplifyLibCalls] Improve double->float shrinking to consider constants
This allows cases like float x; fmin(1.0, x); to be optimized to fminf(1.0f, x);

rdar://19049359

Differential Revision: http://reviews.llvm.org/D6496

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223270 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 21:46:33 +00:00
Matthias Braun
9d362ec2a4 [SimplifyLibCalls] Enable double to float shrinking for copysign
rdar://19049359

Differential Revision: http://reviews.llvm.org/D6495

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223269 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 21:46:29 +00:00
Nick Lewycky
0f87df2033 Fix test to use the right metadata node (reapply r223239 plus a fix) and also to use the correct path to the GCNO file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223244 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 17:32:44 +00:00
Alexander Potapenko
22b6c01fc8 Revert r223239, which broke some bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223240 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 16:03:08 +00:00
Alexander Potapenko
2afd191abd Fix the metadata number used by llvm.gcov to match the number of the inserted metadata node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223239 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 15:15:58 +00:00
Erik Eckstein
10e28ca6b1 InstCombine: simplify signed range checks
Try to convert two compares of a signed range check into a single unsigned compare.
Examples:
(icmp sge x, 0) & (icmp slt x, n) --> icmp ult x, n
(icmp slt x, 0) | (icmp sgt x, n) --> icmp ugt x, n




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223224 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 10:39:15 +00:00
Tom Stellard
857550322c StructurizeCFG: Use LoopInfo analysis for better loop detection
We were assuming that each back-edge in a region represented a unique
loop, which is not always the case.  We need to use LoopInfo to
correctly determine which back-edges are loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 04:28:32 +00:00
Nick Lewycky
92d7d4dcd7 Emit the entry block first and the exit block second, then all the blocks in between afterwards. This is what gcc always does, and some out of tree tools depend on that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 02:45:01 +00:00
Michael Zolotukhin
97be10d98f PR21302. Vectorize only bottom-tested loops.
rdar://problem/18886083

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223171 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 22:59:06 +00:00
Michael Zolotukhin
6845cace0e Apply loop-rotate to several vectorizer tests.
Such loops shouldn't be vectorized due to the loops form.
After applying loop-rotate (+simplifycfg) the tests again start to check
what they are intended to check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223170 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 22:59:02 +00:00
Bruno Cardoso Lopes
495e547ef9 [SwitchLowering] Handle destinations on multiple phi instructions
Follow up from r222926. Also handle multiple destinations from merged
cases on multiple and subsequent phi instructions.

rdar://problem/19106978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 18:31:53 +00:00
Bruno Cardoso Lopes
8fc3ffbb74 [LICM] Avoind store sinking if no preheader is available
Load instructions are inserted into loop preheaders when sinking stores
and later removed if not used by the SSA updater. Avoid sinking if the
loop has no preheader and avoid crashes. This fixes one more side effect
of not handling indirectbr instructions properly on LoopSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 14:22:34 +00:00
Sonam Kumari
4b0964871d [signext.ll] Removal Of Duplicate Test Cases
Removed the duplicate test case existing in signext.ll file.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223109 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 05:29:47 +00:00
Hal Finkel
7e32aa1015 Simplify pointer comparisons involving memory allocation functions
System memory allocation functions, which are identified at the IR level by the
noalias attribute on the return value, must return a pointer into a memory region
disjoint from any other memory accessible to the caller. We can use this
property to simplify pointer comparisons between allocated memory and local
stack addresses and the addresses of global variables. Neither the stack nor
global variables can overlap with the region used by the memory allocator.

Fixes PR21556.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223093 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 23:38:06 +00:00
Reid Kleckner
03c735b42c Parse 'ghccc' in .ll files as the GHC convention (cc 10)
Previously we just used "cc 10" in the .ll files, but that isn't very
human readable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223076 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 21:04:44 +00:00
Hans Wennborg
c9605b6067 Revert r223049, r223050 and r223051 while investigating test failures.
I didn't foresee affecting the Clang test suite :/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223054 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 17:36:43 +00:00
Hans Wennborg
07c6024f48 SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
They would get optimized away later, but we might as well not emit them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223051 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 17:08:38 +00:00
Hans Wennborg
b2fd944378 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations, and
SDag lowering is now prepared to handle them efficiently.

For example, branches to the unreachable destination will be optimized away,
such as in the case of range checks for switch lookup tables.

On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and
Chromium by 30 kB).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223050 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 17:08:35 +00:00
Sonam Kumari
5e6e75a48e Removed extra whitespace. (Testing commit access). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222994 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 09:27:46 +00:00
Rafael Espindola
6c8ce66b03 Relax an assert a bit to avoid a crash on unreachable code.
Patch by Duncan Exon Smith with a small tweak by me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222984 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 02:55:24 +00:00
Duncan P. N. Exon Smith
9416f9c57d DebugIR: Delete -debug-ir
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222945 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-29 03:15:47 +00:00
Duncan P. N. Exon Smith
54786a0936 Revert "Masked Vector Load and Store Intrinsics."
This reverts commit r222632 (and follow-up r222636), which caused a host
of LNT failures on an internal bot.  I'll respond to the commit on the
list with a reproduction of one of the failures.

Conflicts:
	lib/Target/X86/X86TargetTransformInfo.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222936 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-28 21:29:14 +00:00
David Majnemer
a1129621dd InstCombine: FoldOrOfICmps harder
We may be in a situation where the icmps might not be near each other in
a tree of or instructions.  Try to dig out related compare instructions
and see if they combine.

N.B.  This won't fire on deep trees of compares because rewritting the
tree might end up creating a net increase of IR.  We may have to resort
to something more sophisticated if this is a real problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222928 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-28 19:58:29 +00:00
Bruno Cardoso Lopes
69ed1ff9b3 [LICM] Store sink and indirectbr instructions
Loop simplify skips exit-block insertion when exits contain indirectbr
instructions. This leads to an assertion in LICM when trying to sink
stores out of non-dedicated loop exits containing indirectbr
instructions. This patch fix this issue by re-checking for dedicated
exits in LICM prior to store sink attempts.

Differential Revision: http://reviews.llvm.org/D6414

rdar://problem/18943047

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-28 19:47:46 +00:00
Bruno Cardoso Lopes
04122090c2 [SwitchLowering] Handle multiple destinations on condensed case stmts
Switch cases statements with sequential values that branch to the same
destination BB may often be handled together in a single new source BB.
In this scenario we need to remove remaining incoming values from PHI
instructions in the destination BB, as to match the number of source
branches.

Differential Revision: http://reviews.llvm.org/D6415

rdar://problem/19040894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-28 19:47:33 +00:00
Erik Eckstein
6cfbf12d77 reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
Fixed missing dominance check.
Original commit message:

This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
   if (idx < tablesize)
      r = table[idx]; // table does not contain default_value
   else
      r = default_value;
   if (r != default_value)
      ...
Is optimized to:
   cond = idx < tablesize;
   if (cond)
      r = table[idx];
   else
      r = default_value;
   if (cond)
      ...
Jump threading will then eliminate the second if(cond).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 15:13:14 +00:00
Suyog Sarda
578b3ccd8f Use FileCheck instead of grep. Change by Ankur Garg.
Differential Revision: http://reviews.llvm.org/D6430


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222879 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 11:22:49 +00:00
Erik Eckstein
4646580bc8 Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible."
It is breaking the clang bootstrag.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222877 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 10:59:08 +00:00
Suyog Sarda
8fc5d5c0da Use FileCheck instead of grep. Change by Sonam.
Differential Revision: http://reviews.llvm.org/D6432



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222876 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 10:57:24 +00:00
Erik Eckstein
c394c42d6d Peephole optimization in switch table lookup: reuse the guarding table comparison if possible.
This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch.
Example:
    if (idx < tablesize)
       r = table[idx]; // table does not contain default_value
    else
       r = default_value;
    if (r != default_value)
       ...
Is optimized to:
    cond = idx < tablesize;
    if (cond)
       r = table[idx];
    else
       r = default_value;
    if (cond)
       ...
\endcode
Jump threading will then eliminate the second if(cond).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222872 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 08:33:51 +00:00
David Majnemer
dcf39d2586 InstCombine: Restore optimizations lost in r210006
This restores our ability to optimize:
(X & C) == 0 ? X ^ C : X  into  X | C
(X & C) != 0 ? X ^ C : X  into  X & ~C

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 07:25:21 +00:00
David Majnemer
f45536e75e InstSimplify: Restore optimizations lost in r210006
This restores our ability to optimize:
(X & C) ? X & ~C : X  into  X & ~C
(X & C) ? X : X & ~C  into  X
(X & C) ? X | C : X  into  X
(X & C) ? X : X | C  into  X | C

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-27 06:32:46 +00:00
David Majnemer
9e6a1814c9 Revert "Added inst combine transforms for single bit tests from Chris's note"
This reverts commit r210006, it miscompiled libapr which is used in who
knows how many projects.

A test has been added to ensure that we don't regress again.

I'll work on a rewrite of what the optimization was trying to do later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222856 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-26 23:00:38 +00:00
Hans Wennborg
ca5ffd51d9 Remove useless rdar:// comment from switch_to_lookup_table.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222772 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-25 18:45:23 +00:00
Hans Wennborg
4d48c3f1aa LazyValueInfo: Actually re-visit partially solved block-values in solveBlockValue()
If solveBlockValue() needs results from predecessors that are not already
computed, it returns false with the intention of resuming when the dependencies
have been resolved. However, the computation would never be resumed since an
'overdefined' result had been placed in the cache, preventing any further
computation.

The point of placing the 'overdefined' result in the cache seems to have been
to break cycles, but we can check for that when inserting work items in the
BlockValue stack instead. This makes the "stop and resume" mechanism of
solveBlockValue() work as intended, unlocking more analysis.

Using this patch shaves 120 KB off a 64-bit Chromium build on Linux.

I benchmarked compiling bzip2.c at -O2 but couldn't measure any difference in
compile time.

Tests by Jiangning Liu from r215343 / PR21238, Pete Cooper, and me.

Differential Revision: http://reviews.llvm.org/D6397

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222768 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-25 17:23:05 +00:00
Chandler Carruth
333d5c9f51 [InstCombine] Change LLVM To canonicalize toward the value type being
stored rather than the pointer type.

This change is analogous to r220138 which changed the canonicalization
for loads. The rationale is the same: memory does not have a type,
operations (and thus the values they produce) have a type. We should
match that type as closely as possible rather than reading some form of
semantics into the pointer type.

With this change, loads and stores should no longer be made with
nonsensical types for the values that tehy load and store. This is
particularly important when trying to match specific loaded and stored
types in the process of doing other instcombines, which is what led me
down this twisty maze of miscanonicalization.

I've put quite some effort into looking through IR to find places where
LLVM's optimizer was being unreasonably conservative in the face of
mismatched load and store types, however it is possible (let's say,
likely!) I have missed some. If you see regressions here, or from
r220138, the likely cause is some part of LLVM failing to cope with load
and store types differing. Test cases appreciated, it is important that
we root all of these out of LLVM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-25 10:09:51 +00:00
Suyog Sarda
faab0cf1ad Change the test case file to use FileCheck instead of grep. NFC.
Change by Ankur Garg.

Differential Revision: http://reviews.llvm.org/D6382


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222740 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-25 08:44:56 +00:00
Chandler Carruth
a87c35420b Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite
clearly only exactly equal width ptrtoint and inttoptr casts are no-op
casts, it says so right there in the langref. Make the code agree.

Original log from r220277:
Teach the load analysis to allow finding available values which require
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.

To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.

These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.

I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222739 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-25 08:20:27 +00:00
David Majnemer
044b644f54 InstSimplify: Handle some simple tautological comparisons
This handles cases where we are comparing a masked value against itself.
The analysis could be further improved by making it recursive but such
expense is not currently justified.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-25 02:55:48 +00:00
Matt Arsenault
2543acd169 Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222705 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-24 23:15:18 +00:00
Matt Arsenault
3ff3cb7fe3 Convert test to FileCheck and use CHECK-LABEL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-24 23:03:17 +00:00
David Majnemer
a17a9dc8df InstCombine: Don't create an unused instruction
We would create an instruction but not inserting it.
Not inserting the unused instruction would lead us to verification
failure.

This fixes PR21653.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-24 16:41:13 +00:00
David Majnemer
4a9d304d9d InstCombine: Don't assume DataLayout is always available
We tried to get the result of DataLayout::getLargestLegalIntTypeSize but
we didn't have a DataLayout.  This resulted in opt crashing.

This fixes PR21651.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222645 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-24 07:26:20 +00:00
Elena Demikhovsky
ae1ae2c3a1 Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-23 08:07:43 +00:00
David Majnemer
369d8fa34f InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222625 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 20:00:41 +00:00
David Majnemer
89bcfdb956 InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222624 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 20:00:38 +00:00
David Majnemer
91349eecb0 InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222623 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 20:00:34 +00:00
David Majnemer
218fe23f41 InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv x,C1<<C2)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222620 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 18:16:54 +00:00
David Majnemer
8ff39c5c44 InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222613 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 08:57:02 +00:00
David Majnemer
082eff658e InstCombine: Propagate NSW for -X * -Y -> X * Y
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222612 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 07:25:19 +00:00
David Majnemer
7eca618dfc InstSimplify: Simplify (sub 0, X) -> X if it's NUW
This is a generalization of the X - (0 - Y) -> X transform.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 07:15:16 +00:00
David Majnemer
fc1c5babaf InstCombine: Preserve nsw when folding X*(2^C) -> X << C
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222606 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 04:52:55 +00:00
David Majnemer
156d6ec86b InstCombine: Preserve nsw/nuw for ((X << C2)*C1) -> (X * (C1 << C2))
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222605 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 04:52:52 +00:00
David Majnemer
0f8991742c InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222604 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-22 04:52:38 +00:00
Gerolf Hoflehner
5182ad54b2 [InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence)
Fixes the self-host fail. Note that this commit activates dominator
analysis in the combiner by default (like the original commit did).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222590 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 23:36:44 +00:00
David Majnemer
9970214474 SROA: The alloca type isn't a candidate promotion type for vectors
The alloca's type is irrelevant, only those types which are used in a
load or store of the exact size of the slice should be considered.

This manifested as an assertion failure when we compared the various
types: we had a size mismatch.

This fixes PR21480.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222499 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 02:34:55 +00:00
Michael Zolotukhin
4e7b10b07f Fix a trip-count overflow issue in LoopUnroll.
Currently LoopUnroll generates a prologue loop before the main loop
body to execute first N%UnrollFactor iterations. Also, this loop is
used if trip-count can overflow - it's determined by a runtime check.

However, we've been mistakenly optimizing this loop to a linear code for
UnrollFactor = 2, not taking into account that it also serves as a safe
version of the loop if its trip-count overflows.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222451 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-20 20:19:55 +00:00
Chad Rosier
503ec9826c Revert "[Reassociate] As the expression tree is rewritten make sure the operands are"
This reverts commit r222142.  This is causing/exposing an execution-time regression
in spec2006/gcc and coremark on AArch64/A57/Ofast.

Conflicts:

	test/Transforms/Reassociate/optional-flags.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222398 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 23:21:20 +00:00
Suyog Sarda
ca72befdb5 Vectorize a reduction chain feeding into a 'return' statement.
e.x 
return (a[0]+b[0]) + (a[1]+b[1])

Differential Revision: http://reviews.llvm.org/D6227



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 16:07:38 +00:00
Arnaud A. de Grandmaison
beeec3231e Fix tail recursion elimination
When the BasicBlock containing the return instrution has a PHI with 2
incoming values, FoldReturnIntoUncondBranch will remove the no longer
used incoming value and remove the no longer needed phi as well. This
leaves us with a BB that no longer has a PHI, but the subsequent call
to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not
remove the return instruction (which still uses the result of the call
instruction). This prevents EliminateRecursiveTailCall to remove
the value, as it is still being used in a basicblock which has no
predecessors.

The basicblock can not be erased on the spot, because its iterator is
still being used in runTRE.

This issue was exposed when removing the threshold on size for lifetime
marker insertion for named temporaries in clang. The testcase is a much
reduced version of peelOffOuterExpr(const Expr*, const ExplodedNode *)
from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222354 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 13:32:51 +00:00
David Majnemer
f47d325eec AliasSetTracker: UnknownInsts should contribute to the refcount
AliasSetTracker::addUnknown may create an AliasSet devoid of pointers
just to contain an instruction if no suitable AliasSet already exists.
It will then AliasSet::addUnknownInst and we will be done.

However, it's possible for addUnknown to choose an existing AliasSet to
addUnknownInst.
If this were to occur, we are in a bit of a pickle: removing pointers
from the AliasSet can cause the entire AliasSet to become destroyed,
taking our unknown instructions out with them.

Instead, keep track whether or not our AliasSet has any unknown
instructions.

This fixes PR21582.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222338 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 09:41:05 +00:00
Manman Ren
2b82868de5 Revert r222039 because of bot failure.
http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/
Hopefully, bot will be green. If not, we will re-submit the commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222287 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 00:13:26 +00:00
David Majnemer
643bef9333 InstCombine: Fix another infinite loop caused by visitFPTrunc
We would attempt to replace an frem's operand with the same operand.
This would cause InstCombine to think real work was done, causing
InstCombine to enter an infinite loop.

This fixes the second part of PR21576.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222265 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 22:06:45 +00:00
David Majnemer
063e54286c Revert "Revert r222040 because of bot failure."
This reverts commit r222203, reverting r222040 didn't end up turning the
bot green.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222261 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 21:30:02 +00:00
Chad Rosier
5759f0f944 [Reassociate] Use test cases that can actually be optimized to verify optional
flags are cleared.  The reassociation pass was just reordering the leaf nodes
in the previous test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222250 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 20:34:01 +00:00
Philip Reames
0814bd85dd Tweak EarlyCSE to recognize series of dead stores
EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!).

Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D6301



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222241 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 17:46:32 +00:00
David Majnemer
0ede3a2ae5 InstCombine: Fold away tautological masked compares
It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true.

While this sort of reasoning should normally live in InstSimplify,
the machinery that derives this result is not trivial to split out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222230 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 09:31:41 +00:00
David Majnemer
1ade0f0faa IndVarSimplify: Allow LFTR to fire more often
I added a pessimization in r217102 to prevent miscompiles when the
incremented induction variable was used in a comparison; it would be
poison.

Try to use the incremented induction variable more often when we can be
sure that the increment won't end in poison.

Differential Revision: http://reviews.llvm.org/D6222

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222213 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 02:20:58 +00:00
Manman Ren
8ce35351f8 Revert r222040 because of bot failure.
http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/
Hopefully, bot will be green.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222203 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 00:33:22 +00:00
Juergen Ributzka
18db81a0ec [SimplifyCFG] Make the value type of the hole check bitmask a power-of-2.
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.

The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.

To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.

This fixes rdar://problem/18984639.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 19:39:56 +00:00
Chad Rosier
14915fe523 [Reassociate] As the expression tree is rewritten make sure the operands are
emitted in canonical form.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222142 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 16:33:50 +00:00
Chad Rosier
ae3738f4a7 [Reassociate] Canonicalize constants to RHS operand.
Fix a thinko where the RHS was already a constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222139 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 15:52:51 +00:00
Erik Eckstein
72a1394991 Optimize switch lookup tables with linear mapping.
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:

int f1(int x) {
  switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
  }
  return 0;
}

generates:

define i32 @f1(i32 %x) #0 {
entry:
  %0 = icmp ult i32 %x, 4
  br i1 %0, label %switch.lookup, label %return

switch.lookup:
  %switch.offset = add i32 %x, 10
  ret i32 %switch.offset

return:
  ret i32 0
}



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222121 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 09:13:57 +00:00
Rafael Espindola
daa09d03ab Add back r222061 with a fix.
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.

Original message:

Don't make assumptions about the name of private global variables.

Private variables are can be renamed, so it is not reliable to make
decisions on the name.

The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).

This patch changes one case where we were looking at the variable name to use
the section instead.

Test tuning by Michael Gottesman.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222117 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 02:28:27 +00:00
Reid Kleckner
bfabc8f8c5 Revert "Don't make assumptions about the name of private global variables."
This reverts commit r222061.

It's causing linker errors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222077 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-15 02:03:53 +00:00
Rafael Espindola
e2eb8b632d Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.

The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).

This patch changes one case where we were looking at the variable name to use
the section instead.

Test tuning by Michael Gottesman.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 23:17:47 +00:00
David Majnemer
9019a6092d InstCombine: Fix infinite loop caused by visitFPTrunc
We would attempt to replace a fptrunc of an frem with an identical
fptrunc.  This would cause the new fptrunc to be added to the worklist.
Of course, this results in an infinite loop because we will keep
visiting the newly created fptruncs.

This fixes PR21576.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222040 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 21:21:15 +00:00
Chad Rosier
1523db7c64 Reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"

This commit updates the failing test in
Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll

The failing test is sensitive to the order in which we process loads.  This
version turns on the RPO traversal instead of the while DT traversal in GVN.
The new test code is functionally same just the order of loads that are
eliminated is swapped.

This new version also fixes an issue where GVN splits a critical edge and
potentially invalidate the RPO/DT iterator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222039 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 21:09:13 +00:00
Chad Rosier
5c76b3d03e [Reassociate] Canonicalize the operands of all binary operators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222008 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 17:09:19 +00:00
Chad Rosier
1298a29c64 [Reassociate] Canonicalize operands of vector binary operators.
Prior to this commit fmul and fadd binary operators were being canonicalized for
both scalar and vector versions.  We now canonicalize add, mul, and, or, and xor
vector instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222006 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 17:08:15 +00:00
Chad Rosier
7e61b1fb62 [Reassociate] Canonicalize constants to RHS operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222005 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 17:05:59 +00:00
Reid Kleckner
fb59c03af9 Relax the gcov version.ll test to check '.' instead of '\*'
The escaping of the '\*' doesn't work with my combination of testing
tools.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 23:07:55 +00:00
Chad Rosier
7984fde2dc Revert "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE."
This reverts commit r221924.  It appears the commit was a bit premature and is causing
bot failures that need further investigation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221939 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 22:54:59 +00:00
Chad Rosier
a9cc4e7e35 [GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE.
Phabricator Revision: http://reviews.llvm.org/D6103
Patch by "Balaram Makam" <bmakam@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221924 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 21:17:58 +00:00
Sanjoy Das
de87c9165a Teach ScalarEvolution to sharpen range information.
If x is known to have the range [a, b), in a loop predicated by (icmp
ne x, a) its range can be sharpened to [a + 1, b).  Get
ScalarEvolution and hence IndVars to exploit this fact.

This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.

This change was originally landed in r219834 but had a bug and broke
ASan. It was reverted in r219878, and is now being re-landed after
fixing the original bug.

phabricator: http://reviews.llvm.org/D5639
reviewed by: atrick



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221839 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 00:00:58 +00:00
Ahmed Bougacha
b990a25d5a [CodeGenPrepare][AArch64] Fix a TLI legality check on iPTR to use a lowered instead.
Fixes PR21548.  Related to PR20474.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221820 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-12 22:16:55 +00:00