Commit Graph

6667 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith
bf2040f00c DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.

Most of the testcase updates were generated by the following sed script:

    find test/ -name "*.ll" -o -name "*.mir" |
    xargs grep -l 'DILocalVariable' |
    xargs sed -i '' \
      -e 's/tag: DW_TAG_arg_variable, //' \
      -e 's/tag: DW_TAG_auto_variable, //'

There were only a handful of tests in `test/Assembly` that I needed to
update by hand.

(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`.  I've added a FIXME to that effect.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243774 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-31 18:58:39 +00:00
Wei Mi
4db6728b3a [SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction.
The patch changes the SLPVectorizer::vectorizeStores to choose the immediate
succeeding or preceding candidate for a store instruction when it has multiple
consecutive candidates. In this way it has better chance to find more slp
vectorization opportunities.

Differential Revision: http://reviews.llvm.org/D10445


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243666 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-30 17:40:39 +00:00
Michael Zolotukhin
607fe5bb28 [Unroll] Handle SwitchInst properly.
Previously successor selection was simply wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243545 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-29 18:10:33 +00:00
Michael Zolotukhin
815580fe7a [Unroll] Don't crash when simplified branch condition is undef.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-29 18:10:29 +00:00
Michael Zolotukhin
952e40ff00 Rename test full-unroll-bad-geps.ll to full-unroll-crashers.ll.
No reason to limit it only to GEP-related crashes. More tests are to
come here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-29 18:10:23 +00:00
Sanjoy Das
44d65eac43 [Statepoints] Let patchable statepoints have a symbolic call target.
Summary:
As added initially, statepoints required their call targets to be a
constant pointer null if ``numPatchBytes`` was non-zero.  This turns out
to be a problem ergonomically, since there is no way to mark patchable
statepoints as calling a (readable) symbolic value.

This change remove the restriction of requiring ``null`` call targets
for patchable statepoints, and changes PlaceSafepoints to maintain the
symbolic call target through its transformation.

Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243502 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 23:50:30 +00:00
Jingyue Wu
7d4d116067 [SCEV] Apply NSW and NUW flags via poison value analysis
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.

This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

  for (int i = 0; i < limit; ++i) {
    sum += ptr[i + offset];
  }

Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.


There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

Patch by Bjarke Roune

Reviewers: eliben, atrick, sanjoy

Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits

Differential Revision: http://reviews.llvm.org/D11212

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 18:22:40 +00:00
Chih-Hung Hsieh
dc73dc09f1 Implement target independent TLS compatible with glibc's emutls.c.
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.

clang and driver changes in http://reviews.llvm.org/D10524

  Added -femulated-tls flag to select the emulated TLS model,
  which will be used for old targets like Android that do not
  support ELF TLS models.

Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.

Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.

TODO: Add proper DIE for emulated TLS variables.
      Added new unit tests with emulated TLS.

Differential Revision: http://reviews.llvm.org/D10522


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243438 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 16:24:05 +00:00
Sanjoy Das
d9408bca26 FileCheck'ify some wc/grep based tests; NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 03:50:09 +00:00
Sanjoy Das
f7681b3e3a [LSR] Move X86 specific test case to X86/
rL243348 added the test case in the wrong directory.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 00:13:42 +00:00
Sanjoy Das
477137f4d7 [LSR] Generate and use zero extends
Summary:
If a scale or a base register can be rewritten as "Zext({A,+,1})" then
LSR will now consider a formula of that form in its normal cost
computation.

Depends on D9180

Reviewers: qcolombet, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9181

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243348 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 23:27:51 +00:00
Sanjoy Das
39d6da003d [IndVars] Make loop varying predicates loop invariant.
Summary:
Was D9784: "Remove loop variant range check when induction variable is
strictly increasing"

This change re-implements D9784 with the two differences:

 1. It does not use SCEVExpander and does not generate new
    instructions.  Instead, it does a quick local search for existing
    `llvm::Value`s that it needs when modifying the `icmp`
    instruction.

 2. It is more general -- it deals with both increasing and decreasing
    induction variables.

I've added all of the tests included with D9784, and two more.

As an example on what this change does (copied from D9784):

Given C code:

```
for (int i = M; i < N; i++) // i is known not to overflow
  if (i < 0) break;
  a[i] = 0;
}
```

This transformation produces:

```
for (int i = M; i < N; i++)
  if (M < 0) break;
  a[i] = 0;
}
```

Which can be unswitched into:

```
if (!(M < 0))
  for (int i = M; i < N; i++)
    a[i] = 0;
}
```

I went back and forth on whether the top level logic should live in
`SimplifyIndvar::eliminateIVComparison` or be put into its own
routine.  Right now I've put it under `eliminateIVComparison` because
even though the `icmp` is not *eliminated*, it no longer is an IV
comparison.  I'm open to putting it in its own helper routine if you
think that is better.

Reviewers: reames, nicholas, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11278

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243331 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 21:42:49 +00:00
Simon Pilgrim
8e2e335276 [InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR
Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code.

Differential Revision: http://reviews.llvm.org/D11503

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243303 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 18:52:15 +00:00
Matt Arsenault
fd8928ded9 Fix assert when inlining a constantexpr addrspacecast
The pointer size of the addrspacecasted pointer might not have matched,
so this would have hit an assert in accumulateConstantOffset.

I think this was here to allow constant folding of a load of an
addrspacecasted constant. Accumulating the offset through the
addrspacecast doesn't make much sense, so something else is necessary
to allow folding the load through this cast.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 18:31:03 +00:00
Silviu Baranga
aee16c42dc The tests added in r243270 require asserts to be enabled
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243274 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 15:22:49 +00:00
Silviu Baranga
cff701eeb9 Fix the tests added in r243270. Use 2>&1 instead of |&
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243273 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 15:08:55 +00:00
Silviu Baranga
541d079947 [ARM/AArch64] Fix cost model for interleaved accesses
Summary:
Fix the cost of interleaved accesses for ARM/AArch64.
We were calling getTypeAllocSize and using it to check
the number of bits, when we should have called
getTypeAllocSizeInBits instead.

This would pottentially cause the vectorizer to
generate loads/stores and shuffles which cannot
be matched with an interleaved access instruction.

No performance changes are expected for now since
matching/generating interleaved accesses is still
disabled by default.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-27 14:39:34 +00:00
Jingyue Wu
580991b5c9 Roll forward r243250
r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build
slaves, but I couldn't reproduce this failure locally. Probably a false
positive as I saw this test was broken by r243246 or r243247 too but passed
later without people fixing anything.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243253 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-26 19:10:03 +00:00
Jingyue Wu
e48b1257f1 Revert r243250
breaks tests


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243251 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-26 18:30:13 +00:00
Jingyue Wu
9f141640b5 [TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost
Summary:
This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider
addressing modes. It now returns TCC_Free when the GEP can be completely folded
to an addresing mode.

I started this patch as I refactored SLSR. Function isGEPFoldable looks common
and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost.

Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best
testing bed seems CostModel, but its getInstructionCost method invokes
getAddressComputationCost for GEPs which provides very coarse estimation. So
this patch also makes getInstructionCost call the updated getGEPCost for GEPs.
This change inevitably breaks some tests because the cost model changes, but
nothing looks seriously wrong -- if we believe the new cost model is the right
way to go, these tests should be updated.

This patch is not perfect yet -- the comments in some tests need to be updated.
I want to know whether this is a right approach before fixing those details.

Reviewers: chandlerc, hfinkel

Subscribers: aschwaighofer, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D9819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-26 17:28:13 +00:00
Simon Pilgrim
9e297691a4 [InstCombine] Split off SSE4a tests.
These aren't vector demanded bits tests. More tests to follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-25 17:14:01 +00:00
Chen Li
7b0238cdc1 [LoopUnswitch] Improve loop unswitch pass to find trivial unswitch conditions more effectively
Summary:
This patch improves trivial loop unswitch. 

The current trivial loop unswitch only checks if loop header's terminator contains a trivial unswitch condition. But if the loop header only has one reachable successor (due to intentionally or unintentionally missed code simplification), we should consider the successor as part of the loop header. Therefore, instead of stopping at loop header's terminator, we should keep traversing its successors within loop until reach a *real* conditional branch or switch (whose condition can not be constant folded). This change will enable a single -loop-unswitch pass to unswitch multiple trivial conditions (unswitch one trivial condition could open opportunity to unswitch another one in the same loop), while the old implementation can unswitch only one per pass. 

Reviewers: reames, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11481

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243203 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-25 03:21:06 +00:00
Duncan P. N. Exon Smith
cbfbb3ee4c DI/Verifier: Fix argument bitrot in DILocalVariable
Add a verifier check that `DILocalVariable`s of tag
`DW_TAG_arg_variable` always have a non-zero 'arg:' field, and those of
tag `DW_TAG_auto_variable` always have a zero 'arg:' field.  These are
the only configurations that are properly understood by the backend.

(Also, fix the bad examples in LangRef and test/Assembler, and fix the
bug in Kaleidoscope Ch8.)

A large number of testcases seem to have bitrotted their way forward
from some ancient version of the debug info hierarchy that didn't have
`arg:` parameters.  If you have out-of-tree testcases that start failing
in the verifier and you don't care enough to get the `arg:` right, you
may have some luck just calling:

    sed -e 's/, arg: 0/, arg: 1/'

or some such, but I hand-updated the ones in tree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-24 23:59:25 +00:00
Lawrence Hu
5136ca2c6d Handle loop with negtive induction variable increment
This patch extend LoopReroll pass to hand the loops which
is similar to the following:

      while (len > 1) {
            sum4 += buf[len];
            sum4 += buf[len-1];
            len -= 2;
        }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243171 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-24 22:01:49 +00:00
Philip Reames
ec2871f730 [RewriteStatepointsForGC] Adjust naming scheme to be more stable
The names for instructions inserted were previous dependent on iteration order.  By deriving the names from the original instructions, we can avoid instability in tests without resorting to ordered traversals.  It also makes the IR mildly easier to read at large scale.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-24 19:01:39 +00:00
Michael Zolotukhin
fc561accb7 Handle resolvable branches in complete loop unroll heuristic.
Summary:
Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate.
This patch is intended to be applied after D10205 (though it could be applied independently).

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10206

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243084 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-24 01:53:04 +00:00
Matt Wala
106d75d396 [Scalarizer] Fix potential for stale data in Scattered across invocations
Summary:
Scalarizer has two data structures that hold information about changes
to the function, Gathered and Scattered. These are cleared in finish()
at the end of runOnFunction() if finish() detects any changes to the
function.

However, finish() was checking for changes by only checking if
Gathered was non-empty. The function visitStore() only modifies
Scattered without touching Gathered. As a result, Scattered could have
ended up having stale data if Scalarizer only scalarized store
instructions. Since the data in Scattered is used during the execution
of the pass, this introduced dangling pointer errors.

The fix is to check whether both Scattered and Gathered are empty
before deciding what to do in finish(). This also fixes a problem
where the Function can be modified although the pass returns false.

Reviewers: rnk

Subscribers: rnk, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D10459

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243040 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-23 20:53:46 +00:00
David Majnemer
1b072f2beb [ConstantFolding] Support folding loads from a GlobalAlias
The MSVC ABI requires that we generate an alias for the vtable which
means looking through a GlobalAlias which cannot be overridden improves
our ability to devirtualize.

Found while investigating PR20801.

Patch by Andrew Zhogin!

Differential Revision: http://reviews.llvm.org/D11306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242955 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-22 22:29:30 +00:00
Michael Kuperstein
6418278e7d Fix mem2reg to correctly handle allocas only used in a single block
Currently, a load from an alloca that is used in as single block and is not preceded
by a store is replaced by undef. This is not always correct if the single block is
inside a loop.
Fix the logic so that:
1) If there are no stores in the block, replace the load with an undef, as before.
2) If there is a store (regardless of where it is in the block w.r.t the load), bail
out, and let the rest of mem2reg handle this alloca.

Patch by: gil.rapaport@intel.com
Differential Revision: http://reviews.llvm.org/D11355

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242884 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-22 10:29:29 +00:00
Chen Li
5015f9766b [LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop()
Summary: The current code in LoopUnswtich::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is trivial or non-trivial unswitch condition. However, trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separate trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check trivial condition any more since they are not possible to be in the other blocks. This code has no functionality change.

Reviewers: meheff, reames, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11276

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242873 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-22 05:26:29 +00:00
Chandler Carruth
2cf40d1ae5 [SROA] Fix a nasty pile of bugs to do with big-endian, different alloca
types and loads, loads or stores widened past the size of an alloca,
etc.

This started off with a bug report about big-endian behavior with
bitfields and loads and stores to a { i32, i24 } struct. An initial
attempt to fix this was sent for review in D10357, but that didn't
really get to the root of the problem.

The core issue was that canConvertValue and convertValue in SROA were
handling different bitwidth integers by doing a zext of the integer. It
wouldn't do a trunc though, only a zext! This would in turn lead SROA to
form an i24 load from an i24 alloca, zext it to i32, and then use it.
This would at least produce the wrong value for big-endian systems.

One of my many false starts here was to correct the computation for
big-endian systems by shifting. But this doesn't actually work because
the original code has a 64-bit store to the entire 8 bytes, and a 32-bit
load of the last 4 bytes, and because the alloc size is 8 bytes, we
can't lose that last (least significant if bigendian) byte! The real
problem here is that we're forming an i24 load in SROA which is actually
not sufficiently wide to load all of the necessary bits here. The source
has an i32 load, and SROA needs to form that as well.

The straightforward way to do this is to disable the zext logic in
canConvertValue and convertValue, forcing us to actually load all
32-bits. This seems like a really good change, but it in turn breaks
several other parts of SROA.

First in the chain of knock-on failures, we had places where we were
doing integer-widening promotion even though some of the integer loads
or stores extended *past the end* of the alloca's memory! There was even
a comment about preventing this, but it only prevented the case where
the type had a different bit size from its store size. So I added checks
to handle the cases where we actually have a widened load or store and
to avoid trying to special integer widening promotion in those cases.

Second, we actually rely on the ability to promote in the face of loads
past the end of an alloca! This is important so that we can (for
example) speculate loads around PHI nodes to do more promotion. The bits
loaded are garbage, but as long as they aren't used and the alignment is
suitable high (which it wasn't in the test case!) this is "fine". And we
can't stop promoting here, lots of things stop working well if we do. So
we need to add specific logic to handle the extension (and truncation)
case, but *only* where that extension or truncation are over bytes that
*are outside the alloca's allocated storage* and thus totally bogus to
load or store.

And of course, once we add back this correct handling of extension or
truncation, we need to correctly handle bigendian systems to avoid
re-introducing the exact bug that started us off on this chain of misery
in the first place, but this time even more subtle as it only happens
along speculated loads atop a PHI node.

I've ported an existing test for PHI speculation to the big-endian test
file and checked that we get that part correct, and I've added several
more interesting big-endian test cases that should help check that we're
getting this correct.

Fun times.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242869 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-22 03:32:42 +00:00
Arnold Schwaighofer
a5aae48d17 MergeFunc: Transfer the callee's attributes when replacing a direct caller
We insert a bitcast which obfuscates the getCalledFunction for the utility
function which looks up attributes from the called function. Loosing ABI
changing parameter attributes is a bad thing.

rdar://21516488

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242807 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-21 17:07:07 +00:00
Karthik Bhat
c721349466 Constfold trunc,rint,nearbyint,ceil and floor using APFloat
A patch by Chakshu Grover!
This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class.
Differential Revision: http://reviews.llvm.org/D11144


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242763 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-21 08:52:23 +00:00
Arnold Schwaighofer
e88e75e38d Revert "MergeFuncs: Transfer the function parameter attributes to the call site"
It is okay to not transfer parameter attributes.

This reverts commit r242558.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242646 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-19 19:30:43 +00:00
Arnold Schwaighofer
beda80e3bb MergeFuncs: Transfer the function parameter attributes to the call site
rdar://21516488

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242558 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-17 18:59:08 +00:00
Jingyue Wu
1241b83d61 [NVPTX] enable SpeculativeExecution in NVPTX
Summary:
SpeculativeExecution enables a series straight line optimizations (such
as SLSR and NaryReassociate) on conditional code. For example,

  if (...)
    ... b * s ...
  if (...)
    ... (b + 1) * s ...

speculative execution can hoist b * s and (b + 1) * s from then-blocks,
so that we have

  ... b * s ...
  if (...)
    ...
  ... (b + 1) * s ...
  if (...)
    ...

Then, SLSR can rewrite (b + 1) * s to (b * s + s) because after
speculative execution b * s dominates (b + 1) * s.

The performance impact of this change is significant. It speeds up the
benchmarks running EigenFloatContractionKernelInternal16x16
(ba68f42fa6/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h (cl-526))
by roughly 2%. Some internal benchmarks that have the above code pattern
are improved by up to 40%. No significant slowdowns are observed on
Eigen CUDA microbenchmarks.

Reviewers: jholewinski, broune, eliben

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242437 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 20:13:48 +00:00
Peter Collingbourne
4a6d99b0a9 Internalize: internalize comdat members as a group, and drop comdat on such members.
Internalizing an individual comdat group member without also internalizing
the other members of the comdat can break comdat semantics. For example,
if a module contains a reference to an internalized comdat member, and the
linker chooses a comdat group from a different object file, this will break
the reference to the internalized member.

This change causes the internalizer to only internalize comdat members if all
other members of the comdat are not externally visible. Once a comdat group
has been fully internalized, there is no need to apply comdat rules to its
members; later optimization passes (e.g. globaldce) can legally drop individual
members of the comdat. So we drop the comdat attribute from all comdat members.

Differential Revision: http://reviews.llvm.org/D10679

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242423 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 17:42:21 +00:00
JF Bastien
063eb4e389 Fix mergefunc infinite loop
Self-referential constants containing references to a merged function
no longer cause the MergeFunctions pass to infinite loop. Also adds a
reproduction IR which would otherwise fail, which was isolated from a similar
issue in Chromium.

Author: jrkoenig
Reviewers: nlewycky, jfb
Subscribers: llvm-commits, nlewycky, jfb

Differential Revision: http://reviews.llvm.org/D11208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242337 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-15 21:51:33 +00:00
Michael Zolotukhin
72d14a0792 Tidy-up test case from r242257.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242268 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-15 01:51:51 +00:00
Michael Zolotukhin
67ee52cf04 [LoopUnrolling] Handle cast instructions.
During estimation of unrolling effect we should be able to propagate
constants through casts.

Differential Revision: http://reviews.llvm.org/D10207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242257 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-15 00:19:51 +00:00
David Majnemer
137ad1ded9 [InstCombine] Generalize sub of selects optimization to all BinaryOperators
This exposes further optimization opportunities if the selects are
correlated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242235 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 22:39:23 +00:00
Tim Northover
0e34491fef GVN: tolerate an instruction being replaced without existing in the leaderboard
Sometimes an incidentally created instruction can duplicate a Value used
elsewhere. It then often doesn't end up in the leader table. If it's later
removed, we attempt to remove it from the leader table and segfault.

Instead we should just ignore the removal request, which won't cause any
problems. The reverse situation, where the original instruction is replaced by
the new one (which you might think could leave the leader table empty) cannot
occur, because the incidental instruction will never be found in the first
place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242199 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 21:03:18 +00:00
David Majnemer
2a27389edc [SROA] Don't de-atomic volatile loads and stores
Volatile loads and stores are made visible in global state regardless of
what memory is involved.  It is not correct to disregard the ordering
and synchronization scope because it is possible to synchronize with
memory operations performed by hardware.

This partially addresses PR23737.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 06:19:58 +00:00
Reid Kleckner
b53f724f91 Update enforceKnownAlignment after the isWeakForLinker semantic change
Previously we would refrain from attempting to increase the linkage of
available_externally globals because they were considered weak for the
linker. Now they are treated more like a declaration instead of a weak
definition.

This was causing SSE alignment faults in Chromuim, when some code
assumed it could increase the alignment of a dllimported global that it
didn't control.  http://crbug.com/509256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242091 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 00:11:08 +00:00
Pete Cooper
cc383db66a Remove unnecessary lines from the test in r242068.
This test case was breaking the hexagon elf bot.  The failing lines
were actually unnecessary as checking that the store still reads the
correct value demonstrates that everything is working fine now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242073 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 21:50:35 +00:00
Pete Cooper
71a4b301fd Loop idiom recognizer was replacing too many uses of popcount.
When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.

The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed

 %tobool.5 = icmp ne i32 %num, 0
 store i1 %tobool.5, i1* %ptr
 br i1 %tobool.5, label %for.body.lr.ph, label %for.end

to

 store i1 %1, i1* %ptr
 %0 = call i32 @llvm.ctpop.i32(i32 %num)
 %1 = icmp ne i32 %0, 0
 br i1 %1, label %for.body.lr.ph, label %for.end

Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect.  The store doesn’t want the result of ctpop.

The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242068 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 21:25:33 +00:00
Mark Heffernan
8a9e01d606 Enable runtime unrolling with unroll pragma metadata
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
loop has a runtime loop count. Previously, such loops would be unrolled with a
very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be
enabled resulting in a very large (and likely unwise) unroll factor.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242047 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 18:26:27 +00:00
Rafael Espindola
94162a044b Don't change the visibility when converting a definition to a declaration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242030 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 14:18:22 +00:00
Jingyue Wu
8e7e3650af [LSR] don't attempt to promote ephemeral values to indvars
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11115

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242011 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 03:28:53 +00:00
David Majnemer
46b13dd880 [InstSimplify] Teach InstSimplify how to simplify extractelement
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242008 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 01:15:53 +00:00