Commit Graph

6480 Commits

Author SHA1 Message Date
Ahmed Bougacha
6eb095d7ae [MemCpyOpt] Look at any dependency -not just source- for memset+memcpy.
This fixes another miscompile introduced by r235232: when there was a
dependency on the memcpy destination other than the memset, we would
ignore it, because we only looked at the source dependency.

It was a mistake to use SrcDepInfo.  Instead, just use DepInfo.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237066 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 23:09:46 +00:00
Sanjoy Das
5b5782c20e [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to vector of pointers
Summary:
In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for
each relocated pointer, and the gc_relocate has the same type with the
pointer. During the creation of gc_relocate intrinsic, llvm requires to
mangle its type. However, llvm does not support mangling of all possible
types. RewriteStatepointsForGC will hit an assertion failure when it
tries to create a gc_relocate for pointer to vector of pointers because
mangling for vector of pointers is not supported.

This patch changes the way RewriteStatepointsForGC pass creates
gc_relocate. For each relocated pointer, we erase the type of pointers
and create an unified gc_relocate of type i8 addrspace(1)*. Then a
bitcast is inserted to convert the gc_relocate to the correct type. In
this way, gc_relocate does not need to deal with different types of
pointers and the unsupported type mangling is no longer a problem. This
change would also ease further merge when LLVM erases types of pointers
and introduces an unified pointer type.

Some minor changes are also introduced to gc_relocate related part in
InstCombineCalls, CodeGenPrepare, and Verifier accordingly.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9592

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237009 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 18:49:34 +00:00
Hal Finkel
cdd4737be8 [InstCombine/PowerPC] Fix single-precision QPX load/store replacement
The QPX single-precision load/store intrinsics have implied
truncation/extension from/to the declared value type of <4 x double> to the
memory type of <4 x float>. When we can prove the alignment of the pointer
argument, and thus replace the intrinsic with a regular load or store, we need
to load or store the correct data type (<4 x float>) instead of (<4 x double>).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236973 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 06:37:03 +00:00
David Majnemer
73f2a7bbb2 Make buildbots happy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236970 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 05:33:27 +00:00
David Majnemer
c88eae46da [InstCombine] Canonicalize single element array store
Use the element type instead of the aggregate type.

Differential Revision: http://reviews.llvm.org/D9591

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236969 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 05:04:27 +00:00
David Majnemer
3101b1a432 [InstCombine] Canonicalize single element array load
Use the element type instead of the aggregate type.

Differential Revision: http://reviews.llvm.org/D9596

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236968 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 05:04:22 +00:00
Pat Gavlin
5c7f7462e4 Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware.
This changes the shape of the statepoint intrinsic from:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args)

to:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args)

This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back.

In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation.

Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops.

Differential Revision: http://reviews.llvm.org/D9501

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236888 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 18:07:42 +00:00
Jingyue Wu
c417f302cb [NoTTI] reject negative scale in addressing mode
Summary:
I noticed this bug when deubging a WIP on LSR. I wonder whether and how we
should add a regression test for this.

Test Plan: no tests failed.

Reviewers: atrick

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236887 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 18:07:24 +00:00
Michael Zolotukhin
d7b5144d53 Populate list of vectorizable functions for Accelerate library.
Summary:
This patch adds majority of supported by Accelerate library functions to the
list of vectorizable functions.

The full list of available vector functions could be found here:
https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vecLib/index.html

Test Plan: Unit tests are added.

Reviewers: hfinkel, aschwaighofer, nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236747 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 17:11:51 +00:00
Mehdi Amini
4d5e059cdb Update InstCombine to transform aggregate loads into scalar loads.
Summary:
One step further getting aggregate loads and store being optimized
properly. This will only handle struct with one element at this point.

Test Plan: Added unit tests for the new supported cases.

Reviewers: chandlerc, joker-eph, joker.eph, majnemer

Reviewed By: majnemer

Subscribers: pete, llvm-commits

Differential Revision: http://reviews.llvm.org/D8339

Patch by Amaury Sechet.

From: Amaury Sechet <amaury@fb.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236695 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 05:52:40 +00:00
Philip Reames
16e7e690b7 [JumpThreading] Simplify comparisons when simplifying branches
If we have recognized that a conditional is constant at a particular location in the code (while trying to decide if we can simplify a conditional branch), we can eagerly replace that condition with a constant if it's definition is post dominated by the branch in question.

In practice, this ends up being a compile time savings at most. JumpThreading would have visited each using branch anyways. CVP would have visited the cmp itself again. Unless LVI gives up early, we shouldn't gain any addition power by doing this transformation early. What we do gain is simplicity and compile time.

Differential Revision: http://reviews.llvm.org/D9312



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236684 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 00:19:14 +00:00
Akira Hatanaka
e6f0494cd8 Let llc and opt override "-target-cpu" and "-target-features" via command line
options.

This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't
override function attributes "-target-cpu" and "-target-features" in the IR.

Differential Revision: http://reviews.llvm.org/D9537


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 23:54:14 +00:00
Sanjoy Das
8a86e2564d [IRBuilder] Add a CreateGCStatepointInvoke.
Renames the original CreateGCStatepoint to CreateGCStatepointCall, and
moves invoke creating functionality from PlaceSafepoints.cpp to
IRBuilder.cpp.

This changes the labels generated for PlaceSafepoints/invokes.ll so use
a regex there to make the basic block labels more resilient.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236672 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 23:53:09 +00:00
Diego Novillo
26e46f2283 Allow 0-weight branches in BranchProbabilityInfo.
Summary:
When computing branch weights in BPI, we used to disallow branches with
weight 0. This is a minor nuisance, because a branch with weight 0 is
different to "don't have information". In the context of
instrumentation, it may mean "never executed", in the context of
sampling, it means "never or seldom executed".

In allowing 0 weight branches, I ran into issues with the switch
expansion code in selection DAG. It is currently hardwired to not handle
branches with weight 0. To maintain the current behaviour, I changed it
to use 1 when it finds 0, but perhaps the algorithm needs changes to
tolerate branches with weight zero.

Reviewers: hansw

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9533

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236617 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 17:55:11 +00:00
Wei Mi
cac51be31f [X86] Disable loop unrolling in loop vectorization pass when VF is 1.
The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture,
by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce
the cost of overflow check, memory boundary check and extra prologue/epilogue code when
regular unroller will unroll the loop another time. Disable it when VF==1 remove the
unnecessary cost on x86. The same can be done for other platforms after verifying
interleaving/memory bound checking to be not perf critical on those platforms.

Differential Revision: http://reviews.llvm.org/D9515


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236613 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 17:12:25 +00:00
David Majnemer
cfdd004e7e [Inliner] Discard empty COMDAT groups
COMDAT groups which have become rendered unused because of inline are
discardable if we can prove that we've made the group empty.

This fixes PR22285.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 20:14:22 +00:00
Daniel Berlin
7a5c0e599c Update BasicAliasAnalysis to understand that nothing aliases with undef values.
It got this in some cases (if one of them was an identified object), but not in all cases.

This caused stores to undef to block load-forwarding in some cases, etc.

Added test to Transforms/GVN to verify optimization occurs as expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 18:10:49 +00:00
Matthias Braun
af2e236c11 InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bits
When optimizing demanded bits of the operands of an Add we have to
remove the nsw/nuw flags as we have no guarantee anymore that we don't
wrap.  This is legal here because the top bit is not demanded.  In fact
this operaion was already performed but missed in the case of an Add
with a constant on the right side.  To fix this this patch refactors the
code to unify the code paths in SimplifyDemandedUseBits() handling of
Add/Sub:

- The transformation of Add->Or is removed from the simplify demand
  code because the equivalent transformation exists in
  InstCombiner::visitAdd()
- KnownOnes/KnownZero are not adjusted for Add x, C anymore as
  computeKnownBits() already performs these computations.
- The simplification of the operands is unified. In this new version
  constant on the right side of a Sub are shrunk now as I could not find
  a reason why not to do so.
- The special case for clearing nsw/nuw in ShrinkDemandedConstant() is
  not necessary anymore as the caller does that already.

Differential Revision: http://reviews.llvm.org/D9415

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236269 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 22:05:30 +00:00
Sanjoy Das
a34038226e [InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.
Summary:
Optimizing these well are especially interesting for IRCE since it
"clamps" values by generating this sort of pattern through SCEV
expressions.

Depends on D9352.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9353

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236203 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 04:56:04 +00:00
Sanjoy Das
c0730628a4 [InstCombine] Add a new formula for SMIN.
Summary:
After this change `MatchSelectPattern` recognizes the following form
of SMIN:

  Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C)

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9352

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236202 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 04:56:00 +00:00
Duncan P. N. Exon Smith
e56023a059 IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:38:44 +00:00
Pawel Bylica
59764b94a7 Constfold insertelement to undef when index is out-of-bounds
Summary:
This patch adds constant folding of insertelement instruction to undef value when index operand is constant and is not less than vector size or is undef.

InstCombine does not support this case, but I'm happy to add it there also if this change is accepted.

Test Plan: Unittests and regression tests for ConstProp pass.

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235854 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 09:30:49 +00:00
Philip Reames
a404b6f421 [RewriteStatepointsForGC] Exclude constant values from being considered live at a safepoint
There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code.

To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work.

Differential Revision: http://reviews.llvm.org/D9236



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235821 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-26 19:48:03 +00:00
Philip Reames
83a049f2a6 Don't Place Entry Safepoints Before the llvm.frameescape() Intrinsic
llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block.

Patch by: Swaroop.Sridhar@microsoft.com
Differential Revision: http://reviews.llvm.org/D8910




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235820 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-26 19:41:23 +00:00
Sanjay Patel
1111a216ee [x86] instcombine more cases of insertps into a shufflevector
This is a follow-on to D8833 (insertps optimization when the zero mask is not used).

In this patch, we check for the case where the zmask is used, but both input vectors
to the insertps intrinsic are the same operand or the zmask overrides the destination
lane. This lets us replace the 2nd shuffle input operand with the zero vector.

Differential Revision: http://reviews.llvm.org/D9257



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235810 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-25 20:55:25 +00:00
Hans Wennborg
7a301c1b8c SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes
When using bit tests for hole checks, we call AddPredecessorToBlock to give the
phi node a value from the bit test block. This would break if we've
previously called removePredecessor on the default destination because the
switch is fully covered.

Test case by Mark Lacey.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235771 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 20:57:56 +00:00
David Blaikie
e41f3849bc [opaque pointer type] Add textual IR support for explicit type parameter to the invoke instruction
Same as r235145 for the call instruction - the justification, tradeoffs,
etc are all the same. The conversion script worked the same without any
false negatives (after replacing 'call' with 'invoke').

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235755 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 19:32:54 +00:00
David Blaikie
8b6356c73e Skip extra LLVM IR assemble/disassemble steps in some tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235736 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 18:06:09 +00:00
Jingyue Wu
728ad0157c Resurrect r235688
We should skip vector types which are not SCEVable.

test/CodeGen/NVPTX/sched2.ll passes


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235695 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 04:22:39 +00:00
Jingyue Wu
f42450abb6 Revert r235688
Seems breaking builds


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235690 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 03:26:11 +00:00
Jingyue Wu
d83b3b1a8d [NVPTX] enable NaryReassociate in NVPTX
Summary:
We run NaryReassociate right after SLSR because SLSR enables many
opportunities for NaryReassociate. For example, in nary-slsr.ll

  foo((a + b) + c);
  foo((a + b * 2) + c);
  foo((a + b * 3) + c);   // 2 muls and 6 adds

after SLSR:

  ab = a + b;
  foo(ab + c);
  ab2 = ab + b;
  foo(ab2 + c);
  ab3 = ab2 + b;
  foo(ab3 + c);           // 6 adds

after NaryReassociate:

  abc = (a + b) + c;
  foo(abc);
  ab2c = abc + b;
  foo(ab2c);
  ab3c = ab2c + b;
  foo(ab3c);              // 4 adds

Test Plan: nary-slsr.ll

Reviewers: jholewinski, eliben

Reviewed By: eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9066

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235688 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 02:54:06 +00:00
Jingyue Wu
12f341611a [NVPTX] run SeparateConstOffsetFromGEP before SLSR
Summary:
We pick this order because SeparateConstOffsetFromGEP may create more
opportunities for SLSR.

Test Plan:
reassociate-geps-and-slsr.ll
no performance regression on internal benchmarks

Reviewers: meheff

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:00:04 +00:00
Karthik Bhat
7ab8b5573e Add support to interchange loops with reductions.
This patch enables interchanging of tightly nested loops with reductions.
Differential Revision: http://reviews.llvm.org/D8314


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235571 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 04:51:44 +00:00
David Majnemer
f9c92b069a [InstCombine] Use a more targeted fix instead of r235544
Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while
taking advantage that the sign bit is not set.  We do this optimization
to further shrink the mask but shrinking the mask isn't NSW/NUW
preserving in this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235558 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 22:42:05 +00:00
David Majnemer
3bd87826e5 [InstCombine] Clear out nsw/nuw if we modify computation in the chain
An nsw/nuw operation relies on the values feeding into it to not
overflow if 'poison' is not to be produced.  This means that
optimizations which make modifications to the bottom of a chain (like
SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that
they will be preserved.

This fixes PR23309.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 20:59:28 +00:00
NAKAMURA Takumi
bc4233f437 Remove a zero-length file of llvm/test/Transforms/InstCombine/descale-zero.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235457 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 23:14:33 +00:00
Wei Mi
ef67950b62 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimization,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D8911


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 23:02:15 +00:00
Wei Mi
480fc70c43 Revert r235451 since it is attached to a wrong Differential Revision. Sorry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235453 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 22:56:09 +00:00
Wei Mi
73a5fa9ad6 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimizations,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D9007


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 22:37:09 +00:00
Ahmed Bougacha
0f32a037ef [MemCpyOpt] Use the raw i8* dest when optimizing memset+memcpy.
MemIntrinsic::getDest() looks through pointer casts, and using it
directly when building the new GEP+memset results in stuff like:

  %0 = getelementptr i64* %p, i32 16
  %1 = bitcast i64* %0 to i8*
  call ..memset(i8* %1, ...)

instead of the correct:

  %0 = bitcast i64* %p to i8*
  %1 = getelementptr i8* %0, i32 16
  call ..memset(i8* %1, ...)

Instead, use getRawDest, which just gives you the i8* value.
While there, use the memcpy's dest, as it's live anyway.

In most cases, when the optimization triggers, the memset and memcpy
sizes are the same, so the built memset is 0-sized and eliminated.
The problem occurs when they're different.

Fixes a regression caused by r235232: PR23300.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235419 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 21:28:33 +00:00
Jingyue Wu
423476d899 [SLSR] garbage-collect unused instructions
Summary:
After we rewrite a candidate, the instructions used by the old form may
become unused. This patch cleans up these unused instructions so that we
needn't run DCE after SLSR.

Test Plan: removed -dce in all the SLSR tests

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9101

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235410 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 19:56:18 +00:00
Jingyue Wu
412b6fc7b9 [SeparateConstOffsetFromGEP] garbage-collect intermediate instructions
Summary: so that we needn't run DCE after this pass.

Test Plan: removed -dce from the commandline in split-gep.ll and split-gep-and-gvn.ll

Reviewers: meheff

Subscribers: llvm-commits, HaoLiu, hfinkel, jholewinski

Differential Revision: http://reviews.llvm.org/D9096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235409 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 19:53:18 +00:00
Fiona Glaser
b5750565de InstCombine: fold (sitofp (zext x)) to (uitofp x)
This is okay because the zext guarantees the high bit is zero,
and so the value is unsigned.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235364 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 00:05:41 +00:00
Akira Hatanaka
ca9313e65e [InlineFunction] Don't add lifetime markers for zero-sized allocas.
This commit fixes the code which adds lifetime markers in InlineFunction to skip
zero-sized allocas instead of asserting on them.

rdar://problem/20531155


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235312 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 16:11:05 +00:00
Ahmed Bougacha
3a65b111b5 [MemCpyOpt] Don't force i64 when promoting memset/memcpy sizes.
Harden r235258 to support any integer bitwidth.  The quick glance at
the reference made me think only i32 and i64 were valid types, but
they're not special, so any overload is legal.

Thanks to David Majnemer for noticing!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 23:06:04 +00:00
Ahmed Bougacha
ebb4371478 [MemCpyOpt] Promote both memset/memcpy sizes if differently typed.
Followup to r235232, which caused PR23278.

We can't assume the memset and memcpy sizes have the same type, as
nothing in the language reference prevents that.
Instead, zext both to i64 if they disagree.

While there, robustify tests by using i8 %c rather than i8 0 for the
memset character.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 17:57:41 +00:00
David Majnemer
c5edbea4e7 [InstCombine] (mul nsw 1, INT_MIN) != (shl nsw 1, 31)
Multiplying INT_MIN by 1 doesn't trigger nsw.  However, shifting 1 into
the sign bit *does* trigger nsw.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 04:41:30 +00:00
Ahmed Bougacha
7349765ddb [MemCpyOpt] Optimize double-storing by memset+memcpy.
A common idiom in some code is to do the following:

  memset(dst, 0, dst_size);
  memcpy(dst, src, src_size);

Some of the memset is redundant; instead, we can do:

  memcpy(dst, src, src_size);
  memset(dst + src_size, 0,
         dst_size <= src_size ? 0 : dst_size - src_size);

Original patch by: Joel Jones
Differential Revision: http://reviews.llvm.org/D498


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235232 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 22:20:57 +00:00
Jingyue Wu
f182c81444 [NaryReassociate] run NaryReassociate iteratively
Summary:
An alternative is to use a worklist approach. However, that approach
would break the traversing order so that we couldn't lookup SeenExprs
efficiently. I don't see a clear winner here, so I picked the easier approach.

Along with two minor improvements:
1. preserves ScalarEvolution by forgetting instructions replaced
2. removes dead code locally avoiding the need of running DCE afterwards

Test Plan: add to slsr-add.ll a test that requires multiple iterations

Reviewers: broune, dberlin, atrick, meheff

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9058

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235151 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 00:25:10 +00:00
David Blaikie
32b845d223 [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction
See r230786 and r230794 for similar changes to gep and load
respectively.

Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.

When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.

This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.

This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).

No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.

This leaves /only/ the varargs case where the explicit type is required.

Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.

About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235145 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 23:24:18 +00:00