Commit Graph

2083 Commits

Author SHA1 Message Date
Philip Reames
76acab3fc7 Require a GC strategy be specified for functions which use gc.statepoint
This was discussed a while back and I left it optional for migration.  Since it's been far more than the 'week or two' that was discussed, time to actually make this manditory.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 05:09:33 +00:00
Benjamin Kramer
8f820e9495 InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0
Anding and comparing with zero can be done in a single instruction on
most archs so this is a bit cheaper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233291 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 17:12:06 +00:00
Sanjay Patel
14c1d068a3 optimize the AVX2 (integer) version of vperm2 into a shuffle
...because this is what happens when an instruction
set puts its underwear on after its pants.

This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 22:39:29 +00:00
David Blaikie
ef9962d9bb Revert "Remove an InstCombine that seems to have become redundant."
Assertion fires in compiler-rt. Guess it does fire..

This reverts commit r233116.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233121 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 21:50:35 +00:00
David Blaikie
80da8623a4 Remove an InstCombine that seems to have become redundant.
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233116 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 21:31:31 +00:00
Sanjay Patel
5e0ce9d13a [X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles
This is the IR optimizer follow-on patch for D8563: the x86 backend patch
that converts this kind of shuffle back into a vperm2.

This is also a continuation of the transform that started in D8486. 
In that patch, Andrea suggested that we could convert vperm2 intrinsics that
use zero masks into a single shuffle. 

This is an implementation of that suggestion.

Differential Revision: http://reviews.llvm.org/D8567



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233110 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 20:36:42 +00:00
Benjamin Kramer
42a84b54bf [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232903 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 22:04:26 +00:00
Benjamin Kramer
fd48a80e14 [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.

int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

	cmpl	$64, %edi ## range check
	sbbb	%al, %al
	movabsq	$0xE000200000001, %rcx
	btq	%rdi, %rcx ## bit test
	sbbb	%cl, %cl
	andb	%al, %cl ## and the two conditions
	andb	$1, %cl
	movzbl	%cl, %eax ## returning an int
	ret

(imho the backend should expand this into a series of branches, but
that's a different story)

The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232902 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 21:09:33 +00:00
Benjamin Kramer
4b74df7229 SimplifyLibCalls: Add basic optimization of memchr calls.
This is just memchr(x, y, 0) -> nullptr and constant folding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232896 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 15:36:21 +00:00
Sanjay Patel
be9ee96926 [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles
vperm2* intrinsics are just shuffles. 
In a few special cases, they're not even shuffles.

Optimizing intrinsics in InstCombine is better than
handling this in the front-end for at least two reasons:

1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders
   really angry (and so I have regrets about some patches from last week).

2. Doing mask conversion logic in header files is hard to write and 
   subsequently read.

There are a couple of TODOs in this patch to complete this optimization.

Differential Revision: http://reviews.llvm.org/D8486



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232852 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 21:47:56 +00:00
Daniel Jasper
3baea2951d [InstCombine] Don't fold a GEP into itself through a PHI node
This can only occur (I think) through the back-edge of the loop.

However, folding a GEP into itself means that the value of the previous
iteration needs to be stored in the meantime, thus requiring an
additional register variable to be live, but not actually achieving
anything (the gep still needs to be executed once per loop iteration).

The attached test case is derived from:
  typedef unsigned uint32;
  typedef unsigned char uint8;
  inline uint8 *f(uint32 value, uint8 *target) {
    while (value >= 0x80) {
      value >>= 7;
      ++target;
    }
    ++target;
    return target;
  }
  uint8 *g(uint32 b, uint8 *target) {
    target = f(b, f(42, target));
    return target;
  }

What happens is that the GEP stored in incptr2 is folded into itself
through the loop's back-edge and the phi-node stored in loopptr,
effectively incrementing the ptr by "2" in each iteration instead of "1".

In this case, it is actually increasing the number of GEPs required as
the GEP before the loop can't be folded away anymore. For comparison:

With this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
    %shr = lshr i32 %newval, 7
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %0, i64 2
    ret i8* %incptr3
  }

Without this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %incptr = getelementptr inbounds i8, i8* %buffer, i64 1
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %shr = lshr i32 %newval, 7
    %incptr2 = getelementptr inbounds i8, i8* %0, i64 2
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
    ret i8* %incptr3
  }

Review: http://reviews.llvm.org/D8245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232718 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-19 11:05:08 +00:00
Duncan P. N. Exon Smith
08e687e684 Verifier: Check debug info intrinsic arguments
Verify that debug info intrinsic arguments are valid.  (These checks
will not recurse through the full debug info graph, so they don't need
to be cordoned of in `DebugInfoVerifier`.)

With those checks in place, changing the `DbgIntrinsicInst` accessors to
downcast to `MDLocalVariable` and `MDExpression` is natural (added isa
specializations in `Metadata.h` to support this).

Added tests to `test/Verifier` for the new -verify checks, and fixed the
debug info in all the in-tree tests.

If you have out-of-tree testcases that have started to fail to -verify,
hopefully the verify checks are helpful.  The most likely problem is
that the expression argument is `!{}` (instead of `!MDExpression()`).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232296 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-15 01:21:30 +00:00
Mehdi Amini
3af5418aa4 Update InstCombine to transform aggregate stores into scalar stores.
Summary: This is a first step toward getting proper support for aggregate loads and stores.

Test Plan: Added unittests

Reviewers: reames, chandlerc

Reviewed By: chandlerc

Subscribers: majnemer, joker.eph, chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D7780

Patch by Amaury Sechet

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232284 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 22:19:33 +00:00
Duncan P. N. Exon Smith
d747f3f3fe instcombine: alloca: Canonicalize scalar allocation array size
As a follow-up to r232200, add an `-instcombine` to canonicalize scalar
allocations to `i32 1`.  Since r232200, `iX 1` (for X != 32) are only
created by RAUWs, so this shouldn't fire too often.  Nevertheless, it's
a cheap check and a nice cleanup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232202 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 19:42:09 +00:00
Duncan P. N. Exon Smith
95ff656ae0 AsmWriter: Write alloca array size explicitly (and -instcombine fixup)
Write the `alloca` array size explicitly when it's non-canonical.
Previously, if the array size was `iX 1` (where X is not 32), the type
would mutate to `i32` when round-tripping through assembly.

The testcase I added fails in `verify-uselistorder` (as well as
`FileCheck`), since the use-lists for `i32 1` and `i64 1` change.
(Manman Ren came across this when running `verify-uselistorder` on some
non-trivial, optimized code as part of PR5680.)

The type mutation started with r104911, which allowed array sizes to be
something other than an `i32`.  Starting with r204945, we
"canonicalized" to `i64` on 64-bit platforms -- and then on every
round-trip through assembly, mutated back to `i32`.

I bundled a fixup for `-instcombine` to avoid r204945 on scalar
allocations.  (There wasn't a clean way to sequence this into two
commits, since the assembly change on its own caused testcase churn, and
the `-instcombine` change can't be tested without the assembly changes.)

An obvious alternative fix -- change `AllocaInst::AllocaInst()`,
`AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for
scalar allocations -- was rejected out of hand, since this required
teaching them each about the data layout.

A follow-up commit will add an `-instcombine` to canonicalize the scalar
allocation array size to `i32 1` rather than leaving `iX 1` alone.

rdar://problem/20075773

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232200 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 19:30:44 +00:00
David Blaikie
5a70dd1d82 [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232184 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 18:20:45 +00:00
David Majnemer
9c2d178707 InstCombine: Don't fold call bitcast into args if callee is byval
This fixes a bug reported here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150309/265341.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231948 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 18:03:05 +00:00
Sanjay Patel
c2e4231627 Inliner should not add callgraph edges for intrinsic calls (PR22857)
The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics.

This patch makes sure that the Inliner does not add such an edge to the callgraph.

Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857

Differential Revision: http://reviews.llvm.org/D8231



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231927 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 15:12:32 +00:00
Philip Reames
fb4ffccacb If a conditional branch jumps to the same target, remove the condition
Given that large parts of inst combine is restricted to instructions which have one use, getting rid of a use on the condition can help the effectiveness of the optimizer. Also, it allows the condition to potentially be deleted by instcombine rather than waiting for another pass.

I noticed this completely by accident in another test case. It's not anything that actually came from a real workload.

p.s. We should probably do the same thing for switch instructions.

Differential Revision: http://reviews.llvm.org/D8220



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231881 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 22:52:37 +00:00
Philip Reames
7c7b72a066 Infer known bits from dominating conditions
This patch adds limited support in ValueTracking for inferring known bits of a value from conditional expressions which must be true to reach the instruction we're trying to optimize. At this time, the feature is off by default. Once landed, I'm hoping for feedback from others on both profitability and compile time impact.

Forms of conditional value propagation have been tried in LLVM before and have failed due to compile time problems.  In an attempt to side step that, this patch only considers conditions where the edge leaving the branch dominates the context instruction. It does not attempt full dataflow.  Even with that restriction, it handles many interesting cases:
 * Early exits from functions
 * Early exits from loops (for context instructions in the loop and after the check)
 * Conditions which control entry into loops, including multi-version loops (such as those produced during vectorization, IRCE, loop unswitch, etc..)

Possible applications include optimizing using information provided by constructs such as: preconditions, assumptions, null checks, & range checks.

This patch implements two approaches to the problem that need further benchmarking.  Approach 1 is to directly walk the dominator tree looking for interesting conditions.  Approach 2 is to inspect other uses of the value being queried for interesting comparisons.  From initial benchmarking, it appears that Approach 2 is faster than Approach 1, but this needs to be further validated.  

Differential Revision: http://reviews.llvm.org/D7708



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231879 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 22:43:20 +00:00
Owen Anderson
3a3665fd38 Fix a crash in InstCombine where we could try to truncate a switch comparison to zero width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231761 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 06:51:39 +00:00
Owen Anderson
645fd68c5c Fix an infinite loop in InstCombine when an instruction with no users and side effects can be constant folded.
ReplaceInstUsesWith needs to return nullptr when the input has no users,
because in that case it does not mutate the program.  Otherwise, we can
get stuck in an infinite loop of repeatedly attempting to constant fold
and instruction with no users.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231755 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 05:13:47 +00:00
Mehdi Amini
4c8f5afd99 InstCombine: fix fold "fcmp x, undef" to account for NaN
Summary:
See the two test cases.

; Can fold fcmp with undef on one side by choosing NaN for the undef

; Can fold fcmp with undef on both side
;   fcmp u_pred undef, undef -> true
;   fcmp o_pred undef, undef -> false
; because whatever you choose for the first undef
; you can choose NaN for the other undef

Reviewers: hfinkel, chandlerc, majnemer

Reviewed By: majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D7617

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231626 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 03:20:25 +00:00
Owen Anderson
c03496d4d0 Teach DataLayout to infer a plausible alignment for things even when nothing is specified by the user.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231613 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-08 21:53:59 +00:00
Nadav Rotem
368d2e9976 Teach ComputeNumSignBits about signed reminder.
This optimization a continuation of r231140 that reasoned about signed div.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231433 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 00:23:58 +00:00
Michael Kuperstein
2d8a36ee71 [InstCombine] Fix an assertion when fmul has a ConstantExpr operand
isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions.

Patch by Pawel Jurek <pawel.jurek@intel.com>
Differential Revision: http://reviews.llvm.org/D8053

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231359 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 08:38:57 +00:00
Mehdi Amini
c94da20917 Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 18:43:29 +00:00
David Majnemer
8db493c4e1 InstCombine: Ensure select condition types are identical before merging
Selection conditions may be vectors or scalars.  Make sure InstCombine
doesn't indiscriminately assume that a select which is value dependent
on another select have identical select condition types.

This fixes PR22773.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231156 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 22:40:36 +00:00
Nadav Rotem
10faa1b211 Teach ComputeNumSignBits about signed divisions.
http://reviews.llvm.org/D8028
rdar://20023136



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 21:39:02 +00:00
Duncan P. N. Exon Smith
b056aa798d DebugInfo: Move new hierarchy into place
Move the specialized metadata nodes for the new debug info hierarchy
into place, finishing off PR22464.  I've done bootstraps (and all that)
and I'm confident this commit is NFC as far as DWARF output is
concerned.  Let me know if I'm wrong :).

The code changes are fairly mechanical:

  - Bumped the "Debug Info Version".
  - `DIBuilder` now creates the appropriate subclass of `MDNode`.
  - Subclasses of DIDescriptor now expect to hold their "MD"
    counterparts (e.g., `DIBasicType` expects `MDBasicType`).
  - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp`
    for printing comments.
  - Big update to LangRef to describe the nodes in the new hierarchy.
    Feel free to make it better.

Testcase changes are enormous.  There's an accompanying clang commit on
its way.

If you have out-of-tree debug info testcases, I just broke your build.

  - `upgrade-specialized-nodes.sh` is attached to PR22564.  I used it to
    update all the IR testcases.
  - Unfortunately I failed to find way to script the updates to CHECK
    lines, so I updated all of these by hand.  This was fairly painful,
    since the old CHECKs are difficult to reason about.  That's one of
    the benefits of the new hierarchy.

This work isn't quite finished, BTW.  The `DIDescriptor` subclasses are
almost empty wrappers, but not quite: they still have loose casting
checks (see the `RETURN_FROM_RAW()` macro).  Once they're completely
gutted, I'll rename the "MD" classes to "DI" and kill the wrappers.  I
also expect to make a few schema changes now that it's easier to reason
about everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231082 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 17:24:31 +00:00
David Blaikie
7c9c6ed761 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230794 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 21:17:42 +00:00
David Blaikie
198d8baafb [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 19:29:02 +00:00
Hal Finkel
532af6859f [InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores
InstCombine has long had logic to convert aligned Altivec load/store intrinsics
into regular loads and stores. This mirrors that functionality for QPX vector
load/store intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:56:03 +00:00
Hal Finkel
9f9a6fd453 [InstCombine] Add a test for altivec load/store intrinsic simplification
InstCombine has logic to convert aligned Altivec load/store intrinsics into
regular loads and stores. Unfortunately, there seems to be no regression test
covering this behavior. Adding one...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 14:22:41 +00:00
JF Bastien
6fec24744f InstCombine: extract instead of shuffle when performing vector/array type punning
Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is.

Test Plan: make check

Reviewers: jvoung, chandlerc

Subscribers: llvm-commits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230560 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 22:30:51 +00:00
Charles Davis
fba7e30f0f [IC] Turn non-null MD on pointer loads to range MD on integer loads.
Summary:
This change fixes the FIXME that you recently added when you committed
(a modified version of) my patch.  When `InstCombine` combines a load and
store of an pointer to those of an equivalently-sized integer, it currently
drops any `!nonnull` metadata that might be present.  This change replaces
`!nonnull` metadata with `!range !{ 1, -1 }` metadata instead.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7621

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230462 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 05:10:25 +00:00
Sanjoy Das
f922d9cfe4 New instcombine rule: max(~a,~b) -> ~min(a, b)
This case is interesting because ScalarEvolutionExpander lowers min(a,
b) as ~max(~a,~b).  I think the profitability heuristics can be made
more clever/aggressive, but this is a start.

Differential Revision: http://reviews.llvm.org/D7821



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230285 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 00:08:41 +00:00
Hal Finkel
5ecf528fc2 [InstCombine] Remove unnecessary variable indexing into single-element arrays
This change addresses a deficiency pointed out in PR22629. To copy from the bug
report:

[from the bug report]

Consider this code:

int f(int x) {
  int a[] = {12};
  return a[x];
}

GCC knows to optimize this to

movl     $12, %eax
ret

The code generated by recent Clang at -O3 is:

movslq   %edi, %rax
movl     .L_ZZ1fiE1a(,%rax,4), %eax
retq

.L_ZZ1fiE1a:
  .long    12                      # 0xc

[end from the bug report]

This definitely seems worth fixing. I've also seen this kind of code before (as
the base case of generic vector wrapper templates with one element).

The general idea is to look at the GEP feeding a load or a store, which has
some variable as its first non-zero index, and determine if that index must be
zero (or else an out-of-bounds access would occur). We can do this for allocas
and globals with constant initializers where we know the maximum size of the
underlying object. When we find such a GEP, we create a new one for the memory
access with that first variable index replaced with a constant zero.

Even if we can't eliminate the memory access (and sometimes we can't), it is
still useful because it removes unnecessary indexing calculations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229959 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 03:05:53 +00:00
Akira Hatanaka
5506e22865 [InstCombine] Do not insert a GEP instruction before a landingpad instruction.
InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the
insertion point, which caused the verifier to complain when a GEP was inserted
before a landingpad instruction. This commit fixes it to use getFirstInsertionPt
instead.

rdar://problem/19394964


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:30:11 +00:00
Mehdi Amini
e97c675022 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.

From: Fiona Glaser <fglaser@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229437 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 21:47:54 +00:00
Mehdi Amini
be55a79941 Tests: reformat sitofp.ll and use FileCheck
From: Fiona Glaser <fglaser@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229436 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 21:47:50 +00:00
Ramkumar Ramachandra
0608cec657 InstCombine: propagate deref via new addDereferenceableAttr
The "dereferenceable" attribute cannot be added via .addAttribute(),
since it also expects a size in bytes. AttrBuilder#addAttribute or
AttributeSet#addAttribute is wrapped by classes Function, InvokeInst,
and CallInst. Add corresponding wrappers to
AttrBuilder#addDereferenceableAttr.

Having done this, propagate the dereferenceable attribute via
gc.relocate, adding a test to exercise it. Note that -datalayout is
required during execution over and above -instcombine, because
InstCombine only optionally requires DataLayoutPass.

Differential Revision: http://reviews.llvm.org/D7510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229265 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 19:37:54 +00:00
Philip Reames
d777c2c0c0 [InstCombine] When canonicalizing gep indices, prefer zext when possible
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead.  This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?).  We already apply a similar transform in DAGCombine, this just extends that to the IR level.

This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. 

Differential Revision: http://reviews.llvm.org/D7255



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229189 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 00:05:36 +00:00
Andrea Di Biagio
d25126faae [InstCombine] Fix regression introduced at r227197.
This patch fixes a problem I accidentally introduced in an instruction combine
on select instructions added at r227197. That revision taught the instruction
combiner how to fold a cttz/ctlz followed by a icmp plus select into a single
cttz/ctlz with flag 'is_zero_undef' cleared.

However, the new rule added at r227197 would have produced wrong results in the
case where a cttz/ctlz with flag 'is_zero_undef' cleared was follwed by a
zero-extend or truncate. In that case, the folded instruction would have
been inserted in a wrong location thus leaving the CFG in an inconsistent
state.

This patch fixes the problem and add two reproducible test cases to
existing test 'InstCombine/select-cmp-cttz-ctlz.ll'.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229124 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 16:33:34 +00:00
Michael Liao
4235574ce3 [InstCombine] Fix a bug when combining icmp from ptrtoint
- First, there's a crash when we try to combine that pointers into `icmp`
  directly by creating a `bitcast`, which is invalid if that two pointers are
  from different address spaces.

- It's not always appropriate to cast one pointer to another if they are from
  different address spaces as that is not no-op cast. Instead, we only combine
  `icmp` from `ptrtoint` if that two pointers are of the same address space.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229063 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:51:26 +00:00
Chandler Carruth
a768c4a096 [IC] Fix a bug with the instcombine canonicalizing of loads and
propagating of metadata.

We were propagating !nonnull metadata even when the newly formed load is
no longer of a pointer type. This is clearly broken and results in LLVM
failing the verifier and aborting. This patch just restricts the
propagation of !nonnull metadata to when we actually have a pointer
type.

This bug report and the initial version of this patch was provided by
Charles Davis! Many thanks for finding this!

We still need to add logic to round-trip the metadata correctly if we
combine from pointer types to integer types and then back by using range
metadata for the integer type loads. But this is the minimal and safe
version of the patch, which is important so we can backport it into 3.6.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229029 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 02:30:01 +00:00
Benjamin Kramer
d038a7fe67 InstCombine: Allow folding of xor into icmp by changing the predicate for vectors
The loop vectorizer can create this pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228954 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 20:26:46 +00:00
Chandler Carruth
3e77df419d Revert r228556: InstCombine: propagate nonNull through assume
This commit isn't using the correct context, and is transfoming calls
that are operands to loads rather than calls that are operands to an
icmp feeding into an assume. I've replied on the original review thread
with a very reduced test case and some thoughts on how to rework this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 08:07:32 +00:00
Ramkumar Ramachandra
5439a80dca InstCombine: propagate nonNull through assume
Make assume (load (call|invoke) != null) set nonNull return attribute
for the call and invoke. Also include tests.

Differential Revision: http://reviews.llvm.org/D7107

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228556 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 01:13:13 +00:00
Matthias Braun
2f2dec87fb InstCombine: Combine select sequences into a single select
Normalize
select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b)
select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b)

This normal form may enable further combines on the And/Or and shortens
paths for the values. Many targets prefer the other but can go back
easily in CodeGen.

Differential Revision: http://reviews.llvm.org/D7399

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228409 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-06 17:49:36 +00:00