Commit Graph

5773 Commits

Author SHA1 Message Date
Rui Ueyama
2764f3ded3 Revert "r214897 - Remove dead zero store to calloc initialized memory"
It broke msan.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214989 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 19:30:38 +00:00
Philip Reames
b835f3446f Remove dead zero store to calloc initialized memory
Optimize the following IR:

%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4

Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store.  If the store is to an out of bounds address, it is undefined and thus also removable.

Reviewed By: nicholas

Differential Revision: http://reviews.llvm.org/D3942




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:48:20 +00:00
James Molloy
72035e9a8e Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:30:34 +00:00
Manman Ren
3d5463d81f [SimplifyCFG] fix accessing deleted PHINodes in switch-to-table conversion.
When we have a covered lookup table, make sure we don't delete PHINodes that
are cached in PHIs.

rdar://17887153


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214642 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-02 23:41:54 +00:00
Erik Eckstein
8624519c0c fix bug 20513 - Crash in SLP Vectorizer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-02 19:39:42 +00:00
Tyler Nowicki
842a06e8dd Add diagnostics to the vectorizer cost model.
When the cost model determines vectorization is not possible/profitable these remarks print an analysis of that decision.

Note that in selectVectorizationFactor() we can assume that OptForSize and ForceVectorization are mutually exclusive.

Reviewed by Arnold Schwaighofer


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214599 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-02 00:14:03 +00:00
Peter Collingbourne
f425efdbc2 PartiallyInlineLibCalls: Check sqrt result type before transforming it.
Some configure scripts declare this with the wrong prototype, which can lead
to an assertion failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214593 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 23:21:21 +00:00
Erik Eckstein
956268f9dc SLPVectorizer: improved scheduling algorithm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214494 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 09:20:42 +00:00
Suyog Sarda
1952b5a4da This patch implements transform for pattern "(A & ~B) ^ (~A) -> ~(A & B)".
Differential Revision: http://reviews.llvm.org/D4653



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214479 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 05:07:20 +00:00
Suyog Sarda
78061f4db4 This patch implements transform for pattern "(A | B) & ((~A) ^ B) -> (A & B)".
Differential Revision: http://reviews.llvm.org/D4628



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214478 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 04:59:26 +00:00
Suyog Sarda
d05b6c6f2c This patch implements transform for pattern "( A & (~B)) | (A ^ B) -> (A ^ B)"
Differential Revision: http://reviews.llvm.org/D4652



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214477 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 04:50:31 +00:00
Suyog Sarda
87569413b0 This patch implements transform for pattern "(A & B) | ((~A) ^ B) -> (~A ^ B)".
Patch Credit to Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4655



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214476 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 04:41:43 +00:00
Tyler Nowicki
f7be7f15c1 Improve the remark generated for -Rpass-missed.
The current remark is ambiguous and makes it sounds like explicitly specifying vectorization will allow the loop to be vectorized. This is not the case. The improved remark directs the user to -Rpass-analysis=loop-vectorize to determine the cause of the pass-miss.

Reviewed by Arnold Schwaighofer`


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214445 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 21:22:22 +00:00
Tyler Nowicki
88212074a8 Improve the remark generated when a variable that is used outside the loop is not a reduction or induction variable.
Reviewed by Arnold Schwaighofer


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214440 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 21:02:40 +00:00
David Majnemer
a4a812fedd InstCombine: Correctly propagate NSW/NUW for x-(-A) -> x+A
We can only propagate the nsw bits if both subtraction instructions are
marked with the appropriate bit.

N.B.  We only propagate the nsw bit in InstCombine because the nuw case
is already handled in InstSimplify.

This fixes PR20189.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214385 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 04:49:29 +00:00
David Majnemer
ec7ee07036 InstSimplify: Simplify (X - (0 - Y)) if the second sub is NUW
If the NUW bit is set for 0 - Y, we know that all values for Y other
than 0 would produce a poison value.  This allows us to replace (0 - Y)
with 0 in the expression (X - (0 - Y)) which will ultimately leave us
with X.

This partially fixes PR20189.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214384 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 04:49:18 +00:00
Rafael Espindola
d57120551f Use "weak alias" instead of "alias weak"
Before this patch we had

@a = weak global ...
but
@b = alias weak ...

The patch changes aliases to look more like global variables.

Looking at some really old code suggests that the reason was that the old
bison based parser had a reduction for alias linkages and another one for
global variable linkages. Putting the alias first avoided the reduce/reduce
conflict.

The days of the old .ll parser are long gone. The new one parses just "linkage"
and a later check is responsible for deciding if a linkage is valid in a
given context.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214355 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:51:54 +00:00
David Majnemer
5624046453 InstCombine: Simplify (A ^ B) or/and (A ^ B ^ C)
While we can already transform A | (A ^ B) into A | B, things get bad
once we have (A ^ B) | (A ^ B ^ Cst) because reassociation will morph
this into (A ^ B) | ((A ^ Cst) ^ B).  Our existing patterns fail once
this happens.

To fix this, we add a new pattern which looks through the tree of xor
binary operators to see that, in fact, there exists a redundant xor
operation.

What follows bellow is a correctness proof of the transform using CVC3.

$ cat t.cvc
A, B, C : BITVECTOR(64);

QUERY BVXOR(A, B) | BVXOR(BVXOR(B, C), A) = BVXOR(A, B) | C;
QUERY BVXOR(BVXOR(A, C), B) | BVXOR(A, B) = BVXOR(A, B) | C;

QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C;
QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C;

$ cvc3 < t.cvc
Valid.
Valid.
Valid.
Valid.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214342 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 21:26:37 +00:00
Chad Rosier
7f6a685444 SLP Vectorizer: Canonicalize tree operands of commutitive binary operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214338 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 21:07:56 +00:00
Rafael Espindola
7fef5a3d19 SimplifyCFG: Avoid miscompilations due to removed lifetime intrinsics.
The lifetime intrinsics need some work in order to make it clear which
optimizations are or are not valid.

For now dropping this optimization avoids a miscompilation.

Patch by Björn Steinbrink.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214336 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 21:04:00 +00:00
Tim Northover
6f800f7333 CodeGenPrep: fall back to MVT::Other if instruction's type isn't an EVT.
The test being performed is just an approximation anyway, so it really
shouldn't crash when things don't go entirely as expected.

Should fix PR20474.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214177 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-29 10:20:22 +00:00
Hal Finkel
5cb8ab5b5a Canonicalization for @llvm.assume
Adds simple logical canonicalization of assumption intrinsics to instcombine,
currently:
 - invariant(a && b) -> invariant(a); invariant(b)
 - invariant(!(a || b)) -> invariant(!a); invariant(!b)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213977 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 21:45:17 +00:00
Hal Finkel
8ef7b17dfc Add @llvm.assume, lowering, and some basic properties
This is the first commit in a series that add an @llvm.assume intrinsic which
can be used to provide the optimizer with a condition it may assume to be true
(when the control flow would hit the intrinsic call). Some basic properties are added here:

 - llvm.invariant(true) is dead.
 - llvm.invariant(false) is unreachable (this directly corresponds to the
   documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef).

The intrinsic is tagged as writing arbitrarily, in order to maintain control
dependencies. BasicAA has been updated, however, to return NoModRef for any
particular location-based query so that we don't unnecessarily block code
motion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213973 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 21:13:35 +00:00
Hal Finkel
9f0a2a8bd5 Convert noalias parameter attributes into noalias metadata during inlining
This functionality is currently turned off by default.

Part of the motivation for introducing scoped-noalias metadata is to enable the
preservation of noalias parameter attribute information after inlining.
Sometimes this can be inferred from the code in the caller after inlining, but
often we simply lose valuable information.

The overall process if fairly simple:
 1. Create a new unqiue scope domain.
 2. For each (used) noalias parameter, create a new alias scope.
 3. For each pointer, collect the underlying objects. Add a noalias scope for
    each noalias parameter from which we're not derived (and has not been
    captured prior to that point).
 4. Add an alias.scope for each noalias parameter from which we might be
    derived (or has been captured before that point).

Note that the capture checks apply only if one of the underlying objects is not
an identified function-local object.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213949 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 15:50:08 +00:00
Mark Heffernan
d10aa6f8b2 After unrolling a loop with llvm.loop.unroll.count metadata (unroll factor
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with
llvm.loop.unroll.disable metadata to prevent any subsequent unrolling
passes from unrolling more than the hint indicates.  This patch fixes
an issue where loop unrolling could be disabled for other loops as well which
share the same llvm.loop metadata.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213900 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 22:36:40 +00:00
Manman Ren
7963749c69 Try to fix the bots again by moving test to X86 directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 17:57:09 +00:00
Manman Ren
203dc9d460 Try to fix the bots. If this does not work, I am going to move it to X86 directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213880 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 17:18:33 +00:00
Hal Finkel
16fd27b2c3 Add scoped-noalias metadata
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
  1. To preserve noalias function attribute information when inlining
  2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.

Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.

[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213864 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 14:25:39 +00:00
Manman Ren
4cae9cb034 SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to an integer type with larger size.

For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx

It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext

rdar://17735071


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213815 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 23:13:23 +00:00
David Blaikie
ccd1035ad4 ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:09:29 +00:00
David Blaikie
163a3f95f2 Test debug info in arg promotion with an actual promotion case, rather than a degenerate arg promotion that's actually DAE performed by ArgPromo
Also the debug location I had here was bogus, describing the location of
the call site as in the callee - and unnecessary, so just drop it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213803 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 21:30:59 +00:00
Mark Heffernan
d55c7c7f42 Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:05:44 +00:00
Mark Heffernan
e8d7ebcd5a In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213772 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 17:31:37 +00:00
Nick Lewycky
1bda2d9386 We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213726 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 06:24:49 +00:00
Suyog Sarda
c9ea25fc51 This patch implements optimization as mentioned in PR19753: Optimize comparisons with "ashr/lshr exact" of a constanst.
It handles the errors which were seen in PR19958 where wrong code was being emitted due to earlier patch.
Added code for lshr as well as non-exact right shifts.

It implements : 
(icmp eq/ne (ashr/lshr const2, A), const1)" ->
(icmp eq/ne A, Log2(const2/const1)) ->
(icmp eq/ne A, Log2(const2) - Log2(const1))

Differential Revision: http://reviews.llvm.org/D4068
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213678 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 19:19:36 +00:00
Suyog Sarda
3326ee444a Added InstCombine transform for pattern "(A & B) ^ (A ^ B) -> (A | B)"
Patch idea by Ankit Jain !

Differential Revision: http://reviews.llvm.org/D4618



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 18:30:54 +00:00
Suyog Sarda
1a1b1f708d Added InstCombine Transform for patterns:
"((~A & B) | A) -> (A | B)" and "((A & B) | ~A) -> (~A | B)"

Original Patch credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4591



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213676 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 18:09:41 +00:00
Hal Finkel
b3b2aac5be Make use of the align parameter attribute for all pointer arguments
We previously supported the align attribute on all (pointer) parameters, but we
only used it for byval parameters. However, it is completely consistent at the
IR level to treat 'align n' on all pointer parameters as an alignment
assumption on the pointer, and now we wll. Specifically, this causes
computeKnownBits to use the align attribute on all pointer parameters, not just
byval parameters. I've also added an explicit parameter attribute test for this
to test/Bitcode/attributes.ll.

And I've updated the LangRef to document the align parameter attribute (as it
turns out, it was not documented at all previously, although the byval
documentation mentioned that it could be used).

There are (at least) two benefits to doing this:
 - It allows enhancing alignment based on the pointer alignment after inlining callees.
 - It allows simplification of pointer arithmetic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213670 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 16:58:55 +00:00
Suyog Sarda
578c74e35d This patch implements transform for pattern "(A | B) ^ (~A) -> (A | ~B)".
Patch Credit to Ankit Jain !!

Differential Revision: http://reviews.llvm.org/D4588



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213662 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 15:37:39 +00:00
Mark Heffernan
bc7f1aba2d Rename metadata llvm.loop.vectorize.unroll to llvm.loop.vectorize.interleave.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 23:11:03 +00:00
Hal Finkel
160f9b9c10 [LoopVectorize] Use AA to partition potential dependency checks
Prior to this change, the loop vectorizer did not make use of the alias
analysis infrastructure. Instead, it performed memory dependence analysis using
ScalarEvolution-based linear dependence checks within equivalence classes
derived from the results of ValueTracking's GetUnderlyingObjects.

Unfortunately, this meant that:
  1. The loop vectorizer had logic that essentially duplicated that in BasicAA
     for aliasing based on identified objects.
  2. The loop vectorizer could not partition the space of dependency checks
     based on information only easily available from within AA (TBAA metadata is
     currently the prime example).

This means, for example, regardless of whether -fno-strict-aliasing was
provided, the vectorizer would only vectorize this loop with a runtime
memory-overlap check:

void foo(int *a, float *b) {
  for (int i = 0; i < 1600; ++i)
    a[i] = b[i];
}

This is suboptimal because the TBAA metadata already provides the information
necessary to show that this check unnecessary. Of course, the vectorizer has a
limit on the number of such checks it will insert, so in practice, ignoring
TBAA means not vectorizing more-complicated loops that we should.

This change causes the vectorizer to use an AliasSetTracker to keep track of
the pointers in the loop. The resulting alias sets are then used to partition
the space of dependency checks, and potential runtime checks; this results in
more-efficient vectorizations.

When pointer locations are added to the AliasSetTracker, two things are done:
  1. The location size is set to UnknownSize (otherwise you'd not catch
     inter-iteration dependencies)
  2. For instructions in blocks that would need to be predicated, TBAA is
     removed (because the metadata might have a control dependency on the condition
     being speculated).

For non-predicated blocks, you can leave the TBAA metadata. This is safe
because you can't have an iteration dependency on the TBAA metadata (if you
did, and you unrolled sufficiently, you'd end up with the same pointer value
used by two accesses that TBAA says should not alias, and that would yield
undefined behavior).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213486 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-20 23:07:52 +00:00
Hal Finkel
2350e9f6b7 [LoopVectorize] Propagate known metadata to vectorized instructions
There are some kinds of metadata that are safe to propagate from the scalar
instructions to the vector instructions (fpmath and tbaa currently).

Regarding TBAA, one might worry about propagating it on if-converted loads and
stores, because the metadata might have had a control dependency on the
condition, and thus actually aliased with some other non-speculated memory
access when the condition was false. However, this would be caught by the
runtime overlap checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213452 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-19 13:33:16 +00:00
Hal Finkel
7c11695a23 Make Value::isDereferenceablePointer handle offsets to pointer types with dereferenceable attributes
When we have a parameter (or call site return) with a dereferenceable
attribute, it can specify the size of an array pointed to by that parameter. If
we have a value for which we can accumulate a constant offset to such a
parameter, then we can use that offset in a direct comparison with the size
specified by the dereferenceable attribute.

This enables us to handle cases like this:

  int foo(int a[static 3]) {
    return a[2]; /* this is always dereferenceable */
  }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213447 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-19 03:25:16 +00:00
Mark Heffernan
354f2afffd Remove unroll pragma metadata after it is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213412 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 21:04:33 +00:00
Gerolf Hoflehner
d94715e273 MergedLoadStoreMotion pass
Merges equivalent loads on both sides of a hammock/diamond
and hoists into into the header.
Merges equivalent stores on both sides of a hammock/diamond
and sinks it to the footer.
Can enable if conversion and tolerate better load misses
and store operand latencies.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213396 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 19:13:09 +00:00
Hal Finkel
11af4b49b2 Add a dereferenceable attribute
This attribute indicates that the parameter or return pointer is
dereferenceable. Practically speaking, loads from such a pointer within the
associated byte range are safe to speculatively execute. Such pointer
parameters are common in source languages (C++ references, for example).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213385 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 15:51:28 +00:00
Matt Arsenault
a32c319741 R600: Implement TTI:getPopcntSupport
The test is just copied from X86, and I don't know of a better
way to test it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213351 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 06:07:13 +00:00
Suyog Sarda
c84f22aac5 Move ashr optimization from InstCombineShift to InstSimplify.
Refactor code, no functionality change, test case moved from instcombine to instsimplify.

Differential Revision: http://reviews.llvm.org/D4102
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 06:28:15 +00:00
Hal Finkel
8f609696e0 Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.

Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.

Original commit message:

BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.

Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.

This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.

Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213219 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 01:28:25 +00:00
Jingyue Wu
1d56cda023 Partially revert r210444 due to performance regression
Summary:
Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting

... = array[zext(a)]
... = array[zext(a) + 1]

to

... = array[sext(a)]
... = array[zext(a) + 1],

the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.

Also, with r211281 and r211084, we annotate more "nuw" tags to
computation involving CUDA intrinsics such as threadIdx.x. These
annotations help with splitting GEP a lot, rendering the benefit we get
from this reverted optimization only marginal.

Test Plan: make check-all

Reviewers: eliben, meheff

Reviewed By: meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D4542

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213209 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-16 23:25:00 +00:00