Commit Graph

17472 Commits

Author SHA1 Message Date
Benjamin Kramer
badffcf8fd LoopIdiom: Add checks to avoid turning memmove into an infinite loop.
I don't think this is possible with the current implementation but that may change eventually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166877 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 15:18:28 +00:00
Benjamin Kramer
d11c5d08a5 LoopIdiom: Recognize memmove loops.
This turns loops like
  for (unsigned i = 0; i != n; ++i)
    p[i] = p[i+1];
into memmove, which has a highly optimized implementation in most libcs.

This was really easy with the new DependenceAnalysis :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166875 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 14:25:51 +00:00
Benjamin Kramer
96c8735e28 LoopIdiom: Replace custom dependence analysis with DependenceAnalysis.
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481.

Compile time performance seems to be slightly worse, but this is mostly due
to an extra LCSSA run scheduled by the PassManager and should be fixed there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166874 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 14:25:44 +00:00
Nadav Rotem
f065a84677 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result.
2. Change the maximum vectorization width from 4 to 8.
3. A test for both.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166864 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 04:11:32 +00:00
Quentin Colombet
80acd97266 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 01:10:17 +00:00
Reed Kotler
7797e8f901 Implement MipsHi for mips16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166852 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:57:14 +00:00
Akira Hatanaka
21a9a98b77 [mips] Do not tail-call optimize vararg functions or functions with byval
arguments.

This is rather conservative and should be fixed later to be more aggressive.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166851 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:56:56 +00:00
Akira Hatanaka
4618e0b574 [mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the
previous iteration.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166850 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:44:39 +00:00
Nadav Rotem
a5a3a61c5f Refactor the VectorTargetTransformInfo interface.
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166836 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 23:49:28 +00:00
Jakob Stoklund Olesen
17f42e02a1 Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers."
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 23:39:46 +00:00
Reed Kotler
25424154f4 implement mips16 tls global addr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166827 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 22:57:32 +00:00
Jakob Stoklund Olesen
cd275f5687 Add GPRPair Register class to ARM.
Some instructions in ARM require 2 even-odd paired GPRs. This
patch adds support for such register class.

Patch by Weiming Zhao!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166816 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 21:29:15 +00:00
Benjamin Kramer
b8b3f6081f Remove LoopDependenceAnalysis.
It was unmaintained and not much more than a stub. The new DependenceAnalysis
pass is both more general and complete.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166810 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 20:25:01 +00:00
Hal Finkel
ecc69a1d99 Move target-specific BBVectorize tests into a separate directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 19:38:09 +00:00
Nadav Rotem
f563801144 Move the target-specific tests, which require specific backends, to dirs that only run if the target is present.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166796 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 18:52:01 +00:00
Rafael Espindola
e5551ed9ce Change the internalize pass to internalize all symbols when given an empty
list of externals. This makes sense since a shared library with no symbols
can still be useful if it has static constructors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166795 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 18:47:48 +00:00
Benjamin Kramer
b2b2273ef4 Fix SCEV cache invalidation in LCSSA and LoopSimplify.
The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable
to analyzable but the LCSSA bug is very nasty. It only comes into play with a
specific order of the LoopPassManager worklist and can cause actual
miscompilations, when a SCEV refers to a value that has been replaced with PHI
node. SCEVExpander may then insert code into the wrong place, either violating
domination or randomly miscompiling stuff.

Comes with an extensive test case reduced from the test-suite with
bugpoint+SCEVValidator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166787 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 17:31:43 +00:00
Nadav Rotem
12145f0339 Fix a crash in SimpliftDemandedBits of vectors of pointers.
PR14183.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166785 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 17:17:05 +00:00
Reed Kotler
a81be80b0e Implement carry for subtract/add for mips16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 04:46:26 +00:00
Reed Kotler
28a6214b59 implement large (>16 bit) constant loading.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 03:09:34 +00:00
Rafael Espindola
d2930d364a Fix unexpected passes. These test do work with LTO on linux. I tested both
a cmake and an autoconf build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 02:19:02 +00:00
Reed Kotler
30675b12fd fix test setgek.ll so that it will not give false "make check"
failure in some cases


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166747 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 01:29:42 +00:00
Rafael Espindola
cadc38a4b1 Port testcase to FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166742 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 00:14:11 +00:00
Hal Finkel
822ab00847 Disable generation of pointer vectors by BBVectorize.
Once vector-of-pointer support works, then this can be reverted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166741 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 00:05:26 +00:00
Nadav Rotem
0636291137 Revert 166726 because it may have broken a number of SPEC tests. PR14183.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 23:51:48 +00:00
Nadav Rotem
4f3c676102 Fix a crash in ValueTracking. Add support for vectors of pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166726 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:52:52 +00:00
Nadav Rotem
b6cbb6215c Fix the cost-model test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166722 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:42:50 +00:00
Reed Kotler
4c5a6dab17 implement mips16 patterns for select nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:33:30 +00:00
Hal Finkel
32ff623c8b Add CPU model to BBVectorize cost-model tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166720 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:31:51 +00:00
Nadav Rotem
402a4d616f Add the cpu model to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166718 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:18:42 +00:00
Hal Finkel
65309660fa Begin incorporating target information into BBVectorize.
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

 1. Target information is used to determine if it is profitable to fuse two
    instructions. This means that the cost of the vector operation must not
    be more expensive than the cost of the two original operations. Pairs that
    are not profitable are no longer considered (because current cost information
    is incomplete, for intrinsics for example, equal-cost pairs are still
    considered).

 2. The 'cost savings' computed for the profitability check are also used to
    rank the DAGs that represent the potential vectorization plans. Specifically,
    for nodes of non-trivial depth, the cost savings is used as the node
    weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166716 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:12:23 +00:00
Jakob Stoklund Olesen
e5a7a68dfa Also optimize large switch statements.
The isValueEqualityComparison() guard at the top of SimplifySwitch()
only applies to some of the possible transformations.

The newer transformations work just fine on large switches, and the
check on predecessor count is nonsensical.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166710 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 18:51:15 +00:00
Michael Liao
8d7cd1d8fc Add test for ATOM ISA SSSE3
- Remove SSE4.1 feature in other ATOM-based test cases



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 17:50:05 +00:00
Bill Schmidt
37900c5dcb This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166680 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 13:38:09 +00:00
Adhemerval Zanella
aa71428378 Initial TOC support for PowerPC64 object creation
This patch adds initial PPC64 TOC MC object creation using the small mcmodel
(a single 64K TOC) adding the some TOC relocations (R_PPC64_TOC,
R_PPC64_TOC16, and R_PPC64_TOC16DS).

The addition of 'undefinedExplicitRelSym' hook on 'MCELFObjectTargetWriter'
is meant to avoid the creation of an unreferenced ".TOC." symbol (used in
the .odp creation) as well to set the R_PPC64_TOC relocation target as the
temporary ".TOC." symbol. On PPC64 ABI, the R_PPC64_TOC relocation should
not point to any symbol.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166677 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 12:27:42 +00:00
Elena Demikhovsky
c0cd72204d The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, may be it will fix the problem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166669 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 08:38:42 +00:00
Chandler Carruth
a2b88163af Teach SROA how to split whole-alloca integer loads and stores into
smaller integer loads and stores.

The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.

Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists soley
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine however, so
this shouldn't actually result in missed opportunities.

If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.

Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166662 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 04:37:07 +00:00
Nadav Rotem
8dbac7b529 Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166649 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 00:08:41 +00:00
Nadav Rotem
2652c50f74 Implement a basic cost model for vector and scalar instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166642 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 23:47:38 +00:00
Chad Rosier
d95666c226 Tell llvm-mc we're using intel syntax, so we don't have to use directives.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166640 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 23:34:38 +00:00
Chad Rosier
b3009eec47 [ms-inline asm] Add back-end test case for r166632. Make sure we emit the
correct .s output as well as get the correct encoding by the integrated
assembler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 23:10:28 +00:00
Hal Finkel
aacb68806f Update GVN to support vectors of pointers.
GVN will now generate ptrtoint instructions for vectors of pointers.
Fixes PR14166.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166624 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 21:22:30 +00:00
Nadav Rotem
50bec6f8c4 LoopVectorizer: Add a basic cost model which uses the VTTI interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166620 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 20:36:32 +00:00
Evan Cheng
d258eb3ec5 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 19:53:01 +00:00
Hal Finkel
8c65549318 getSmallConstantTripMultiple should never return zero.
When the trip count is -1, getSmallConstantTripMultiple could return zero,
and this would cause runtime loop unrolling to assert. Instead of returning
zero, one is now returned (consistent with the existing overflow cases).
Fixes PR14167.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166612 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 19:46:44 +00:00
Micah Villmow
aa76e9e2cf Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166578 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 15:52:52 +00:00
Elena Demikhovsky
3575222175 Special calling conventions for Intel OpenCL built-in library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166566 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 14:46:16 +00:00
Duncan Sands
747fcd58bc Add a testcase that would have noticed the typo fixed in commit 166475.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 07:17:20 +00:00
Michael Liao
1a5cc710ee Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166546 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:14:18 +00:00
Michael Liao
991b6a22b6 Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166545 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:09:32 +00:00
Akira Hatanaka
2ef5bd3ba6 [mips] Make sure sret argument is returned in register V0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166539 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 02:10:54 +00:00
Rafael Espindola
847a9c6d77 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166536 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 01:58:48 +00:00
Michael Liao
0787274b70 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 21:40:15 +00:00
Nadav Rotem
6457001f31 Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news.
PR14158.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166491 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 18:44:18 +00:00
Bill Wendling
b846719663 Ignore unreachable blocks when doing memory dependence analysis on non-local
loads. It's not really profitable and may result in GVN going into an infinite
loop when it hits constructs like this:

     %x = gep %some.type %x, ...

Found via an LTO build of LLVM.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166490 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 18:37:11 +00:00
Michael Liao
d9d09600ee Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:34:00 +00:00
Duncan Sands
bbc7016c60 Transform code like this
%V = mul i64 %N, 4
 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
 %t1 = getelementptr i32* %arr, i32 %N
 %t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
  int foo(int *A, int N) {
    return A[N];
  }
because gcc turns this into byte pointer arithmetic before it hits the plugin:
  D.1590_2 = (long unsigned int) N_1(D);
  D.1591_3 = D.1590_2 * 4;
  D.1592_5 = A_4(D) + D.1591_3;
  D.1589_6 = *D.1592_5;
  return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.

An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166474 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 08:28:26 +00:00
Reed Kotler
293e5d0e0a implement setXX patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 01:35:48 +00:00
Bill Wendling
8f47fc8f00 When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 23:30:04 +00:00
Kevin Enderby
3ed0316f75 Add support for annotated disassembly output for X86 and arm.
Per the October 12, 2012 Proposal for annotated disassembly output sent out by
Jim Grosbach this set of changes implements this for X86 and arm.  The llvm-mc
tool now has a -mdis option to produced the marked up disassembly and a couple
of small example test cases have been added.

rdar://11764962


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166445 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 22:31:46 +00:00
Nadav Rotem
782090aa02 Don't crash if the load/store pointer is not a GEP.
Fix by Shivarama Rao <Shivarama.Rao@amd.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166427 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 18:27:56 +00:00
Nadav Rotem
81750822f4 Add a testcase for the previous commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 18:16:55 +00:00
Argyrios Kyrtzidis
0b06e2331e Revert r166407 because it caused analyzer tests to crash and broke self-host bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166424 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 18:16:14 +00:00
Hal Finkel
e29c19091c BBVectorize should ignore unreachable blocks.
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166423 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 18:00:55 +00:00
Nadav Rotem
565048e78a Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 04:38:00 +00:00
Nick Lewycky
18b1f4e769 Reapply r166405, teaching tailcallelim to be smarter about nocapture, with a
very small but very important bugfix:
  bool shouldExplore(Use *U) {
    Value *V = U->get();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
    [...]
should have read:
  bool shouldExplore(Use *U) {
    Value *V = U->getUser();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
Fixes PR14143!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166407 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 03:03:52 +00:00
NAKAMURA Takumi
d581b9e61f Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when deciding whether"
It broke selfhosting stage2 in several builders.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166406 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 00:48:51 +00:00
Nick Lewycky
241d1398e0 Teach TailRecursionElimination to consider 'nocapture' when deciding whether
calls can be marked tail.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166405 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 23:51:22 +00:00
Hal Finkel
3d39fb8a3f DataLayout should use itself when calculating the size of a vector.
This is important for vectors of pointers because only DataLayout,
not the underlying vector type, knows how to calculate the size
of the pointers in the vector. Fixes PR14138.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166401 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 20:38:03 +00:00
Benjamin Kramer
3740e798bc Revert r166390 "LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis."
It passes all tests, produces better results than the old code but uses the
wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it
still in tree?", you might ask. The answer is obviously: "To confuse developers."

Just swapping in the new dependency pass sends the pass manager into an infinte
loop, I'll try to figure out why tomorrow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166399 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 19:31:16 +00:00
Benjamin Kramer
5c6e9ae14e LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis.
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481. I'm not entirely
sure that all cases are handled that the old checks handled but LDA will
certainly become smarter in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166390 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 15:03:07 +00:00
Nadav Rotem
bb950854ac Fix a bug in the vectorization of wide load/store operations.
We used a SCEV to detect that A[X] is consecutive. We assumed that X was
the induction variable. But X can be any expression that uses the induction
for example: X = i + 2;



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166388 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 06:49:10 +00:00
Nadav Rotem
c847872629 Add support for reduction variables that do not start at zero.
This is important for nested-loop reductions such as :

In the innermost loop, the induction variable does not start with zero:

for (i = 0 .. n)
 for (j = 0 .. m)
  sum += ...



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166387 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 05:52:51 +00:00
Nadav Rotem
5a418ba5f5 Vectorizer: fix a bug in the classification of induction/reduction phis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166384 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-21 02:38:01 +00:00
Nadav Rotem
ccaccfa8bf Fix an infinite loop in the loop-vectorizer.
PR14134.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166379 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-20 20:45:01 +00:00
Benjamin Kramer
82a1833865 InstCombine: Fix an edge case where constant icmps could sneak into ConstantFoldInstOperands and crash.
Have to refactor the ConstantFolder interface one day to define bugs like this away. Fixes PR14131.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166374 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-20 08:43:52 +00:00
Nadav Rotem
bf8772ed2c Vectorize: teach cavVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x.
If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then
it is unsafe to vectorize it because we may hit an ordering issue.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166371 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-20 08:26:33 +00:00
Nadav Rotem
5dbe64e2bc Vectorizer: Add support for loop reductions.
For example:

  for (i=0; i<n; i++)
   sum += A[i] +  B[i] + i;



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166351 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 23:05:40 +00:00
Akira Hatanaka
30580cea43 [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166344 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 22:11:40 +00:00
Akira Hatanaka
2b861be96e [mips] Add code to do tail call optimization.
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166342 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 21:47:33 +00:00
Benjamin Kramer
0aae4bd0fc SimplifyLibcalls: The return value of ffsll is always i32, even when the input is zero.
Fixes PR13028.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166313 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 20:43:44 +00:00
Daniel Dunbar
9f4acd0aab tests: Stop mangling '-vg' into the triple, we don't use this currently.
- Also, lit is going to get a valgrind feature, instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 20:11:56 +00:00
Shuxin Yang
970755e519 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 20:11:16 +00:00
Benjamin Kramer
7182126b0f Indvars: Don't recursively delete instruction during BB iteration.
This can invalidate the iterators leading to use after frees and crashes.
Fixes PR12536.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166291 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 17:53:54 +00:00
Michael Liao
facace808c Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 17:15:18 +00:00
Benjamin Kramer
239fd44f7a SCEVExpander: Don't crash when trying to merge two constant phis.
Just constant fold them so they can't cause any trouble. Fixes PR12627.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166286 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 16:37:30 +00:00
Stepan Dyatkovskiy
0d3c8d5d16 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 08:23:06 +00:00
Kostya Serebryany
bd0052a0f2 [asan] make sure asan erases old unused allocas after it created a new one. This became important after the recent move from ModulePass to FunctionPass because no cleanup is happening after asan pass any more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166267 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 06:20:53 +00:00
Nadav Rotem
89e7b356f2 vectorizer: Add support for reading and writing from the same memory location.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166255 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 01:24:18 +00:00
Bob Wilson
82c5180ed1 Mark bugpoint tests with XFAIL when building with LTO. <rdar://problem/12473675>
The LTO Internalize pass is hiding symbols needed by the bugpoint-passes
plug-in.  We need to add a flag to control whether Internalize should be run.
This is a temporary workaround to make these tests pass in the meantime.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166239 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 22:03:31 +00:00
Daniel Dunbar
b288ad8e36 test: Add a lit config variable to check if LTO is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166225 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 20:43:11 +00:00
Sebastian Pop
92c39cadbc Use pre-python 2.5 syntax in lit.cfg.
Author:    Quentin Neill <qneill@codeaurora.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166217 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 19:58:28 +00:00
Sebastian Pop
83ba06afa8 Clear unknown mem ops when merging stack slots (pr14090)
When merging stack slots, if StackColoring::remapInstructions gets a
value back from GetUnderlyingObject that it does not know about or is
not itself a stack slot, clear the memory operand in case it aliases
the merged slot. This prevents the introduction of incorrect aliasing
information.

Author:    Matthew Curtis <mcurtis@codeaurora.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166216 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 19:53:48 +00:00
Meador Inge
0c41d57b09 instcombine: Migrate strcpy optimizations
This patch migrates the strcpy optimizations from the simplify-libcalls pass
into the instcombine library call simplifier.  Note also that StrCpyChkOpt
has been updated with a few simplifications that were being done in the
simplify-libcalls version of StrCpyOpt, but not in the migrated implementation
of StrCpyOpt.  There is no reason to overload StrCpyOpt with fortified and
regular simplifications in the new model since there is already a dedicated
simplifier for __strcpy_chk.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166198 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 18:12:40 +00:00
Nadav Rotem
1c5bf3f429 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 18:06:48 +00:00
Ulrich Weigand
6c28a7eec8 This patch fixes failures in the SingleSource/Regression/C/uint64_to_float
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result.  The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 13:16:11 +00:00
Nadav Rotem
1953ace81d Vectorizer: Add support for loops with an unknown count. For example:
for (i=0; i<n; i++){
        a[i] = b[i+1] + c[i+3];
     }



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166165 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 05:29:12 +00:00
Michael Liao
07edaf3801 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:45:54 +00:00
Michael Liao
6e0c2b36d6 Disable extract-concat test case temporarily
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:08:19 +00:00
Michael Liao
4031e9018b Revert r166049
- In general, it's unsafe for this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166135 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:41:15 +00:00
Reed Kotler
95a2bb4cdf Add conditional branch instructions and their patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:29:54 +00:00
Michael Liao
13429e224c Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166125 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 20:48:33 +00:00
Nadav Rotem
d15c0c7ac1 Add a loop vectorizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166112 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 18:25:06 +00:00
Anton Korobeynikov
e4b33a115b Fix fallout from RegInfo => FrameLowering refactoring on MSP430.
Patch by Job Noorman!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166108 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 17:37:11 +00:00
Chandler Carruth
02bf98ab38 This just in, it is a *bad idea* to use 'udiv' on an offset of
a pointer. A very bad idea. Let's not do that. Fixes PR14105.

Note that this wasn't *that* glaring of an oversight. Originally, these
routines were only called on offsets within an alloca, which are
intrinsically positive. But over the evolution of the pass, they ended
up being called for arbitrary offsets, and things went downhill...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166095 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 09:23:48 +00:00
Michael Liao
281ae5abf5 Fix setjmp on models with non-Small code model nor non-Static relocation model
- MBB address is only valid as an immediate value in Small & Static
  code/relocation models. On other models, LEA is needed to load IP address of
  the restore MBB.
- A minor fix of MBB in MC lowering is added as well to enable target
  relocation flag being propagated into MC.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166084 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 02:22:27 +00:00
Jakob Stoklund Olesen
320db3f805 Avoid rematerializing a redef immediately after the old def.
PR14098 contains an example where we would rematerialize a MOV8ri
immediately after the original instruction:

  %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7
  %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7

Besides being pointless, it is also wrong since the original instruction
only redefines part of the register, and the value read by the new
instruction is wrong.

The problem was the LiveRangeEdit::allUsesAvailableAt() didn't
special-case OrigIdx == UseIdx and found the wrong SSA value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166068 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen
cdcdfd2cab Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"
A fix for PR14098, including the test case is in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:55 +00:00
Michael Gottesman
4932bbe20c [InstCombine] Teach InstCombine how to handle an obfuscated splat.
An obfuscated splat is where the frontend poorly generates code for a splat
using several different shuffles to create the splat, i.e.,

  %A = load <4 x float>* %in_ptr, align 16
  %B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef>
  %C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef>
  %D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166061 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 21:29:38 +00:00
Michael Liao
272ea03239 Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166049 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:38:35 +00:00
Rafael Espindola
6f7cccd2e2 Switch back to the old coalescer for now to fix the 32 bit bit
llvm+clang+compiler-rt bootstrap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:34:06 +00:00
Bill Schmidt
7a6cb15a92 This patch addresses PR13949.
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them.  A previous patch
addressed this for aggregates passed in registers.  However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side.  The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code.  There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed.  Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes.  Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area:  Previously they were stored left-justified, and now are
properly stored right-justified.  This requires changing the expected
output of existing test case structsinregs.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166022 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 13:30:53 +00:00
Stepan Dyatkovskiy
b52ba9f8a8 Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 07:16:47 +00:00
NAKAMURA Takumi
e26874556b Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".

Original message since r165661:

My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the *ORIGINAL* condition code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166017 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 06:28:34 +00:00
Rafael Espindola
0cead1cfef Fix the cpu name and add -verify-machineinstrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166003 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 01:13:06 +00:00
Andrew Trick
27c28cef11 misched: Added handleMove support for updating all kill flags, not just for allocatable regs.
This is a medium term workaround until we have a more robust solution
in the form of a register liveness utility for postRA passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 00:22:51 +00:00
Michael Liao
6c0e04c823 Add __builtin_setjmp/_longjmp supprt in X86 backend
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
  used as a light-weight replacement of setjmp/longjmp which are used to
  implementation continuation, user-level threading, and etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling as zero-cost DWARF exception handling is used by default
  in X86.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 22:39:43 +00:00
Jim Grosbach
64ba635209 ARM: v1i64 and v2i64 VBSL intrinsic support.
rdar://12502028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165981 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 21:23:40 +00:00
David Blaikie
66d4b0fbf4 Add dependency on llvm-bcanalyzer from tests to the CMake build.
This fixes a CMake build break introduced by r165739.

Thanks Jan Voung for the quick suggestion/fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165978 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 21:11:46 +00:00
Andrew Trick
874c0a6ec7 Check output of the misched unit tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 20:33:14 +00:00
Rafael Espindola
32522201a8 Add a cpu to try to fix the atom builder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:25:43 +00:00
Rafael Espindola
3da6f0392a Add testcase for pr14088.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:00:10 +00:00
Andrew Trick
bb20b24224 misched tests: add a triple to speculatively fix windows builders.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165952 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:21:08 +00:00
Andrew Trick
1e94e98b0e misched: ILP scheduler for experimental heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165950 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:02:27 +00:00
Kostya Serebryany
2611eeda98 [asan] fix a test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165938 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 14:30:30 +00:00
Chandler Carruth
d2cd73f6a5 Update the memcpy rewriting to fully support widened int rewriting. This
includes extracting ints for copying elsewhere and inserting ints when
copying into the alloca. This should fix the CanSROA assertion coming
out of Clang's regression test suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165931 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 10:24:43 +00:00
Chandler Carruth
94fc64c42f Follow-up fix to r165928: handle memset rewriting for widened integers,
and generally clean up the memset handling. It had rotted a bit as the
other rewriting logic got polished more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165930 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 10:24:40 +00:00
Silviu Baranga
bb1078ea13 Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 09:41:32 +00:00
Chandler Carruth
81ff90db44 First major step toward addressing PR14059. This teaches SROA to handle
cases where we have partial integer loads and stores to an otherwise
promotable alloca to widen[1] those loads and stores to cover the entire
alloca and bitcast them into the appropriate type such that promotion
can proceed.

These partial loads and stores stem from an annoying confluence of ARM's
calling convention and ABI lowering and the FCA pre-splitting which
takes place in SROA. Clang lowers a { double, double } in-register
function argument as a [4 x i32] function argument to ensure it is
placed into integer 32-bit registers (a really unnerving implicit
contract between Clang and the ARM backend I would add). This results in
a FCA load of [4 x i32]* from the { double, double } alloca, and SROA
decomposes this into a sequence of i32 loads and stores. Inlining
proceeds, code gets folded, but at the end of the day, we still have i32
stores to the low and high halves of a double alloca. Widening these to
be i64 operations, and bitcasting them to double prior to loading or
storing allows promotion to proceed for these allocas.

I looked quite a bit changing the IR which Clang produces for this case
to be more friendly, but small changes seem unlikely to help. I think
the best representation we could use currently would be to pass 4 i32
arguments thereby avoiding any FCAs, but that would still require this
fix. It seems like it might eventually be nice to somehow encode the ABI
register selection choices outside of the parameter type system so that
the parameter can be a { double, double }, but the CC register
annotations indicate that this should be passed via 4 integer registers.

This patch does not address the second problem in PR14059, which is the
reverse: when a struct alloca is loaded as a *larger* single integer.

This patch also does not address some of the code quality issues with
the FCA-splitting. Those don't actually impede any optimizations really,
but they're on my list to clean up.

[1]: Pedantic footnote: for those concerned about memory model issues
here, this is safe. For the alloca to be promotable, it cannot escape or
have any use of its address that could allow these loads or stores to be
racing. Thus, widening is always safe.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165928 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 08:40:30 +00:00
Meador Inge
a239c2e6a7 instcombine: Migrate strcmp and strncmp optimizations
This patch migrates the strcmp and strncmp optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165915 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 03:47:37 +00:00
Benjamin Kramer
08b6b81ec5 X86: Depending on the local semantics of .align this test can also emit a nopl instead of nopw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165880 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 17:38:00 +00:00
Benjamin Kramer
126afcbf65 X86: Disable long nops for all cpus prior to pentiumpro/i686.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165878 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 17:28:35 +00:00
Jakob Stoklund Olesen
d86296a4ae Drop <def,dead> flags when merging into an unused lane.
The new coalescer can merge a dead def into an unused lane of an
otherwise live vector register.

Clear the <dead> flag when that happens since the flag refers to the
full virtual register which is still live after the partial dead def.

This fixes PR14079.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165877 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 17:26:47 +00:00
Meador Inge
186f8d90df instcombine: Migrate strchr and strrchr optimizations
This patch migrates the strchr and strrchr optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165875 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 16:45:37 +00:00
Meador Inge
73d8a5864f instcombine: Migrate strcat and strncat optimizations
This patch migrates the strcat and strncat optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165874 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 16:45:32 +00:00
Jakob Stoklund Olesen
af89690760 Allow for loops in LiveIntervals::pruneValue().
It is possible that the live range of the value being pruned loops back
into the kill MBB where the search started. When that happens, make sure
that the beginning of KillMBB is also pruned.

Instead of starting a DFS at KillMBB and skipping the root of the
search, start a DFS at each KillMBB successor, and allow the search to
loop back to KillMBB.

This fixes PR14078.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165872 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 16:15:31 +00:00
Benjamin Kramer
f8b65aaf39 X86: Fix accidentally swapped operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165871 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 12:50:19 +00:00
Chandler Carruth
07525a6be6 Teach SROA to cope with wrapper aggregates. These show up a lot in ABI
type coercion code, especially when targetting ARM. Things like [1
x i32] instead of i32 are very common there.

The goal of this logic is to ensure that when we are picking an alloca
type, we look through such wrapper aggregates and across any zero-length
aggregate elements to find the simplest type possible to form a type
partition.

This logic should (generally speaking) rarely fire. It only ends up
kicking in when an alloca is accessed using two different types (for
instance, i32 and float), and the underlying alloca type has wrapper
aggregates around it. I noticed a significant amount of this occurring
looking at stepanov_abstraction generated code for arm, and suspect it
happens elsewhere as well.

Note that this doesn't yet address truly heinous IR productions such as
PR14059 is concerning. Those result in mismatched *sizes* of types in
addition to mismatched access and alloca types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165870 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 10:49:33 +00:00
Benjamin Kramer
444dccecfc X86: Promote i8 cmov when both operands are coming from truncates of the same width.
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.

I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165868 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 10:39:49 +00:00
Manman Ren
e6c3cc8dc5 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:39:43 +00:00
Jakob Stoklund Olesen
2bbb07d13c Fix buildbots: -misched=shuffle is only available in +Asserts builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165846 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:01:33 +00:00
Jim Grosbach
4346fa9437 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 22:59:21 +00:00
Jakob Stoklund Olesen
ad5e969ba9 Use a transposed algorithm for handleMove().
Completely update one interval at a time instead of collecting live
range fragments to be updated. This avoids building data structures,
except for a single SmallPtrSet of updated intervals.

Also share code between handleMove() and handleMoveIntoBundle().

Add support for moving dead defs across other live values in the
interval. The MI scheduler can do that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen
795f951c6d Fix coalescing with IMPLICIT_DEF values.
PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all
PHI predecessors have a live-out value. These IMPLICIT_DEF values are
not considered to be real interference when coalescing virtual
registers:

  %vreg1 = IMPLICIT_DEF
  %vreg2 = MOV32r0

When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its
value number should simply be erased since the %vreg2 value number now
provides a live-out value for the PHI predecesor block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165813 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 18:03:04 +00:00
NAKAMURA Takumi
20ce6e6562 llvm/test/CodeGen/PowerPC/2012-10-12-bitcast.ll: Try to fix failure on non-ppc hosts, to add -mattr=+altivec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165803 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 16:01:08 +00:00
Ulrich Weigand
7bbb9c7b4a Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each.  The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).

However, we also need to take endianness into account when forming each
of those separate pairs:  on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair.  The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.

This patch fixes this by swapping the vector elements as they are
paired up as appropriate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 15:42:58 +00:00
Reed Kotler
7d90d4d709 Div, Rem int/unsigned int
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165783 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 02:01:09 +00:00
Evan Cheng
d36696c4e0 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 01:15:47 +00:00
Jan Wen Voung
d9a3bad448 Change encoding of instruction operands in bitcode binaries to be relative
to the instruction position.  The old encoding would give an absolute
ID which counts up within a function, and only resets at the next function.

I.e., Instead of having:

... = icmp eq i32 n-1, n-2
br i1 ..., label %bb1, label %bb2

it will now be roughly:

... = icmp eq i32 1, 2
br i1 1, label %bb1, label %bb2

This makes it so that ids remain relatively small and can be encoded
in fewer bits.

With this encoding, forward reference operands will be given
negative-valued IDs.  Use signed VBRs for the most common case
of forward references, which is phi instructions.

To retain backward compatibility we bump the bitcode version
from 0 to 1 to distinguish between the different encodings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 20:20:40 +00:00
Jakob Stoklund Olesen
ebba49395c Pass an explicit operand number to addLiveIns.
Not all instructions define a virtual register in their first operand.
Specifically, INLINEASM has a different format.

<rdar://problem/12472811>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 16:46:07 +00:00