On PPC the stack pointer is X1, but ADJCALLSTACK writes R1.
Fixes PR14315: Register regmask dependency problem with misched.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168248 91177308-0d34-0410-b5e6-96231b3b80d8
This is a partial solution to PR14351. It removes some of the special
significance of the first incoming phi value in the phi aliasing checking logic
in BasicAA. In the context of a loop, the old logic assumes that the first
incoming value is the interesting one (meaning that it is the one that comes
from outside the loop), but this is often not the case. With this change, we
now test first the incoming value that comes from a block other than the parent
of the phi being tested.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168245 91177308-0d34-0410-b5e6-96231b3b80d8
This patch replaces the hard coded GPR pair [R0, R1] of
Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with
even/odd GPRPair reg class.
Similar to the lowering of atomic_64 operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8
Before, the parser would assert on the following code:
@a2 = global i8 addrspace(1)* @a
@a = addrspace(1) global i8 0
because the type of @a was "i8*" instead of "i8 addrspace(1)*" when parsing
the initializer for @a2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168197 91177308-0d34-0410-b5e6-96231b3b80d8
replaced by this patch is equivalent to the new logic, but you'd be wrong, and
that's exactly where the bug was. There's a similar bug in instsimplify which
manifests itself as instsimplify failing to simplify this, rather than doing it
wrong, see next commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168181 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out that the operands of a Constant are not always themselves
Constant. For example, one of the operands of BlockAddress is
BasicBlock, which is not a Constant.
This should fix the dragonegg-x86_64-linux-gcc-4.6-test build which
broke in r168037.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168147 91177308-0d34-0410-b5e6-96231b3b80d8
This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and
llvm.nearbyint to Altivec instruction when using 4 single-precision
float vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168086 91177308-0d34-0410-b5e6-96231b3b80d8
For global variables that get the same value stored into them
everywhere, GlobalOpt will replace them with a constant. The problem is
that a thread-local GlobalVariable looks like one value (the address of
the TLS var), but is different between threads.
This patch introduces Constant::isThreadDependent() which returns true
for thread-local variables and constants which depend on them (e.g. a GEP
into a thread-local array), and teaches GlobalOpt not to track such
values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168037 91177308-0d34-0410-b5e6-96231b3b80d8
the utility for extracting a chain of operations from the IR, thought that it
might as well combine any constants it came across (rather than just returning
them along with everything else). On the other hand, the factorization code
would like to see the individual constants (this is quite reasonable: it is
much easier to pull a factor of 3 out of 2*3 than it is to pull it out of 6;
you may think 6/3 isn't so hard, but due to overflow it's not as easy to undo
multiplications of constants as it may at first appear). This patch therefore
makes LinearizeExprTree stupider: it now leaves optimizing to the optimization
part of reassociate, and sticks to just analysing the IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168035 91177308-0d34-0410-b5e6-96231b3b80d8
PPC64 target. The five tests modified herein test code generation that is
sensitive to the code model selected. So I've added -code-model=small to
the RUN commands for each.
Since small code model is the default, this has no effect for now; but this
prepares us for eventually changing the default to medium code model for PPC64.
Test changes verified with small and medium code model as default on
powerpc64-unknown-linux-gnu. All tests continue to pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167999 91177308-0d34-0410-b5e6-96231b3b80d8
The stack realignment code was fixed to work when there is stack realignment and
a dynamic alloca is present so this shouldn't cause correctness issues anymore.
Note that this also enables generation of AVX instructions for memset
under the assumptions:
- Unaligned loads/stores are always fast on CPUs supporting AVX
- AVX is not slower than SSE
We may need some tweaked heuristics if one of those assumptions turns out not to
be true.
Effectively reverts r58317. Part of PR2962.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of
a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag.
rdar://12028498
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167963 91177308-0d34-0410-b5e6-96231b3b80d8
Loads from i1 become loads from i8 followed by trunc
Stores to i1 become zext to i8 followed by store to i8
Fixes PR13291
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167948 91177308-0d34-0410-b5e6-96231b3b80d8
eh table and handler data if there are no landing pads in the function.
Patch by Logan Chien with some cleanups from me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167945 91177308-0d34-0410-b5e6-96231b3b80d8
When an instruction as written requires 32-bit mode and we're assembling
in 64-bit mode, or vice-versa, issue a more specific diagnostic about
what's wrong.
rdar://12700702
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167937 91177308-0d34-0410-b5e6-96231b3b80d8
temporarily as it is breaking the gdb bots.
This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167886 91177308-0d34-0410-b5e6-96231b3b80d8
chain is correctly setup.
As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.
rdar://12684358
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167859 91177308-0d34-0410-b5e6-96231b3b80d8
physical register as candidate for common subexpression elimination
in MachineCSE.
This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc
caused by MachineCSE invalidly merging two separate DYNALLOC insns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167855 91177308-0d34-0410-b5e6-96231b3b80d8
On MSYS, 70 is not seen, but 1.
r127726 should be reworked. Candidate options are;
1) Use not exit(70), but _exit(70), in report_fatal_error().
2) Return with _exit(70) in ~raw_ostream().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167836 91177308-0d34-0410-b5e6-96231b3b80d8
Previously in a vector of pointers, the pointer couldn't be any pointer type,
it had to be a pointer to an integer or floating point type. This is a hassle
for dragonegg because the GCC vectorizer happily produces vectors of pointers
where the pointer is a pointer to a struct or whatever. Vector getelementptr
was restricted to just one index, but now that vectors of pointers can have
any pointer type it is more natural to allow arbitrary vector getelementptrs.
There is however the issue of struct GEPs, where if each lane chose different
struct fields then from that point on each lane will be working down into
unrelated types. This seems like too much pain for too little gain, so when
you have a vector struct index all the elements are required to be the same.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167828 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the math library call simplifications from the
simplify-libcalls pass into the instcombine library call simplifier.
I have typically migrated just one simplifier at a time, but the math
simplifiers are interdependent because:
1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt.
2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on
the option -enable-double-float-shrink.
These two factors made migrating each of these simplifiers individually
more of a pain than it would be worth. So, I migrated them all together.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167815 91177308-0d34-0410-b5e6-96231b3b80d8
Don't choose a vectorization plan containing only shuffles and
vector inserts/extracts. Due to inperfections in the cost model,
these can lead to infinite recusion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167811 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the
same for both of them because we use the 'upper_bound' attribute. Instead use
the 'count' attrbute, which gives the correct number of elements in the array.
<rdar://problem/12566646>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167806 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes another infinite recursion case when using target costs.
We can only replace insert element input chains that are pure (end
with inserting into an undef).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167784 91177308-0d34-0410-b5e6-96231b3b80d8
The old checking code, which assumed that input shuffles and insert-elements
could always be folded (and thus were free) is too simple.
This can only happen in special circumstances.
Using the simple check caused infinite recursion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167750 91177308-0d34-0410-b5e6-96231b3b80d8
The pass would previously assert when trying to compute the cost of
compare instructions with illegal vector types (like struct pointers).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167743 91177308-0d34-0410-b5e6-96231b3b80d8
The assertion is trigged when the Reassociater tries to transform expression
... + 2 * n * 3 + 2 * m + ...
into:
... + 2 * (n*3 + m).
In the process of the transformation, a helper routine folds the constant 2*3 into 6,
confusing optimizer which is trying the to eliminate the common factor 2, and cannot
find 2 any more.
Review is pending. But I'd like commit first in order to help those who are waiting
for this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167740 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a bug where shuffles were being fused such that the
resulting input types were not legal on the target. This would
occur only when both inputs and dependencies were also foldable
operations (such as other shuffles) and there were other connected
pairs in the same block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167731 91177308-0d34-0410-b5e6-96231b3b80d8
The library call simplifier folds memcmp calls with all constant arguments
to a constant. For example:
memcmp("foo", "foo", 3) -> 0
memcmp("hel", "foo", 3) -> 1
memcmp("foo", "hel", 3) -> -1
The folding is implemented in terms of the system memcmp that LLVM gets
linked with. It currently just blindly uses the value returned from
the system memcmp as the folded constant.
This patch normalizes the values returned from the system memcmp to
(-1, 0, 1) so that we get consistent results across multiple platforms.
The test cases were adjusted accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167726 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix operand order for atomic sub, where the minuend is the value
loaded from memory and the subtrahend is the parameter specified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167718 91177308-0d34-0410-b5e6-96231b3b80d8
Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally,
PTX 3.1 is added as the default PTX version to be out-of-the-box compatible
with CUDA 5.0.
Available CPUs for this target:
sm_10 - Select the sm_10 processor.
sm_11 - Select the sm_11 processor.
sm_12 - Select the sm_12 processor.
sm_13 - Select the sm_13 processor.
sm_20 - Select the sm_20 processor.
sm_21 - Select the sm_21 processor.
sm_30 - Select the sm_30 processor.
sm_35 - Select the sm_35 processor.
Available features for this target:
ptx30 - Use PTX version 3.0.
ptx31 - Use PTX version 3.1.
sm_10 - Target SM 1.0.
sm_11 - Target SM 1.1.
sm_12 - Target SM 1.2.
sm_13 - Target SM 1.3.
sm_20 - Target SM 2.0.
sm_21 - Target SM 2.1.
sm_30 - Target SM 3.0.
sm_35 - Target SM 3.5.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167699 91177308-0d34-0410-b5e6-96231b3b80d8
Transforms/InstCombine/memcmp-1.ll has a test case that looks like:
@foo = constant [4 x i8] c"foo\00"
@hel = constant [4 x i8] c"hel\00"
...
%mem1 = getelementptr [4 x i8]* @hel, i32 0, i32 0
%mem2 = getelementptr [4 x i8]* @foo, i32 0, i32 0
%ret = call i32 @memcmp(i8* %mem1, i8* %mem2, i32 3)
ret i32 %ret
; CHECK: ret i32 2
The folded return value (2 above) is computed using the system memcmp
that the compiler is linked with. This can return different values on
different systems. The test was originally written on an OS X 10.7.5
x86-64 box and passed. However, it failed on one of the x86-64 FreeBSD
buildbots because the system memcpy on that machine returned a different
value (1 instead of 2).
I fixed the test by checking the folding constants with regexes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167691 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the memset optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167689 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the memmove optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167687 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the memcpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167686 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the memcmp optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167683 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strstr optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167682 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strcspn optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167675 91177308-0d34-0410-b5e6-96231b3b80d8
Several of the simplifiers migrated from the simplify-libcalls pass to
the instcombine pass were not correctly checking the target library
information to gate the simplifications. This patch ensures that the
check is made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167660 91177308-0d34-0410-b5e6-96231b3b80d8
mov lr, pc
b.w _foo
The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.
rdar://12663632
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8
The RegMaskSlots contains 'r' slots while NewIdx and OldIdx are 'B'
slots. This broke the checks in the assertions.
This fixes PR14302.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167625 91177308-0d34-0410-b5e6-96231b3b80d8
If the arrays are found to be disjoint then we run the vectorized version of
the loop. If they are not, we run the scalar code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167608 91177308-0d34-0410-b5e6-96231b3b80d8
Improve ARM build attribute emission for architectures types.
This also changes the default architecture emitted for a generic CPU to "v7".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167574 91177308-0d34-0410-b5e6-96231b3b80d8
- Add RTM code generation support throught 3 X86 intrinsics:
xbegin()/xend() to start/end a transaction region, and xabort() to abort a
tranaction region
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167573 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strspn optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167568 91177308-0d34-0410-b5e6-96231b3b80d8
values in a map that can be passed to consumers. Add a testcase that
ensures this works for llvm-dwarfdump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167558 91177308-0d34-0410-b5e6-96231b3b80d8