This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and
llvm.nearbyint to Altivec instruction when using 4 single-precision
float vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168086 91177308-0d34-0410-b5e6-96231b3b80d8
PPC64 target. The five tests modified herein test code generation that is
sensitive to the code model selected. So I've added -code-model=small to
the RUN commands for each.
Since small code model is the default, this has no effect for now; but this
prepares us for eventually changing the default to medium code model for PPC64.
Test changes verified with small and medium code model as default on
powerpc64-unknown-linux-gnu. All tests continue to pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167999 91177308-0d34-0410-b5e6-96231b3b80d8
The stack realignment code was fixed to work when there is stack realignment and
a dynamic alloca is present so this shouldn't cause correctness issues anymore.
Note that this also enables generation of AVX instructions for memset
under the assumptions:
- Unaligned loads/stores are always fast on CPUs supporting AVX
- AVX is not slower than SSE
We may need some tweaked heuristics if one of those assumptions turns out not to
be true.
Effectively reverts r58317. Part of PR2962.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of
a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag.
rdar://12028498
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167963 91177308-0d34-0410-b5e6-96231b3b80d8
Loads from i1 become loads from i8 followed by trunc
Stores to i1 become zext to i8 followed by store to i8
Fixes PR13291
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167948 91177308-0d34-0410-b5e6-96231b3b80d8
eh table and handler data if there are no landing pads in the function.
Patch by Logan Chien with some cleanups from me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167945 91177308-0d34-0410-b5e6-96231b3b80d8
temporarily as it is breaking the gdb bots.
This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167886 91177308-0d34-0410-b5e6-96231b3b80d8
chain is correctly setup.
As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.
rdar://12684358
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167859 91177308-0d34-0410-b5e6-96231b3b80d8
physical register as candidate for common subexpression elimination
in MachineCSE.
This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc
caused by MachineCSE invalidly merging two separate DYNALLOC insns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167855 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the
same for both of them because we use the 'upper_bound' attribute. Instead use
the 'count' attrbute, which gives the correct number of elements in the array.
<rdar://problem/12566646>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167806 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix operand order for atomic sub, where the minuend is the value
loaded from memory and the subtrahend is the parameter specified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167718 91177308-0d34-0410-b5e6-96231b3b80d8
Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally,
PTX 3.1 is added as the default PTX version to be out-of-the-box compatible
with CUDA 5.0.
Available CPUs for this target:
sm_10 - Select the sm_10 processor.
sm_11 - Select the sm_11 processor.
sm_12 - Select the sm_12 processor.
sm_13 - Select the sm_13 processor.
sm_20 - Select the sm_20 processor.
sm_21 - Select the sm_21 processor.
sm_30 - Select the sm_30 processor.
sm_35 - Select the sm_35 processor.
Available features for this target:
ptx30 - Use PTX version 3.0.
ptx31 - Use PTX version 3.1.
sm_10 - Target SM 1.0.
sm_11 - Target SM 1.1.
sm_12 - Target SM 1.2.
sm_13 - Target SM 1.3.
sm_20 - Target SM 2.0.
sm_21 - Target SM 2.1.
sm_30 - Target SM 3.0.
sm_35 - Target SM 3.5.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167699 91177308-0d34-0410-b5e6-96231b3b80d8
mov lr, pc
b.w _foo
The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.
rdar://12663632
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8
The RegMaskSlots contains 'r' slots while NewIdx and OldIdx are 'B'
slots. This broke the checks in the assertions.
This fixes PR14302.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167625 91177308-0d34-0410-b5e6-96231b3b80d8
Improve ARM build attribute emission for architectures types.
This also changes the default architecture emitted for a generic CPU to "v7".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167574 91177308-0d34-0410-b5e6-96231b3b80d8
- Add RTM code generation support throught 3 X86 intrinsics:
xbegin()/xend() to start/end a transaction region, and xabort() to abort a
tranaction region
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167573 91177308-0d34-0410-b5e6-96231b3b80d8
misched is disabled by default. With -enable-misched, these heuristics
balance the schedule to simultaneously avoid saturating processor
resources, expose ILP, and minimize register pressure. I've been
analyzing the performance of these heuristics on everything in the
llvm test suite in addition to a few other benchmarks. I would like
each heuristic check to be verified by a unit test, but I'm still
trying to figure out the best way to do that. The heuristics are still
in considerable flux, but as they are refined we should be rigorous
about unit testing the improvements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167527 91177308-0d34-0410-b5e6-96231b3b80d8
to be extended to a full register. This is modeled in the IR by marking
the return value (or argument) with a signext or zeroext attribute.
However, while these attributes are respected for function arguments,
they are currently ignored for function return values by the PowerPC
back-end. This patch updates PPCCallingConv.td to ask for the promotion
to i64, and fixes LowerReturn and LowerCallResult to implement it.
The new test case verifies that both arguments and return values are
properly extended when passing them; and also that the optimizers
understand incoming argument and return values are in fact guaranteed
by the ABI to be extended.
The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll,
since the test case used a "ret" instruction to create a use of an i32
value at the end of the function (to set up data flow as required for
what the test is intended to test). Since there's now an implicit
promotion to i64, that data flow no longer works as expected. To fix
this, this patch now adds an extra "add" to ensure we have an appropriate
use of the i32 value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167396 91177308-0d34-0410-b5e6-96231b3b80d8
The Z constraint specifies an r+r memory address, and the y modifier expands
to the "r, r" in the asm string. For this initial implementation, the base
register is forced to r0 (which has the special meaning of 0 for r+r addressing
on PowerPC) and the full address is taken in the second register. In the
future, this should be improved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167388 91177308-0d34-0410-b5e6-96231b3b80d8
This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for
vector types when altivec is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167386 91177308-0d34-0410-b5e6-96231b3b80d8
The adc/sbb optimization is to able to convert following expression
into a single adc/sbb instruction:
(ult) ... = x + 1 // where the ult is unsigned-less-than comparison
(ult) ... = x - 1
This change is to flip the "x >u y" (i.e. ugt comparison) in order
to expose the adc/sbb opportunity.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167180 91177308-0d34-0410-b5e6-96231b3b80d8