Commit Graph

7693 Commits

Author SHA1 Message Date
Evan Cheng
8a7186dbc2 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 01:28:01 +00:00
Andrew Trick
f3329c419b RegisterPressureTracker: fix findUseBetween to handle DebugValue
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169427 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 21:37:50 +00:00
Andrew Trick
553c42cefc RegisterPresssureTracker: Track live physical register by unit.
This is much simpler to reason about, more efficient, and
fixes some corner cases involving implicit super-register defs.
Fixed rdar://12797931.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 21:37:42 +00:00
Justin Holewinski
2f1086137d [NVPTX] Fix crash with unnamed struct arguments
Patch by Eric Holk

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169418 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 20:50:28 +00:00
Jyotsna Verma
61b632d9f7 Use multiclass to define store instructions with base+immediate offset
addressing mode and immediate stored value.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169408 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 19:32:03 +00:00
Elena Demikhovsky
226e0e6264 Simplified BLEND pattern matching for shuffles.
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 09:24:57 +00:00
Evan Cheng
4e54480531 Add x86 isel lowering logic to form bit test with inverted condition. e.g.
x ^ -1.

Patch by David Majnemer.
rdar://12755626


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169339 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 00:10:38 +00:00
Evan Cheng
c8e7045c8a ARM custom lower ctpop for vector types. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 22:41:50 +00:00
Bill Wendling
9493dae613 Use the 'count' attribute to calculate the upper bound of an array.
The count attribute is more accurate with regards to the size of an array. It
also obviates the upper bound attribute in the subrange. We can also better
handle an unbound array by setting the count to -1 instead of the lower bound to
1 and upper bound to 0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 21:34:03 +00:00
Bill Schmidt
d7802bf0dd This patch introduces initial-exec model support for thread-local storage
on 64-bit PowerPC ELF.

The patch includes code to handle external assembly and MC output with the
integrated assembler.  It intentionally does not support the "old" JIT.

For the initial-exec TLS model, the ABI requires the following to calculate
the address of external thread-local variable x:

 Code sequence            Relocation                  Symbol
  ld 9,x@got@tprel(2)      R_PPC64_GOT_TPREL16_DS      x
  add 9,9,x@tls            R_PPC64_TLS                 x

The register 9 is arbitrary here.  The linker will replace x@got@tprel
with the offset relative to the thread pointer to the generated GOT
entry for symbol x.  It will replace x@tls with the thread-pointer
register (13).

The two test cases verify correct assembly output and relocation output
as just described.

PowerPC-specific selection node variants are added for the two
instructions above:  LD_GOT_TPREL and ADD_TLS.  These are inserted
when an initial-exec global variable is encountered by
PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to
machine instructions LDgotTPREL and ADD8TLS.  LDgotTPREL is a pseudo
that uses the same LDrs support added for medium code model's LDtocL,
with a different relocation type.

The rest of the processing is straightforward.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 16:18:08 +00:00
Bill Wendling
a7645a3c66 Add a 'count' field to the DWARF subrange.
The count field is necessary because there isn't a difference between the 'lo'
and 'hi' attributes for a one-element array and a zero-element array. When the
count is '0', we know that this is a zero-element array. When it's >=1, then
it's a normal constant sized array. When it's -1, then the array is unbounded.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 06:20:49 +00:00
Manman Ren
69261a6442 Stack Alignment: when creating stack objects in MachineFrameInfo, make sure
the alignment is clamped to TargetFrameLowering.getStackAlignment if the target
does not support stack realignment or the option "realign-stack" is off.

This will cause miscompile if the address is treated as aligned and add is
replaced with or in DAGCombine.

Added a bool StackRealignable to TargetFrameLowering to check whether stack
realignment is implemented for the target. Also added a bool RealignOption
to MachineFrameInfo to check whether the option "realign-stack" is on.

rdar://12713765


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169197 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 00:52:33 +00:00
Nadav Rotem
a569a80e58 Allow merging multiple store sequences on the same chain.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169111 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-02 17:14:09 +00:00
Eli Bendersky
e469364244 Fix an invalid regex in the test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169108 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-02 15:46:02 +00:00
Andrew Trick
657b75b994 misched: Fix RegisterPressureTracker handling of DebugVals.
Assertion failed: (TopRPTracker.getPos() == RegionBegin && "bad initial Top tracker").
rdar://12790302.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169072 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:49 +00:00
Andrew Trick
177d87ac8d misched: Fix the DAG builder to handle an undef operand at ExitSU.
Assertion failed: (VNI && "No value to read by operand")
rdar://12790267.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169071 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:44 +00:00
Andrew Trick
30fe61aa35 misched: Fix LiveInterval update to better handle DebugVal.
Assertion failed: (itr != mi2iMap.end() && "Instruction not found in maps.")
rdar://12777252.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169070 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:41 +00:00
Andrew Trick
67bdd42d1e misched: fix RegionBegin when DebugValues get shuffled to the top.
assert (RemainingInstrs == 0 && "Instruction count mismatch!")

rdar://12776937.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169069 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:38 +00:00
Jakob Stoklund Olesen
8c3dccde92 Simplify REG_SEQUENCE lowering.
The TwoAddressInstructionPass takes the machine code out of SSA form by
expanding REG_SEQUENCE instructions into copies. It is no longer
necessary to rewrite the registers used by a REG_SEQUENCE instruction
because the new coalescer algorithm can do it now.

REG_SEQUENCE is just converted to a sequence of sub-register copies now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:06:44 +00:00
Chad Rosier
89d86118c5 test/CodeGen/PowerPC/vec_mul.ll: Add a triple. Thanks, Hal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169026 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 19:15:10 +00:00
Sebastian Pop
cb4953089b Codegen failure for vmull with small vectors
Codegen was failing with an assertion because of unexpected vector
operands when legalizing the selection DAG for a MUL instruction.

The asserting code was legalizing multiplies for vectors of size 128
bits. It uses a custom lowering to try and detect cases where it can
use a VMULL instruction instead of a VMOVL + VMUL.  The code was
looking for input operands to the MUL that had been sign or zero
extended. If it found the extended operands it would drop the
sign/zero extension and use the original vector size as input to a
VMULL instruction.

The code assumed that the original input vector was 64 bits so that
after dropping the extension it would fit directly into a D register
and could be used as an operand of a VMULL instruction. The input
code that trigger the failure used a vector of <4 x i8> that was
sign extended to <4 x i32>. It was not safe to drop the sign
extension in this case because the original vector is only 32 bits
wide. The fix is to insert a sign extension for the vector to reach
the required 64 bit size. In this particular example, the vector would
need to be sign extented to a <4 x i16>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169024 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 19:08:04 +00:00
Chad Rosier
75cbb00727 test/CodeGen/PowerPC/vec_mul.ll: Fix register operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169020 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 18:29:01 +00:00
NAKAMURA Takumi
09af5b8c3e test/CodeGen/PowerPC: Add explicit -march=ppc32.
FIXME: Please add another RUN line if you would like to check also on ppc64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168999 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 13:28:31 +00:00
Adhemerval Zanella
375cbe4143 This patch fixes the Altivec addend construction for the fused multiply-add
instruction (vmaddfp) to conform with IEEE to ensure the sign of a zero
result when resulting product is -0.0.

The -0.0 vector addend to vmaddfp is generated by a creating a vector
with full bits sets and then shifting each elements by 31-bits to the
left, resulting in a vector of 0x80000000 (or -0.0 as float).

The 'buildvec_canonicalize.ll' was adjusted to reflect this change and
the 'vec_mul.ll' was complemented with the float vector multiplication
test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168998 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 13:05:44 +00:00
Bill Wendling
7360116048 Handle the situation where CodeGenPrepare removes a reference to a BB that has
the last invoke instruction in the function. This also removes the last landing
pad in an function. This is fine, but with SjLj EH code, we've already placed a
bunch of code in the 'entry' block, which expects the landing pad to stick
around.

When we get to the situation where CGP has removed the last landing pad, go
ahead and nuke the SjLj instructions from the 'entry' block.
<rdar://problem/12721258>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168930 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 19:38:06 +00:00
Silviu Baranga
35b3df6e31 Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168886 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 14:41:25 +00:00
Justin Holewinski
7f128ea00c Teach the legalizer how to handle operands for VSELECT nodes
If we need to split the operand of a VSELECT, it must be the mask operand. We
split the entire VSELECT operand with EXTRACT_SUBVECTOR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168883 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 14:26:28 +00:00
Justin Holewinski
3d200255d5 Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors
For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168882 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 14:26:24 +00:00
Jakob Stoklund Olesen
89bea17af2 Avoid rewriting instructions twice.
This could cause miscompilations in targets where sub-register
composition is not always idempotent (ARM).

<rdar://problem/12758887>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168837 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 00:26:11 +00:00
Nadav Rotem
90e11dc8ad When combining consecutive stores allow loads in between the stores, if the loads do not alias.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168832 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 00:00:08 +00:00
Benjamin Kramer
350c00843b ARM: Implement CanLowerReturn so large vectors get expanded into sret.
Fixes 14337.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168809 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 20:55:10 +00:00
Andrew Trick
8b1496c922 misched: Analysis that partitions the DAG into subtrees.
This is a simple, cheap infrastructure for analyzing the shape of a
DAG. It recognizes uniform DAGs that take the shape of bottom-up
subtrees, such as the included matrix multiplication example. This is
useful for heuristics that balance register pressure with ILP. Two
canonical expressions of the heuristic are implemented in scheduling
modes: -misched-ilpmin and -misched-ilpmax.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168773 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 05:13:28 +00:00
Andrew Trick
8f82a08673 misched: better alias analysis.
This fixes a hole in the "cheap" alias analysis logic implemented within
the DAG builder itself, regardless of whether proper alias analysis is
enabled. It now handles this pattern produced by LSR+CodeGenPrepare.

%sunkaddr1 = ptrtoint * %obj to i64
%sunkaddr2 = add i64 %sunkaddr1, %lsr.iv
%sunkaddr3 = inttoptr i64 %sunkaddr2 to i32*
store i32 %v, i32* %sunkaddr3

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168768 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 03:42:49 +00:00
Bill Schmidt
daa65f5e08 This patch makes medium code model the default for 64-bit PowerPC ELF.
When the CodeGenInfo is to be created for the PPC64 target machine,
a default code-model selection is converted to CodeModel::Medium
provided we are not targeting the Darwin OS.  Defaults for Darwin
are unaffected.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168747 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 23:36:26 +00:00
Chad Rosier
92a6e532b8 Add -verify-machineinstrs to these fast-isel test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168723 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 20:49:56 +00:00
Manman Ren
39834da697 CSE: allow PerformTrivialCoalescing to check copies across basic block
boundaries.

Given the following case:
BB0
  %vreg1<def> = SUBrr %vreg0, %vreg7
  %vreg2<def> = COPY %vreg7
BB1
  %vreg10<def> = SUBrr %vreg0, %vreg2
We should be able to CSE between SUBrr in BB0 and SUBrr in BB1.

rdar://12462006


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168717 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 18:58:41 +00:00
Manman Ren
f365d3984e X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions
when the destination register is wider than the memory load.

These load instructions load from m32 or m64 and set the upper bits to zero,
while the folded instructions may accept m128.

rdar://12721174


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168710 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 18:09:26 +00:00
Bill Schmidt
34a9d4b3b9 This patch implements medium code model support for 64-bit PowerPC.
The default for 64-bit PowerPC is small code model, in which TOC entries
must be addressable using a 16-bit offset from the TOC pointer.  Additionally,
only TOC entries are addressed via the TOC pointer.

With medium code model, TOC entries and data sections can all be addressed
via the TOC pointer using a 32-bit offset.  Cooperation with the linker
allows 16-bit offsets to be used when these are sufficient, reducing the
number of extra instructions that need to be executed.  Medium code model
also does not generate explicit TOC entries in ".section toc" for variables
that are wholly internal to the compilation unit.

Consider a load of an external 4-byte integer.  With small code model, the
compiler generates:

	ld 3, .LC1@toc(2)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc ei[TC],ei

With medium model, it instead generates:

	addis 3, 2, .LC1@toc@ha
	ld 3, .LC1@toc@l(3)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc ei[TC],ei

Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the
32-bit offset of ei's TOC entry from the TOC base pointer.  Similarly,
.LC1@toc@l is a relocation requesting the lower 16 bits.  Note that if
the linker determines that ei's TOC entry is within a 16-bit offset of
the TOC base pointer, it will replace the "addis" with a "nop", and
replace the "ld" with the identical "ld" instruction from the small
code model example.

Consider next a load of a function-scope static integer.  For small code
model, the compiler generates:

	ld 3, .LC1@toc(2)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc test_fn_static.si[TC],test_fn_static.si
	.type	test_fn_static.si,@object
	.local	test_fn_static.si
	.comm	test_fn_static.si,4,4

For medium code model, the compiler generates:

	addis 3, 2, test_fn_static.si@toc@ha
	addi 3, 3, test_fn_static.si@toc@l
	lwz 4, 0(3)

	.type	test_fn_static.si,@object
	.local	test_fn_static.si
	.comm	test_fn_static.si,4,4

Again, the linker may replace the "addis" with a "nop", calculating only
a 16-bit offset when this is sufficient.

Note that it would be more efficient for the compiler to generate:

	addis 3, 2, test_fn_static.si@toc@ha
        lwz 4, test_fn_static.si@toc@l(3)

The current patch does not perform this optimization yet.  This will be
addressed as a peephole optimization in a later patch.

For the moment, the default code model for 64-bit PowerPC will remain the
small code model.  We plan to eventually change the default to medium code
model, which matches current upstream GCC behavior.  Note that the different
code models are ABI-compatible, so code compiled with different models will
be linked and execute correctly.

I've tested the regression suite and the application/benchmark test suite in
two ways:  Once with the patch as submitted here, and once with additional
logic to force medium code model as the default.  The tests all compile
cleanly, with one exception.  The mandel-2 application test fails due to an
unrelated ABI compatibility with passing complex numbers.  It just so happens
that small code model was incredibly lucky, in that temporary values in 
floating-point registers held the expected values needed by the external
library routine that was called incorrectly.  My current thought is to correct
the ABI problems with _Complex before making medium code model the default,
to avoid introducing this "regression."

Here are a few comments on how the patch works, since the selection code
can be difficult to follow:

The existing logic for small code model defines three pseudo-instructions:
LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for
constant pool addresses.  These are expanded by SelectCodeCommon().  The
pseudo-instruction approach doesn't work for medium code model, because
we need to generate two instructions when we match the same pattern.
Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY
node for medium code model, and generates an ADDIStocHA followed by either
a LDtocL or an ADDItocL.  These new node types correspond naturally to
the sequences described above.

The addis/ld sequence is generated for the following cases:
 * Jump table addresses
 * Function addresses
 * External global variables
 * Tentative definitions of global variables (common linkage)

The addis/addi sequence is generated for the following cases:
 * Constant pool entries
 * File-scope static global variables
 * Function-scope static variables

Expanding to the two-instruction sequences at select time exposes the
instructions to subsequent optimization, particularly scheduling.

The rest of the processing occurs at assembly time, in
PPCAsmPrinter::EmitInstruction.  Each of the instructions is converted to
a "real" PowerPC instruction.  When a TOC entry needs to be created, this
is done here in the same manner as for the existing LDtoc, LDtocJTI, and
LDtocCPT pseudo-instructions (I factored out a new routine to handle this).

I had originally thought that if a TOC entry was needed for LDtocL or
ADDItocL, it would already have been generated for the previous ADDIStocHA.
However, at higher optimization levels, the ADDIStocHA may appear in a 
different block, which may be assembled textually following the block
containing the LDtocL or ADDItocL.  So it is necessary to include the
possibility of creating a new TOC entry for those two instructions.

Note that for LDtocL, we generate a new form of LD called LDrs.  This
allows specifying the @toc@l relocation for the offset field of the LD
instruction (i.e., the offset is replaced by a SymbolLo relocation).
When the peephole optimization described above is added, we will need
to do similar things for all immediate-form load and store operations.

The seven "mcm-n.ll" test cases are kept separate because otherwise the
intermingling of various TOC entries and so forth makes the tests fragile
and hard to understand.

The above assumes use of an external assembler.  For use of the
integrated assembler, new relocations are added and used by
PPCELFObjectWriter.  Testing is done with "mcm-obj.ll", which tests for
proper generation of the various relocations for the same sequences
tested with the external assembler.






git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168708 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 17:35:46 +00:00
Ulrich Weigand
dba37a3c43 Never use .lcomm on platforms where it does not accept an alignment
argument.  Instead, use a pair of .local and .comm directives.

This avoids spurious differences between binaries built by the
integrated assembler vs. those built by the external assembler,
since the external assembler may impose alignment requirements
on .lcomm symbols where the integrated assembler does not.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 16:11:16 +00:00
Craig Topper
020669d53f Revert accidental commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 08:17:04 +00:00
Craig Topper
af87dae12c Make PrintReg constructor explicit to prevent weird implicit conversions from accidentally being triggered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168686 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 08:14:24 +00:00
Craig Topper
2cf4fb4884 Add test cases for r168417.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168681 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 07:19:54 +00:00
Chad Rosier
277068fe40 Extend test case for r168657.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168658 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 01:10:48 +00:00
NAKAMURA Takumi
cb84142195 llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Loosen expression corresponding to r168627. Win32 and *bsd were affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168651 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 00:48:27 +00:00
Chad Rosier
1243922fc1 Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary.
This pass was conservative in that it always reserved the FP to enable dynamic
stack realignment, which allowed the RA to use aligned spills for vector
registers.  This happens even when spills were not necessary.  The RA has 
since been improved to use unaligned spills when necessary.

The new behavior is to realign the stack if the frame pointer was already
reserved for some other reason, but don't reserve the frame pointer just
because a function contains vector virtual registers.

Part of rdar://12719844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168627 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-26 22:55:05 +00:00
Jakub Staszak
d642baf4be Normalize splat 256bit vectors with 8 elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168600 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-26 19:24:31 +00:00
Eli Bendersky
a5cf16fc51 Rewrite test to not use a FileCheck variable and redefine it on the same line.
In preparation for the FileCheck functionality change which will allow using
a variable later on the same line.

No functionality change.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168588 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-26 14:09:46 +00:00
Benjamin Kramer
915558e775 PPC: MCize most of the darwin PIC emission.
The last remaining bit is "bcl 20, 31, AnonSymbol", which I couldn't find the
instruction definition for. Only whitespace changes in assembly output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168541 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-24 13:18:25 +00:00
Akira Hatanaka
f09a03776d [mips] Generate big GOT code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-21 20:40:38 +00:00
Anton Korobeynikov
0ae6124034 Add support for varargs functions for msp430.
Patch by Job Noorman!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168440 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-21 17:28:27 +00:00
Anton Korobeynikov
6cbeb4d839 Add support for byval args. Patch by Job Noorman!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168439 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-21 17:23:03 +00:00
Tim Northover
310f248c22 Fix physical register liveness calculations:
+ Take account of clobbers
+ Give outputs priority over inputs since they happen later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168360 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-20 09:56:11 +00:00
Elena Demikhovsky
4fe5405bdd Intel OCL built-ins calling conventions now support MacOS 32-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168359 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-20 09:37:57 +00:00
Anton Korobeynikov
2386fc8daa Factor out type info emission into separate routine.
It turned out that ARM wants different layout of type infos.
This is yet another patch in attempt to fix PR7187 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-19 21:06:26 +00:00
Jakob Stoklund Olesen
e42561ad0c Handle mixed normal and early-clobber defs on inline asm.
PR14376.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168320 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-19 19:31:10 +00:00
Andrew Trick
410fe6fe19 Use a full triple for a PPC test case for asm syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-18 06:21:03 +00:00
Andrew Trick
953663a9aa Silence the buildbots for this test while I figure out the triple
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168249 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 03:39:26 +00:00
Andrew Trick
e1f663933a Broaden isSchedulingBoundary to check aliases of SP.
On PPC the stack pointer is X1, but ADJCALLSTACK writes R1.

Fixes PR14315: Register regmask dependency problem with misched.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168248 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 03:35:11 +00:00
Eli Friedman
43147afd71 Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168240 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 01:52:46 +00:00
Chad Rosier
0a63b6ac79 [fast-isel] Add the -verify-machineinstrs to these test cases. The remaining
test cases require fixes to fast-isel before the verifier can be enabled.
Part of rdar://12594152

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168233 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 00:42:06 +00:00
Akira Hatanaka
94e472832f Initial implementation of MipsTargetLowering::isLegalAddressingMode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 00:25:41 +00:00
Weiming Zhao
e56764bad1 Remove hard coded registers in ARM ldrexd and strexd instructions
This patch replaces the hard coded GPR pair [R0, R1] of
Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with
even/odd GPRPair reg class.
Similar to the lowering of atomic_64 operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:55:34 +00:00
Anton Korobeynikov
b1a392e7c5 Make sure FABS on v2f32 and v4f32 is legal on ARM NEON
This fixes PR14359


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:15:20 +00:00
Richard Osborne
ccc015d431 Fix handling of aliases to functions.
An alias to a function should use pc relative addressing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168199 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:12:38 +00:00
Justin Holewinski
2085d00d09 [NVPTX] Order global variables in def-use order before emiting them in the final assembly
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168198 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:03:51 +00:00
NAKAMURA Takumi
e0827d8880 llvm/test/CodeGen/X86/hipe-cc*.ll: Add explicit -mcpu, or they don't expect to pass on Atom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168171 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 16:07:37 +00:00
Duncan Sands
dc7f174b5e Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 12:36:39 +00:00
Craig Topper
d577552c66 Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 06:37:56 +00:00
Akira Hatanaka
a032dbd62f [mips] Fix delay slot filler so that instructions with register operand $1 are
allowed in branch delay slot.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168131 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 02:39:34 +00:00
Eli Friedman
846ce8ea67 Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing
case to vector legalization so this actually works.

Patch by Pete Couperus.  Fixes PR12540.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168107 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 22:44:27 +00:00
Adhemerval Zanella
e95ed2b7af PowerPC: Lowering floor intrinsic for Altivec
This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and
llvm.nearbyint to Altivec instruction when using 4 single-precision
float vectors.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168086 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 20:56:03 +00:00
Bill Schmidt
527388dea5 This patch is in preparation for adding medium code model support to the
PPC64 target.  The five tests modified herein test code generation that is
sensitive to the code model selected.  So I've added -code-model=small to
the RUN commands for each.

Since small code model is the default, this has no effect for now; but this
prepares us for eventually changing the default to medium code model for PPC64.

Test changes verified with small and medium code model as default on
powerpc64-unknown-linux-gnu.  All tests continue to pass.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167999 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 23:23:27 +00:00
Jakub Staszak
0e52f46e48 Make sure to not get AVX code on an AVX-capable host. Revealed in r167967.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 22:24:01 +00:00
NAKAMURA Takumi
d0a4a4bf5c test/CodeGen/Hexagon/postinc-load.ll: Suppress it for now. It triggered the failure on i686 hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167988 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 22:22:37 +00:00
Eric Christopher
06b423452c Remove the CellSPU port.
Approved by Chris Lattner.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167984 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 22:09:20 +00:00
NAKAMURA Takumi
9292136787 llvm/test/CodeGen/X86/memset.ll: FileCheck-ize, and add another case on +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167975 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 21:01:40 +00:00
Jyotsna Verma
cb02fa9d7f Added multiclass for post-increment load instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167974 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 20:38:48 +00:00
Benjamin Kramer
7c6e8cd7cc Force CPU in test so we don't accidentally get AVX code on an AVX-capable host.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167973 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 20:31:42 +00:00
Benjamin Kramer
2dbe929685 X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes.
The stack realignment code was fixed to work when there is stack realignment and
a dynamic alloca is present so this shouldn't cause correctness issues anymore.

Note that this also enables generation of AVX instructions for memset
under the assumptions:
- Unaligned loads/stores are always fast on CPUs supporting AVX
- AVX is not slower than SSE
We may need some tweaked heuristics if one of those assumptions turns out not to
be true.

Effectively reverts r58317. Part of PR2962.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 20:08:40 +00:00
Nadav Rotem
50b66387e3 The code pattern "imm0_255_neg" is used for checking if an immediate value is a small negative number.
This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of
a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag.

rdar://12028498



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167963 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 19:39:15 +00:00
Justin Holewinski
a20067b5d4 [NVPTX] Implement custom lowering of loads/stores for i1
Loads from i1 become loads from i8 followed by trunc
Stores to i1 become zext to i8 followed by store to i8

Fixes PR13291

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167948 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 19:19:16 +00:00
Anton Korobeynikov
062a6c8380 Fix really stupid ARM EHABI info generation bug: we should not emit
eh table and handler data if there are no landing pads in the function.
Patch by Logan Chien with some cleanups from me.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167945 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 19:13:30 +00:00
Rafael Espindola
8e2b8ae3b1 Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 05:08:56 +00:00
Anton Korobeynikov
25efd6d556 Use TARGET2 relocation for TType references on ARM.
Do some cleanup of the code while here.

Inspired by patch by Logan Chien!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167904 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 01:47:00 +00:00
Eric Christopher
242343d1ab Revert "Use the 'count' attribute instead of the 'upper_bound' attribute."
temporarily as it is breaking the gdb bots.

This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167886 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 23:30:43 +00:00
Manman Ren
2adc503f29 X86: when constructing VZEXT_LOAD from other loads, makes sure its output
chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://12684358


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167859 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 19:13:05 +00:00
Ulrich Weigand
b64e2115de Do not consider a machine instruction that uses and defines the same
physical register as candidate for common subexpression elimination
in MachineCSE.

This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc
caused by MachineCSE invalidly merging two separate DYNALLOC insns.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167855 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 18:40:58 +00:00
Duncan Sands
b2df01ab2a Codegen support for arbitrary vector getelementptrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167830 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 13:01:58 +00:00
Bill Wendling
e7ff4c14b1 Use the 'count' attribute instead of the 'upper_bound' attribute.
If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the
same for both of them because we use the 'upper_bound' attribute. Instead use
the 'count' attrbute, which gives the correct number of elements in the array.
<rdar://problem/12566646>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167806 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 02:31:47 +00:00
Andrew Trick
f546ac5f9b Cleanup the main RegisterCoalescer loop.
Block priorities still apply outside loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167793 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 00:34:44 +00:00
Michael Liao
01c6de341c Fix test case added in patch fixing PR14314
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167769 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 22:33:18 +00:00
Andrew Trick
ae692f2bae misched: Infrastructure for weak DAG edges.
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 19:28:57 +00:00
Michael Liao
dd3383fd09 Fix PR14314
- Fix operand order for atomic sub, where the minuend is the value
  loaded from memory and the subtrahend is the parameter specified.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167718 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 06:49:17 +00:00
Justin Holewinski
08e9cb46fe [NVPTX] Add more precise PTX/SM target attributes
Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally,
PTX 3.1 is added as the default PTX version to be out-of-the-box compatible
with CUDA 5.0.

Available CPUs for this target:

  sm_10 - Select the sm_10 processor.
  sm_11 - Select the sm_11 processor.
  sm_12 - Select the sm_12 processor.
  sm_13 - Select the sm_13 processor.
  sm_20 - Select the sm_20 processor.
  sm_21 - Select the sm_21 processor.
  sm_30 - Select the sm_30 processor.
  sm_35 - Select the sm_35 processor.

Available features for this target:

  ptx30 - Use PTX version 3.0.
  ptx31 - Use PTX version 3.1.
  sm_10 - Target SM 1.0.
  sm_11 - Target SM 1.1.
  sm_12 - Target SM 1.2.
  sm_13 - Target SM 1.3.
  sm_20 - Target SM 2.0.
  sm_21 - Target SM 2.1.
  sm_30 - Target SM 3.0.
  sm_35 - Target SM 3.5.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 03:16:43 +00:00
Evan Cheng
785500618a Convert an improper CodeGen test to a MC test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167663 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 04:30:40 +00:00
Evan Cheng
2f69102b8d xfail a bad test. This is a MC test but it's dependent on a codegen optimization which is now disabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167658 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 02:34:36 +00:00
Evan Cheng
b341fac05a Disable the Thumb no-return call optimization:
mov lr, pc
b.w _foo

The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.

rdar://12663632


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 02:09:05 +00:00
Craig Topper
9c7ae01f39 Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167652 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 01:23:36 +00:00
Justin Holewinski
89443ff7ae [NVPTX] Use ABI alignment for parameters when alignment is not specified.
Affects SM 2.0+.  Fixes bug 13324.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167646 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-09 23:50:24 +00:00
Jakob Stoklund Olesen
722c9a7925 Fix assertions in updateRegMaskSlots().
The RegMaskSlots contains 'r' slots while NewIdx and OldIdx are 'B'
slots. This broke the checks in the assertions.

This fixes PR14302.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167625 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-09 19:18:49 +00:00
Amara Emerson
214fd3d244 Recommit modified r167540.
Improve ARM build attribute emission for architectures types.
This also changes the default architecture emitted for a generic CPU to "v7".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167574 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-08 09:51:45 +00:00
Michael Liao
be02a90de1 Add support of RTM from TSX extension
- Add RTM code generation support throught 3 X86 intrinsics:
  xbegin()/xend() to start/end a transaction region, and xabort() to abort a
  tranaction region



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167573 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-08 07:28:54 +00:00
Akira Hatanaka
e90a3bcae1 [mips] Custom-lower ISD::FRAME_TO_ARGS_OFFSET node.
Patch by Sasa Stankovic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167548 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-07 19:10:58 +00:00
Andrew Trick
3b87f6204f misched: Heuristics based on the machine model.
misched is disabled by default. With -enable-misched, these heuristics
balance the schedule to simultaneously avoid saturating processor
resources, expose ILP, and minimize register pressure. I've been
analyzing the performance of these heuristics on everything in the
llvm test suite in addition to a few other benchmarks. I would like
each heuristic check to be verified by a unit test, but I'm still
trying to figure out the best way to do that. The heuristics are still
in considerable flux, but as they are refined we should be rigorous
about unit testing the improvements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167527 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-07 07:05:09 +00:00
Ulrich Weigand
86aef0a4f0 On PowerPC64, integer return values (as well as arguments) are supposed
to be extended to a full register.   This is modeled in the IR by marking
the return value (or argument) with a signext or zeroext attribute.

However, while these attributes are respected for function arguments,
they are currently ignored for function return values by the PowerPC
back-end.  This patch updates PPCCallingConv.td to ask for the promotion
to i64, and fixes LowerReturn and LowerCallResult to implement it.

The new test case verifies that both arguments and return values are
properly extended when passing them; and also that the optimizers
understand incoming argument and return values are in fact guaranteed
by the ABI to be extended.

The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll,
since the test case used a "ret" instruction to create a use of an i32
value at the end of the function (to set up data flow as required for
what the test is intended to test).  Since there's now an implicit
promotion to i64, that data flow no longer works as expected.  To fix
this, this patch now adds an extra "add" to ensure we have an appropriate
use of the i32 value.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-05 19:39:45 +00:00
Hal Finkel
827b7a070d Add support for the PowerPC-specific inline asm Z constraint and y modifier.
The Z constraint specifies an r+r memory address, and the y modifier expands
to the "r, r" in the asm string. For this initial implementation, the base
register is forced to r0 (which has the special meaning of 0 for r+r addressing
on PowerPC) and the full address is taken in the second register. In the
future, this should be improved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167388 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-05 18:18:42 +00:00
Adhemerval Zanella
cfe09ed28d [PATCH] PowerPC: Expand load extend vector operations
This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for
vector types when altivec is enabled.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167386 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-05 17:15:56 +00:00
Akira Hatanaka
3c77033a90 [mips] Set flag neverHasSideEffects flag on floating point conversion
instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167348 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-03 00:53:12 +00:00
Akira Hatanaka
3c9c1ab7b7 [mips] Set flag isAsCheapAsAMove flag on instruction LUi.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167345 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-03 00:26:02 +00:00
Akira Hatanaka
11a45c214c [mips] Stop reserving register AT and use register scavenger when a scratch
register is needed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167341 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-03 00:05:43 +00:00
Akira Hatanaka
fb60a4e823 [mips] Fix bug in test case. Disable machine LICM to prevent instruction from
being moved out of a basic block.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167322 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 21:46:42 +00:00
Quentin Colombet
43934aee71 Vext Lowering was missing opportunities
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 21:32:17 +00:00
Akira Hatanaka
5c87b732f2 [mips] Use register number instead of name to print register $AT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167315 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 21:26:03 +00:00
Akira Hatanaka
173192fa71 [mips] Delete MipsFunctionInfo::EmitNOAT. Unconditionally print directive
"set .noat" so that the assembler doesn't issue warnings when register $AT is
used.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167310 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 20:56:25 +00:00
NAKAMURA Takumi
b75111f1ef test/CodeGen/X86/fp-fast.ll: Add +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-01 02:13:45 +00:00
Owen Anderson
607ebde651 Add a few more simple fast-math constant propagations and cancellations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-01 02:00:53 +00:00
Shuxin Yang
a5526a9bff (For X86) Enhancement to add-carray/sub-borrow (adc/sbb) optimization.
The adc/sbb optimization is to able to convert following expression
into a single adc/sbb instruction:
  (ult) ... = x + 1 // where the ult is unsigned-less-than comparison
  (ult) ... = x - 1

  This change is to flip the "x >u y" (i.e. ugt comparison) in order 
to expose the adc/sbb opportunity.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167180 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 23:11:48 +00:00
Akira Hatanaka
497204a94b [mips] Set isAsCheapAsAMove flag on ADDiu and DADDiu, which enables
re-materialization of immediate loads.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167153 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 18:37:55 +00:00
Akira Hatanaka
39fa606992 Test case for r167039. Check that tail-call optimization is disabled for
mips16.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167139 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 17:25:23 +00:00
Reed Kotler
9441125d63 Implement ADJCALLSTACKUP and ADJCALLSTACKDOWN
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167107 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 05:21:10 +00:00
Bill Schmidt
42d43351b2 This patch addresses an ABI compatibility issue with empty aggregate
parameters.  Examples of these are:

  struct { } a;
  union { } b[256];
  int a[0];

An empty aggregate has an address, although dereferencing that address is
pointless.  When passed as a parameter, an empty aggregate does not consume
a protocol register, nor does it consume a doubleword in the parameter save
area.  Passing an empty aggregate by reference passes an address just as
for any other aggregate.  Returning an empty aggregate uses GPR3 as a hidden
address of the return value location, just as for any other aggregate.

The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and
PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate
parameters passed by value.  The handling of return values and by-reference
parameters was already correct.

Built on powerpc64-unknown-linux-gnu and tested with no new regressions.
A test case is included to test proper handling of empty aggregate
parameters on both sides of the function call protocol.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 01:15:05 +00:00
Manman Ren
dfd0b9b460 X86 SSE: update rsqrtss and rcpss to use two source operands and
the first source operand is tied to the destination operand.

This is to accurately model the corresponding instructions where the upper
bits are unmodified.

rdar://12558838
PR14221


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 23:53:59 +00:00
Manman Ren
4c74a956b2 X86 MMX: optimize transfer from mmx to i32
We used to generate a store (movq) + a load.
Now we use movd.

rdar://9946746


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 22:15:38 +00:00
Akira Hatanaka
2f34d754d0 [mips] Allow tail-call optimization for vararg functions and functions which
use the caller's stack.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167048 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 20:16:31 +00:00
Adhemerval Zanella
c83b5dc625 PowerPC: Expand FSRQT for vector types
This patch expands FSQRT for floating point vector types when altivec is
used.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167034 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 18:29:42 +00:00
Quentin Colombet
9a419f656e Change ForceSizeOpt attribute into MinSize attribute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 16:32:52 +00:00
Adhemerval Zanella
5f41fd685b PowerPC: More support for Altivec compare operations
This patch adds more support for vector type comparisons using altivec.
It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector
types for comparison operators ==, !=, >, >=, <, and <=.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167015 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 13:50:19 +00:00
Hans Wennborg
04d7d13d30 Use TargetTransformInfo to control switch-to-lookup table transformation
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.

This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167011 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 11:23:25 +00:00
Reed Kotler
c09856b535 Change mips16 delay slot jumps to non delay slot forms by default.
We will make them delay slot forms if there is something that can be
placed in the delay slot during a separate pass. Mips16 extended instructions
cannot be placed in delay slots.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166990 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 00:54:49 +00:00
Jakub Staszak
a24262a0f5 Re-commit r166971. I reverted it to quickly, when buildbots didn't have a chance
to test it with chapni's fix (-mattr=+avx).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166985 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 00:01:57 +00:00
Jakub Staszak
c1ed096b6b Revert r166971. It causes buildbot failure. To be investigated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166979 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 23:13:50 +00:00
NAKAMURA Takumi
926dd447f1 llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166974 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 22:45:18 +00:00
Jakub Staszak
6d317824a5 Allow to fold vector load if there is more than one bitcast, so in the case:
%0 = load <8 x i16>* %dest
%1 = shufflevector <8 x i16> %0, <8 x i16> %in,
      <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14>
store <8 x i16> %1, <8 x i16>* %dest

We get:
  vmovlpd (%eax), %xmm0, %xmm0

instead of:
  vmovaps (%eax), %xmm1
  vmovsd  %xmm1, %xmm0, %xmm0

No extra test-case is added. I just fixed the existing one
(also it uses FileCheck now).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166971 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 21:56:35 +00:00
Bill Schmidt
e6c56433de This patch solves a problem with passing varargs parameters under the PPC64
ELF ABI.

A varargs parameter consisting of a single-precision floating-point value,
or of a single-element aggregate containing a single-precision floating-point
value, must be passed in the low-order (rightmost) four bytes of the
doubleword stack slot reserved for that parameter.  If there are GPR protocol
registers remaining, the parameter must also be mirrored in the low-order
four bytes of the reserved GPR.

Prior to this patch, such parameters were being passed in the high-order
four bytes of the stack slot and the mirrored GPR.

The patch adds a new test case to verify the correct code generation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166968 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 21:18:16 +00:00
Reed Kotler
576b1dbbef Implement patterns for extloadi8 and extloadi16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 19:39:04 +00:00
Ulrich Weigand
e669c930a6 In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166958 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:35:49 +00:00
Chad Rosier
53e216b304 Remove redundant test case from r166949, per Eli's suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166953 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:18:26 +00:00
Chad Rosier
2fbc239e4f [ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is
equivalent to [expr1 + expr2].  See test cases for more examples.
rdar://12470392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166949 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:01:54 +00:00
Michael Liao
2a2263e744 Fix PR14204
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166947 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:57:12 +00:00
Jakob Stoklund Olesen
573303e62a Completely disallow partial copies in adjustCopiesBackFrom().
Partial copies can show up even when CoalescerPair.isPartial() returns
false. For example:

   %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31

Such a partial-partial copy is not good enough for the transformation
adjustCopiesBackFrom() needs to do.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:51:52 +00:00
Ulrich Weigand
78dab643e0 Allow i32/i64 for 'f' constraint on PowerPC.
This fixes PR12757.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166943 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:49:34 +00:00
Reed Kotler
8834a20d5d Expand all atomic ops for mips16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 16:16:54 +00:00
Preston Gurd
a836563e32 This patch addresses a problem with the Post RA scheduler generating an
incorrect instruction sequence due to it not being aware that an
inline assembly instruction may reference memory.

This patch fixes the problem by causing the scheduler to always assume that any
inline assembly code instruction could access memory. This is necessary because
the internal representation of the inline instruction does not include
any information about memory accesses.
 
This should fix PR13504.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 15:01:23 +00:00
Bill Schmidt
01d013ec04 This patch adds alignment information for long double to the 64-bit PowerPC
ELF subtarget.

The existing logic is used as a fallback to avoid any changes to the Darwin
ABI.  PPC64 ELF now has two possible data layout strings: one for FreeBSD,
which requires 8-byte alignment, and a default string that requires
16-byte alignment.

I've added a test for PPC64 Linux to verify the 16-byte alignment.  If
somebody wants to add a separate test for FreeBSD, that would be great.

Note that there is a companion patch to update the alignment information
in Clang, which I am committing now as well.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166928 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 14:59:36 +00:00
Reed Kotler
3a9f4568fb Implement brind operator for mips16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166903 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-28 23:08:07 +00:00
Reed Kotler
f99998a2b0 This patch is for the implementation of mips16 complex pattern addr16.
Previously mips16 was sharing the pattern addr which is used for mips32
and mips64. This had a number of problems:
1) Storing and loading byte and halfword quantities for mips16 has particular
problems due to the primarily non mips16 nature of SP. When we must
load/store byte/halfword stack objects in a function, we must create a mips16
alias register for SP. This functionality is tested in stchar.ll.
2) We need to have an FP register under certain conditions (such as 
dynamically sized alloca). We use mips16 register S0 for this purpose.
In this case, we also use this register when accessing frame objects so this
issue also affects the complex pattern addr16. This functionality is
tested in alloca16.ll.

The Mips16InstrInfo.td has been updated to use addr16 instead of addr.

The complex pattern C++ function for addr has been copied to addr16 and
updated to reflect the above issues.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166897 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-28 06:02:37 +00:00
Jakob Stoklund Olesen
163f67f4d9 Never attempt to join an early-clobber def with a regular kill.
This fixes PR14194.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166880 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 17:41:27 +00:00
Quentin Colombet
80acd97266 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 01:10:17 +00:00
Reed Kotler
7797e8f901 Implement MipsHi for mips16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166852 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:57:14 +00:00
Akira Hatanaka
21a9a98b77 [mips] Do not tail-call optimize vararg functions or functions with byval
arguments.

This is rather conservative and should be fixed later to be more aggressive.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166851 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:56:56 +00:00
Akira Hatanaka
4618e0b574 [mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the
previous iteration.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166850 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 00:44:39 +00:00
Jakob Stoklund Olesen
17f42e02a1 Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers."
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 23:39:46 +00:00
Reed Kotler
25424154f4 implement mips16 tls global addr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166827 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 22:57:32 +00:00
Jakob Stoklund Olesen
cd275f5687 Add GPRPair Register class to ARM.
Some instructions in ARM require 2 even-odd paired GPRs. This
patch adds support for such register class.

Patch by Weiming Zhao!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166816 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 21:29:15 +00:00
Reed Kotler
a81be80b0e Implement carry for subtract/add for mips16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 04:46:26 +00:00
Reed Kotler
28a6214b59 implement large (>16 bit) constant loading.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 03:09:34 +00:00
Reed Kotler
30675b12fd fix test setgek.ll so that it will not give false "make check"
failure in some cases


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166747 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 01:29:42 +00:00
Reed Kotler
4c5a6dab17 implement mips16 patterns for select nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 21:33:30 +00:00
Michael Liao
8d7cd1d8fc Add test for ATOM ISA SSSE3
- Remove SSE4.1 feature in other ATOM-based test cases



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 17:50:05 +00:00
Bill Schmidt
37900c5dcb This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166680 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 13:38:09 +00:00
Elena Demikhovsky
c0cd72204d The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, may be it will fix the problem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166669 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 08:38:42 +00:00
Chad Rosier
b3009eec47 [ms-inline asm] Add back-end test case for r166632. Make sure we emit the
correct .s output as well as get the correct encoding by the integrated
assembler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 23:10:28 +00:00
Evan Cheng
d258eb3ec5 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 19:53:01 +00:00
Elena Demikhovsky
3575222175 Special calling conventions for Intel OpenCL built-in library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166566 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 14:46:16 +00:00
Michael Liao
1a5cc710ee Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166546 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:14:18 +00:00
Michael Liao
991b6a22b6 Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166545 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:09:32 +00:00
Akira Hatanaka
2ef5bd3ba6 [mips] Make sure sret argument is returned in register V0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166539 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 02:10:54 +00:00
Rafael Espindola
847a9c6d77 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166536 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 01:58:48 +00:00
Michael Liao
0787274b70 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 21:40:15 +00:00
Michael Liao
d9d09600ee Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:34:00 +00:00
Reed Kotler
293e5d0e0a implement setXX patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 01:35:48 +00:00
Bill Wendling
8f47fc8f00 When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 23:30:04 +00:00
Akira Hatanaka
30580cea43 [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166344 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 22:11:40 +00:00
Akira Hatanaka
2b861be96e [mips] Add code to do tail call optimization.
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166342 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 21:47:33 +00:00
Shuxin Yang
970755e519 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 20:11:16 +00:00
Michael Liao
facace808c Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 17:15:18 +00:00
Stepan Dyatkovskiy
0d3c8d5d16 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 08:23:06 +00:00
Sebastian Pop
83ba06afa8 Clear unknown mem ops when merging stack slots (pr14090)
When merging stack slots, if StackColoring::remapInstructions gets a
value back from GetUnderlyingObject that it does not know about or is
not itself a stack slot, clear the memory operand in case it aliases
the merged slot. This prevents the introduction of incorrect aliasing
information.

Author:    Matthew Curtis <mcurtis@codeaurora.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166216 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 19:53:48 +00:00
Nadav Rotem
1c5bf3f429 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 18:06:48 +00:00
Ulrich Weigand
6c28a7eec8 This patch fixes failures in the SingleSource/Regression/C/uint64_to_float
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result.  The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 13:16:11 +00:00
Michael Liao
07edaf3801 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:45:54 +00:00
Michael Liao
6e0c2b36d6 Disable extract-concat test case temporarily
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:08:19 +00:00
Michael Liao
4031e9018b Revert r166049
- In general, it's unsafe for this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166135 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:41:15 +00:00
Reed Kotler
95a2bb4cdf Add conditional branch instructions and their patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:29:54 +00:00
Michael Liao
13429e224c Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166125 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 20:48:33 +00:00
Anton Korobeynikov
e4b33a115b Fix fallout from RegInfo => FrameLowering refactoring on MSP430.
Patch by Job Noorman!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166108 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 17:37:11 +00:00
Michael Liao
281ae5abf5 Fix setjmp on models with non-Small code model nor non-Static relocation model
- MBB address is only valid as an immediate value in Small & Static
  code/relocation models. On other models, LEA is needed to load IP address of
  the restore MBB.
- A minor fix of MBB in MC lowering is added as well to enable target
  relocation flag being propagated into MC.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166084 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 02:22:27 +00:00
Jakob Stoklund Olesen
320db3f805 Avoid rematerializing a redef immediately after the old def.
PR14098 contains an example where we would rematerialize a MOV8ri
immediately after the original instruction:

  %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7
  %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7

Besides being pointless, it is also wrong since the original instruction
only redefines part of the register, and the value read by the new
instruction is wrong.

The problem was the LiveRangeEdit::allUsesAvailableAt() didn't
special-case OrigIdx == UseIdx and found the wrong SSA value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166068 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen
cdcdfd2cab Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"
A fix for PR14098, including the test case is in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:55 +00:00
Michael Liao
272ea03239 Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166049 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:38:35 +00:00
Rafael Espindola
6f7cccd2e2 Switch back to the old coalescer for now to fix the 32 bit bit
llvm+clang+compiler-rt bootstrap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:34:06 +00:00
Bill Schmidt
7a6cb15a92 This patch addresses PR13949.
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them.  A previous patch
addressed this for aggregates passed in registers.  However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side.  The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code.  There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed.  Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes.  Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area:  Previously they were stored left-justified, and now are
properly stored right-justified.  This requires changing the expected
output of existing test case structsinregs.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166022 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 13:30:53 +00:00
Stepan Dyatkovskiy
b52ba9f8a8 Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 07:16:47 +00:00
NAKAMURA Takumi
e26874556b Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".

Original message since r165661:

My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the *ORIGINAL* condition code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166017 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 06:28:34 +00:00
Rafael Espindola
0cead1cfef Fix the cpu name and add -verify-machineinstrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166003 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 01:13:06 +00:00
Andrew Trick
27c28cef11 misched: Added handleMove support for updating all kill flags, not just for allocatable regs.
This is a medium term workaround until we have a more robust solution
in the form of a register liveness utility for postRA passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 00:22:51 +00:00
Michael Liao
6c0e04c823 Add __builtin_setjmp/_longjmp supprt in X86 backend
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
  used as a light-weight replacement of setjmp/longjmp which are used to
  implementation continuation, user-level threading, and etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling as zero-cost DWARF exception handling is used by default
  in X86.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 22:39:43 +00:00
Jim Grosbach
64ba635209 ARM: v1i64 and v2i64 VBSL intrinsic support.
rdar://12502028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165981 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 21:23:40 +00:00
Andrew Trick
874c0a6ec7 Check output of the misched unit tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 20:33:14 +00:00
Rafael Espindola
32522201a8 Add a cpu to try to fix the atom builder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:25:43 +00:00
Rafael Espindola
3da6f0392a Add testcase for pr14088.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:00:10 +00:00
Andrew Trick
bb20b24224 misched tests: add a triple to speculatively fix windows builders.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165952 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:21:08 +00:00
Andrew Trick
1e94e98b0e misched: ILP scheduler for experimental heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165950 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:02:27 +00:00
Silviu Baranga
bb1078ea13 Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 09:41:32 +00:00
Jakob Stoklund Olesen
d86296a4ae Drop <def,dead> flags when merging into an unused lane.
The new coalescer can merge a dead def into an unused lane of an
otherwise live vector register.

Clear the <dead> flag when that happens since the flag refers to the
full virtual register which is still live after the partial dead def.

This fixes PR14079.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165877 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 17:26:47 +00:00
Jakob Stoklund Olesen
af89690760 Allow for loops in LiveIntervals::pruneValue().
It is possible that the live range of the value being pruned loops back
into the kill MBB where the search started. When that happens, make sure
that the beginning of KillMBB is also pruned.

Instead of starting a DFS at KillMBB and skipping the root of the
search, start a DFS at each KillMBB successor, and allow the search to
loop back to KillMBB.

This fixes PR14078.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165872 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 16:15:31 +00:00
Benjamin Kramer
f8b65aaf39 X86: Fix accidentally swapped operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165871 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 12:50:19 +00:00
Benjamin Kramer
444dccecfc X86: Promote i8 cmov when both operands are coming from truncates of the same width.
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.

I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165868 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 10:39:49 +00:00
Manman Ren
e6c3cc8dc5 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:39:43 +00:00
Jakob Stoklund Olesen
2bbb07d13c Fix buildbots: -misched=shuffle is only available in +Asserts builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165846 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:01:33 +00:00
Jim Grosbach
4346fa9437 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 22:59:21 +00:00
Jakob Stoklund Olesen
ad5e969ba9 Use a transposed algorithm for handleMove().
Completely update one interval at a time instead of collecting live
range fragments to be updated. This avoids building data structures,
except for a single SmallPtrSet of updated intervals.

Also share code between handleMove() and handleMoveIntoBundle().

Add support for moving dead defs across other live values in the
interval. The MI scheduler can do that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen
795f951c6d Fix coalescing with IMPLICIT_DEF values.
PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all
PHI predecessors have a live-out value. These IMPLICIT_DEF values are
not considered to be real interference when coalescing virtual
registers:

  %vreg1 = IMPLICIT_DEF
  %vreg2 = MOV32r0

When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its
value number should simply be erased since the %vreg2 value number now
provides a live-out value for the PHI predecesor block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165813 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 18:03:04 +00:00
NAKAMURA Takumi
20ce6e6562 llvm/test/CodeGen/PowerPC/2012-10-12-bitcast.ll: Try to fix failure on non-ppc hosts, to add -mattr=+altivec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165803 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 16:01:08 +00:00
Ulrich Weigand
7bbb9c7b4a Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each.  The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).

However, we also need to take endianness into account when forming each
of those separate pairs:  on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair.  The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.

This patch fixes this by swapping the vector elements as they are
paired up as appropriate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 15:42:58 +00:00
Reed Kotler
7d90d4d709 Div, Rem int/unsigned int
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165783 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 02:01:09 +00:00
Evan Cheng
d36696c4e0 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 01:15:47 +00:00
Jakob Stoklund Olesen
ebba49395c Pass an explicit operand number to addLiveIns.
Not all instructions define a virtual register in their first operand.
Specifically, INLINEASM has a different format.

<rdar://problem/12472811>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 16:46:07 +00:00
Bill Schmidt
a867f37897 This patch addresses PR13947.
For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter.  The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area.  A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.

Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot.  This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.

This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165714 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 15:38:20 +00:00
NAKAMURA Takumi
e0297196ed Revert r165661, "Patch by Shuxin Yang <shuxin.llvm@gmail.com>."
It broke stage2 clang and test-suite/MultiSource/Benchmarks/mediabench/g721/g721encode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165692 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 02:02:05 +00:00
Evan Cheng
6b61491de3 Add isel patterns for v2f32 / v4f32 neon.vbsl intrinsics. rdar://12471808
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165673 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 23:06:34 +00:00
Bill Schmidt
e0bc37579b Add -mattr=+altivec and remove XFAIL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165666 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 22:25:11 +00:00
Bill Schmidt
3a55b64e5e XFAIL for all targets pending investigation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165664 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 21:52:10 +00:00
Nadav Rotem
87255a431b Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165661 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 21:31:55 +00:00
Bill Schmidt
26160f4e64 When generating spill and reload code for vector registers on PowerPC,
the compiler makes use of GPR0.  However, there are two flavors of
GPR0 defined by the target:  the 32-bit GPR0 (R0) and the 64-bit GPR0
(X0).  The spill/reload code makes use of R0 regardless of whether we
are generating 32- or 64-bit code.

This patch corrects the problem in the obvious manner, using X0 and
ADDI8 for 64-bit and R0 and ADDI for 32-bit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165658 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 21:25:01 +00:00
Bill Schmidt
a5d0ab5553 The PowerPC VRSAVE register has been somewhat of an odd beast since
the Altivec extensions were introduced.  Its use is optional, and
allows the compiler to communicate to the operating system which
vector registers should be saved and restored during a context switch.
In practice, this information is ignored by the various operating
systems using the SVR4 ABI; the kernel saves and restores the entire
register state.  Setting the VRSAVE register is no longer performed by
the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux
systems.  It seems best to avoid this logic within LLVM as well.

This patch avoids generating code to update and restore VRSAVE for the
PowerPC SVR4 ABIs (32- and 64-bit).  The code remains in place for the
Darwin ABI.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 20:54:15 +00:00
Michael Liao
4e2c56bdcb Specify CPU model to avoid breaking ATOM builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 18:04:52 +00:00
Michael Liao
44c2d61b67 Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165631 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 16:53:28 +00:00
Stepan Dyatkovskiy
2c2cb3c09f Fix for LDRB instruction:
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.

7 ops is needed, but SDNode with only 6 is created.

In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.

The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165617 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy
661afe75e8 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 11:37:36 +00:00
Akira Hatanaka
97d9f081a9 Implement MipsTargetLowering::CanLowerReturn.
Patch by Sasa Stankovic. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165585 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 01:27:09 +00:00
Evan Cheng
e61e516a51 When expanding atomic load arith instructions, do not lose target flags. rdar://12453106
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165568 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-09 23:48:33 +00:00
Jakob Stoklund Olesen
6be75ae196 Don't crash on extra evil irreducible control flow.
When the CFG contains a loop with multiple entry blocks, the traces
computed by MachineTraceMetrics don't always have the same nice
properties. Loop back-edges are normally excluded from traces, but
MachineLoopInfo doesn't recognize loops with multiple entry blocks, so
those back-edges may be included.

Avoid asserting when that happens by adding an isEarlierInSameTrace()
function that accurately determines if a dominating block is part of the
same trace AND is above the currrent block in the trace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165434 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 22:06:44 +00:00
Adhemerval Zanella
1c7d69bbe2 PR12716: PPC crashes on vector compare
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.

This patch changes the behavior by just returning a DAG with the 
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.

It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165419 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 18:59:53 +00:00
Adhemerval Zanella
51aaadb7bd Add floating-point to and from integer conversion
This patch add altivec support for v4i32 to v4f32 and for v4f32 to
v4i32 vector rounding conversion.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 17:27:24 +00:00
Benjamin Kramer
dcf2420b07 X86: fcmov doesn't handle all possible EFLAGS, fall back to a branch for the others.
Otherwise it will try to use SSE patterns and fail horribly if sse is disabled.
Fixes PR14035.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165377 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-07 15:34:27 +00:00
Reed Kotler
dfb8dbb4fd Patch for integer multiply, signed/unsigned, long/long long.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165322 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-05 18:27:54 +00:00
Rafael Espindola
7c0278e14e Convert to unix line endings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165308 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-05 13:32:38 +00:00
Evan Cheng
2a2947885a Follow up to r165072. Try a different approach: only move the load when it's going to be folded into the call. rdar://12437604
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165287 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-05 01:48:22 +00:00
Nadav Rotem
ea2c50c041 When merging connsecutive stores, use vectors to store the constant zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165267 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-04 22:35:15 +00:00
Jim Grosbach
837c28a840 ARM: locate user-defined text sections next to default text.
Make sure functions located in user specified text sections (via the
section attribute) are located together with the default text sections.
Otherwise, for large object files, the relocations for call instructions
are more likely to be out of range. This becomes even more likely in the
presence of LTO.

rdar://12402636

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165254 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-04 21:33:24 +00:00
Chad Rosier
34448ae393 [ms-inline asm] Add support in the X86AsmPrinter for printing memory references
in the Intel syntax.

The MC layer supports emitting in the Intel syntax, but this would require the
inline assembly MachineInstr to be lowered to an MCInst before emission.  This
is potential future work, but for now emitting directly from the MachineInstr
suffices.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165173 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 22:06:44 +00:00
Nadav Rotem
2e7d38192d Fix a cycle in the DAG. In this code we replace multiple loads with a single load and
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165148 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 19:30:31 +00:00
Nadav Rotem
c653de6c0f A DAGCombine optimization for mergeing consecutive stores to memory. The optimization
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:

1. Store of multiple consecutive constants:
  q->a = 3;
  q->4 = 5;
In this case we store a single legal wide integer.

2. Store of multiple consecutive loads:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165125 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 16:11:15 +00:00
Silviu Baranga
541a858f1a Fixed a bug in the ExecutionDependencyFix pass that caused dependencies to not propagate through implicit defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165102 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 08:29:36 +00:00
Jakob Stoklund Olesen
0d141f867d The early if conversion pass is ready to be used as an opt-in.
Enable the pass by default for targets that request it, and change the
-enable-early-ifcvt to the opposite -disable-early-ifcvt.

There are still some x86 regressions when enabling early if-conversion
because of the missing machine models. Disable the pass for x86 until
machine models are added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165075 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 00:51:32 +00:00
Evan Cheng
2b87e06d26 Fix a serious X86 instruction selection bug. In
X86DAGToDAGISel::PreprocessISelDAG(), isel is moving load inside
callseq_start / callseq_end so it can be folded into a call. This can
create a cycle in the DAG when the call is glued to a copytoreg. We
have been lucky this hasn't caused too many issues because the pre-ra
scheduler has special handling of call sequences. However, it has
caused a crash in a specific tailcall case.

rdar://12393897


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165072 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 23:49:13 +00:00
Nick Lewycky
b09649b335 Make sure to put our sret argument into %rax on x86-64. Fixes PR13563!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 22:45:06 +00:00
Jakob Stoklund Olesen
27cb347d0e Make sure the whole live range is covered when values are pruned twice.
JoinVals::pruneValues() calls LIS->pruneValue() to avoid conflicts when
overlapping two different values. This produces a set of live range end
points that are used to reconstruct the live range (with SSA update)
after joining the two registers.

When a value is pruned twice, the set of end points was insufficient:

  v1 = DEF
  v1 = REPLACE1
  v1 = REPLACE2
  KILL v1

The end point at KILL would only reconstruct the live range from
REPLACE2 to KILL, leaving the range REPLACE1-REPLACE2 dead.

Add REPLACE2 as an end point in this case so the full live range is
reconstructed.

This fixes PR13999.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 21:46:39 +00:00
Benjamin Kramer
fba80d9e97 Fix broken tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165019 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 15:49:34 +00:00