Commit Graph

4668 Commits

Author SHA1 Message Date
Evan Cheng
48575f6ea7 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 22:04:16 +00:00
Chris Lattner
9637d5b22e Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags
result.  This allows us to compile:

void *test12(long count) {
      return new int[count];
}

into:

test12:
	movl	$4, %ecx
	movq	%rdi, %rax
	mulq	%rcx
	movq	$-1, %rdi
	cmovnoq	%rax, %rdi
	jmp	__Znam                  ## TAILCALL

instead of:

test12:
	movl	$4, %ecx
	movq	%rdi, %rax
	mulq	%rcx
	seto	%cl
	testb	%cl, %cl
	movq	$-1, %rdi
	cmoveq	%rax, %rdi
	jmp	__Znam

Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
which would eliminate the need for the 'movq %rdi, %rax'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120936 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 07:49:54 +00:00
Chris Lattner
b20e0b1fdd it turns out that when ".with.overflow" intrinsics were added to the X86
backend that they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120935 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 07:30:36 +00:00
Chris Lattner
777dd07394 fix the rest of the linux miscompares :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120933 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 02:08:07 +00:00
Chris Lattner
96908b17ae generalize the previous check to handle -1 on either side of the
select, inserting a not to compensate.  Add a missing isZero check
that I lost somehow.

This improves codegen of:

void *func(long count) {
      return new int[count];
}

from:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]

to:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
	jmp	__Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120932 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 02:00:51 +00:00
Chris Lattner
c8c20d1486 relax this to handle linux defaulting to -static.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120930 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 01:31:13 +00:00
Chris Lattner
a2b5600e61 Improve an integer select optimization in two ways:
1. generalize 
    (select (x == 0), -1, 0) -> (sign_bit (x - 1))
to:
    (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y

2. Handle the identical pattern that happens with !=:
   (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y

cmov is often high latency and can't fold immediates or
memory operands.  For example for (x == 0) ? -1 : 1, before 
we got:

< 	testb	%sil, %sil
< 	movl	$-1, %ecx
< 	movl	$1, %eax
< 	cmovel	%ecx, %eax

now we get:

> 	cmpb	$1, %sil
> 	sbbl	%eax, %eax
> 	orl	$1, %eax




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120929 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 01:23:24 +00:00
Chris Lattner
bced6a1b8f merge some tests into select.ll and make them more specific.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120928 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 01:13:58 +00:00
Chris Lattner
bbdabf411b rename test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120927 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 01:02:23 +00:00
Chris Lattner
63d7c17ff1 remove two tests that aren't really testing anything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120926 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 01:02:13 +00:00
Benjamin Kramer
1292c22645 Add patterns for the x86 popcnt instruction.
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
  supports SSE4.2 (nehalem) or SSE4A (barcelona).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120917 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-04 20:32:23 +00:00
Bob Wilson
c24130bade The Thumb tADDrSPi instruction is not valid when the destination is SP.
Check for that and try narrowing it to tADDspi instead.  Radar 8724703.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120892 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-04 04:40:19 +00:00
Jim Grosbach
41ad0c4c73 When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the
32-bit wide version by adding the .w suffix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120838 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-03 20:33:01 +00:00
Devang Patel
3fda44f276 Hide tests, that check .loc, .file in output assembly, from darwin9 buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120750 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-02 23:29:58 +00:00
Devang Patel
ee4854faf3 Use set directive for StartMinusEndExpr.
This is a fix for llvm-gcc-i386-darwin9 buildbot failure.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120742 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-02 21:32:30 +00:00
Evan Cheng
fabdafbacb Fix test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120730 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-02 20:17:34 +00:00
Evan Cheng
1bf891ae6e Fix and re-enable tail call optimization of expanded libcalls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120622 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 22:59:46 +00:00
Owen Anderson
9d63d90de5 Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120589 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 19:18:46 +00:00
Evan Cheng
28cd48fffb Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120549 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 03:27:20 +00:00
Jason W Kim
85fed5e0c5 ARM/MC/ELF relocation "hello world" for movw/movt.
Lifted adjustFixupValue() from Darwin for sharing w ELF.
Test added
TODO:
  refactor ELFObjectWriter::RecordRelocation more.
  Possibly share more code with Darwin?
  Lots more relocations...



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120534 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 02:40:06 +00:00
Evan Cheng
3d2125c9db Enable sibling call optimization of libcalls which are expanded during
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120501 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 23:55:39 +00:00
Che-Liang Chiou
21d8b9bcad ptx: add command-line options for gpu target and ptx version
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120423 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 10:14:14 +00:00
Eric Christopher
c459d06ae6 Not all platforms use _<func>. Duh.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120418 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 09:23:54 +00:00
Eric Christopher
228232b282 Rewrite mwait and monitor support and custom lower arguments.
Fixes PR8573.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120404 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 07:20:12 +00:00
Bob Wilson
6c4c982f83 Add support for NEON VLD3-dup instructions.
The encoding for alignment in VLD4-dup instructions is still a work in progress.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120356 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 00:00:35 +00:00
Evan Cheng
1e0eab122b Mark Darwin call instructions as using "r7" to prevent the frame-register
assignment instructions from being moved below / above calls.
rdar://8690640


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120339 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 22:43:27 +00:00
Benjamin Kramer
59127b2a4e Add missing colon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120336 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 22:39:38 +00:00
Benjamin Kramer
8ad87ab166 Fix some broken CHECK lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120332 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 22:34:55 +00:00
Bob Wilson
86c6d80a7a Add support for NEON VLD3-dup instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120312 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 19:35:29 +00:00
Kalle Raiskila
9363f739cd Handle lshr for i128 correctly on SPU also when
shiftamount > 7.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120288 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 14:44:28 +00:00
Kalle Raiskila
c2ebfd454c Enable PostRA scheduling for SPU.
This speeds up selected test cases with up to
5% - no slowdowns observed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120286 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 10:30:25 +00:00
Bob Wilson
b1dfa7a8e0 Add support for NEON VLD2-dup instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120236 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-28 06:51:26 +00:00
Rafael Espindola
5bf7c534cf Lower TLS_addr32 and TLS_addr64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120225 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 20:43:02 +00:00
Bob Wilson
2a0e97431e Add NEON VLD1-dup instructions (load 1 element to all lanes).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120194 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 06:35:16 +00:00
Kalle Raiskila
702a4046a9 Allow for 'fcmp ogt' in SPU.
Fix by Visa Putkinen!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120090 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-24 11:42:17 +00:00
Bob Wilson
626613d5e8 Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations.
We need to check if the individual vector elements are sign/zero-extended
values.  For now this only handles constants values.  Radar 8687140.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120034 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-23 19:38:38 +00:00
Kalle Raiskila
0cc5b1f60e Division by pow-of-2 is not cheap on SPU, do it with
shifts.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120022 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-23 13:27:59 +00:00
Chris Lattner
2e1a75d6f4 filecheckize
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119987 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-23 02:26:52 +00:00
Evan Cheng
ab5c703fdb Fix epilogue codegen to avoid leaving the stack pointer in an invalid
state. Previously Thumb2 would restore sp from fp like this:
mov sp, r7
sub, sp, #4
If an interrupt is taken after the 'mov' but before the 'sub', callee-saved
registers might be clobbered by the interrupt handler. Instead, try
restoring directly from sp:
add sp, #4
Or, if necessary (with VLA, etc.) use a scratch register to compute sp and
then restore it:
sub.w r4, r7, #8
mov sp, r7
rdar://8465407


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119977 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-22 18:12:04 +00:00
Kalle Raiskila
d87e571e62 Fix a bug with extractelement on SPU.
In the attached testcase, the element was
never extracted (missing rotate).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119973 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-22 16:28:26 +00:00
Benjamin Kramer
ce750f0332 Implement the "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" optimization.
This currently only catches the most basic case, a two-case switch, but can be
extended later.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119964 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-22 09:45:38 +00:00
Wesley Peck
46a928b864 Implement branch analysis in the MBlaze backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119951 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-21 21:53:36 +00:00
Andrew Trick
b9e6fe1e3a Removing the useless test that I added recently. It was meant as an example, but not complicated enough to merit another test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119898 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-20 07:26:51 +00:00
Dale Johannesen
76eb5f2401 Prefetch has a MemOperand now. FileCheckize a test.
This finishes up 8460971.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119848 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-19 21:49:38 +00:00
Mon P Wang
cab98e3168 Make isScalarToVector to return false if the node is a scalar. This will prevent
DAGCombine from making an illegal transformation of bitcast of a scalar to a
vector into a scalar_to_vector.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119819 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-19 19:08:12 +00:00
Tanya Lattner
9684a7c128 Fix bug in DAGCombiner for ARM that was trying to do a ShiftCombine on illegal types (vector should be split first).
Added test case.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119749 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 22:06:46 +00:00
Duncan Sands
dcfd3a798f The DAGCombiner was threading select over pairs of extending loads even
if the extension types were not the same.  The result was that if you
fed a select with sext and zext loads, as in the testcase, then it
would get turned into a zext (or sext) of the select, which is wrong
in the cases when it should have been an sext (resp. zext).  Reported
and diagnosed by Sebastien Deldon.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119728 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 20:05:18 +00:00
Eric Christopher
8b3ca6216d Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 19:40:05 +00:00
Rafael Espindola
5c0556341e Change CodeGen to use .loc directives. This produces a lot more readable output
and testing is easier.  A good example is the unknown-location.ll test that
now can just look for ".loc 1 0 0".  We also don't use a DW_LNE_set_address for
every address change anymore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119613 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 02:04:25 +00:00
Dale Johannesen
b4ac2858da Do not throw away alignment when generating the DAG for
memset; we may need it to decide between MOVAPS and MOVUPS
later.  Adjust a test that was looking for wrong code.
PR 3866 / 8675131.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119605 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 01:35:23 +00:00