Commit Graph

5257 Commits

Author SHA1 Message Date
Benjamin Kramer
b20a8fc8a6 X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039)
This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
uint64_t foo(uint64_t x) { return (x&1) << 42; }
which used to compile into bloated code:
	shlq	$42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
	movabsq	$4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
	andq	%rdi, %rax              ## encoding: [0x48,0x21,0xf8]
	ret                             ## encoding: [0xc3]

with this patch we can fold the immediate into the and:
	andq	$1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	shlq	$42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
	ret                             ## encoding: [0xc3]

It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing
that without making this code even more complicated. See the TODOs in the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129990 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-22 15:30:40 +00:00
Evan Cheng
db6cbe1ff1 In Thumb2 mode, lower frame indix references to:
add <rd>, sp, #<imm8>
ldr <rd>, [sp, #<imm8>]
When the offset from sp is multiple of 4 and in range of 0-1020.
This saves code size by utilizing 16-bit instructions.

rdar://9321541


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129971 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-22 01:42:52 +00:00
Devang Patel
71f3f1146f Fix DWARF description of Q registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129952 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-21 23:22:35 +00:00
Devang Patel
27f5acb7d4 Fix DWARF description of S registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129947 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-21 22:48:26 +00:00
Devang Patel
8859df5911 Test case for r129922
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129934 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-21 20:16:43 +00:00
Daniel Dunbar
63c21deee1 Revert r1296656, "Fix rdar://9289512 - not folding load into compare at -O0...",
which broke a couple GCC test suite tests at -O0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129914 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-21 16:14:46 +00:00
Che-Liang Chiou
5efde18dd2 ptx: fix parameter ordering
This patch depends on the prior fix r129908 that changes to use std::find,
rather than std::binary_search, on unordered array.

Patch by Dan Bailey


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129909 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-21 10:56:58 +00:00
Evan Cheng
c8578948c9 Remove -use-divmod-libcall. Let targets opt in when they are available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129884 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 22:20:12 +00:00
Stuart Hastings
2575a9c606 Un-XFAIL this test for ARM. <rdar://problem/7662569>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129875 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 21:47:45 +00:00
Justin Holewinski
e1fee48cd0 PTX: Add intrinsics to list of built-in intrinsics, which allows them to be
used by Clang.  To help Clang integration, the PTX target has been split
     into two targets: ptx32 and ptx64, depending on the desired pointer size.

- Add GCCBuiltin class to all intrinsics
- Split PTX target into ptx32 and ptx64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129851 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 15:37:17 +00:00
Eric Christopher
abbbfbd672 Rewrite the expander for umulo/smulo to remember to sign extend the input
manually and pass all (now) 4 arguments to the mul libcall. Add a new
ExpandLibCall for just this (copied gratuitously from type legalization).

Fixes rdar://9292577


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129842 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 01:19:45 +00:00
Daniel Dunbar
d285139e0e llc: Eliminate a use of getDarwinMajorNumber().
- As before, there is a minor semantic change here (evidenced by the test
   change) for Darwin triples that have no version component. I debated changing
   the default behavior of isOSVersionLT, but decided it made more sense for
   triples to be explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129805 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 20:46:13 +00:00
Daniel Dunbar
ebc5066b9b CodeGen: Eliminate a use of getDarwinMajorNumber().
- There is a minor semantic change here (evidenced by the test change) for
   Darwin triples that have no version component. I debated changing the default
   behavior of isOSVersionLT, but decided it made more sense for triples to be
   explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129802 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 20:32:39 +00:00
Bob Wilson
84c5eed15b This patch combines several changes from Evan Cheng for rdar://8659675.
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Enable these fp vmlx codegen changes for Cortex-A9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129775 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 18:11:57 +00:00
Bob Wilson
cd70496ad1 Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129774 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 18:11:52 +00:00
Bob Wilson
5dde893c2b Avoid some 's' 16-bit instruction which partially update CPSR
(and add false dependency) when it isn't dependent on last CPSR defining
instruction. rdar://8928208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129773 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 18:11:49 +00:00
Bob Wilson
f6a4d3c2f3 Avoid write-after-write issue hazards for Cortex-A9.
Add a avoidWriteAfterWrite() target hook to identify register classes that
suffer from write-after-write hazards. For those register classes, try to avoid
writing the same register in two consecutive instructions.

This is currently disabled by default.  We should not spill to avoid hazards!
The command line flag -avoid-waw-hazard can be used to enable waw avoidance.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129772 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 18:11:45 +00:00
Eli Friedman
3762046dbf Add support for FastISel'ing varargs calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129765 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 17:22:22 +00:00
Jakob Stoklund Olesen
ea70a06fa7 Tighten test case a bit.
Ideally, we would match an S-register to its containing D-register, but that
requires arithmetic (divide by 2).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129756 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 06:14:45 +00:00
Chris Lattner
832e494359 Implement support for x86 fastisel of small fixed-sized memcpys, which are generated
en-mass for C++ PODs.  On my c++ test file, this cuts the fast isel rejects by 10x 
and shrinks the generated .s file by 5%


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129755 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 05:52:03 +00:00
Chris Lattner
b44101c140 Implement support for fast isel of calls of i1 arguments, even though they are illegal,
when they are a truncate from something else.  This eliminates fully half of all the 
fastisel rejections on a test c++ file I'm working with, which should make a substantial
improvement for -O0 compile of c++ code.

This fixed rdar://9297003 - fast isel bails out on all functions taking bools


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129752 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 05:09:50 +00:00
Chris Lattner
e03b8d3162 Handle i1/i8/i16 constant integer arguments to calls by prepromoting them.
Before we would bail out on i1 arguments all together, now we just bail on
non-constant ones.  Also, we used to emit extraneous code.  e.g. test12 was:

	movb	$0, %al
	movzbl	%al, %edi
	callq	_test12

and test13 was:
	movb	$0, %al
	xorl	%edi, %edi
	movb	%al, 7(%rsp)
	callq	_test13f

Now we get:

	movl	$0, %edi
	callq	_test12
and:
	movl	$0, %edi
	callq	_test13f



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129751 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 04:42:38 +00:00
Chris Lattner
c76d121807 be layout aware, to produce:
testb	$1, %al
	je	LBB0_2
## BB#1:                                ## %if.then
	movb	$0, %al

instead of:

	testb	$1, %al
	jne	LBB0_1
	jmp	LBB0_2
LBB0_1:                                 ## %if.then
	movb	$0, %al

how 'bout that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129749 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 04:26:32 +00:00
Chris Lattner
90cb88a9b4 fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry,
a common cause of fast isel rejects on c++ code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129748 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 04:22:17 +00:00
Jakob Stoklund Olesen
775c3e5824 Make tests register allocation independent again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129739 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 00:14:43 +00:00
Evan Cheng
b58a340fa2 Do not lose mem_operands while lowering VLD / VST intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129738 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 00:04:03 +00:00
Eric Christopher
d574bb5a6e Fix a bug where we were counting the alias sets as completely used
registers for fast allocation a different way. This has us updating
used registers only when we're using that exact register.

Fixes rdar://9207598



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129711 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-18 19:26:25 +00:00
Chris Lattner
f051c1a29d while we're at it, handle 'sdiv exact' of a power of 2 also,
this fixes a few rejects on c++ iterator loops.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129694 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-18 07:00:40 +00:00
Chris Lattner
090ca9108b fix rdar://9297011 - udiv by power of two causing fast-isel rejects
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129693 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-18 06:55:51 +00:00
Chris Lattner
1518afddea Implement major new fastisel functionality: the matcher can now handle immediates with
value constraints on them (when defined as ImmLeaf's).  This is particularly important
for X86-64, where almost all reg/imm instructions take a i64immSExt32 immediate operand,
which has a value constraint.  Before this patch we ended up iseling the examples into
such amazing code as:

	movabsq	$7, %rax
	imulq	%rax, %rdi
	movq	%rdi, %rax
	ret

now we produce:

	imulq	$7, %rdi, %rax
	ret

This dramatically shrinks the generated code at -O0 on x86-64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129691 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-18 06:22:33 +00:00
Chris Lattner
1023643d50 relax this test to just check that the lock prefix is encoded properly,
and to not rely on the register allocator's arbitrary operand choices.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129690 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-18 06:15:35 +00:00
Chris Lattner
602fc06817 1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.ll
2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts
3. teach tblgen to handle shift immediates that are different sizes than the 
   shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function
   instead of FastEmit_ri to simplify code.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129666 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-17 20:23:29 +00:00
Chris Lattner
0a1c997c27 fix an x86 fast isel issue where we'd completely give up on folding an address
when we have a global variable base an an index.  Instead, just give up on
folding the global variable.

Before we'd geenrate:

_test:                                  ## @test
## BB#0:
	movq	_rtx_length@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	addq	%rdi, %rax
	movzbl	(%rax), %eax
	ret

now we generate:

_test:                                  ## @test
## BB#0:
	movq	_rtx_length@GOTPCREL(%rip), %rax
	movzbl	(%rax,%rdi), %eax
	ret

The difference is even more significant when there is a scale
involved.

This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129664 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-17 17:47:38 +00:00
Chris Lattner
685090f598 fix an oversight which caused us to compile the testcase (and other
less trivial things) into a dummy lea.  Before we generated:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	ret

now we produce:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	ret

This is part of rdar://9289558



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-17 17:12:08 +00:00
Chris Lattner
fd3f635103 Fix rdar://9289512 - not folding load into compare at -O0
The basic issue here is that bottom-up isel is matching the branch
and compare, and was failing to fold the load into the branch/compare
combo.  Fixing this (by allowing folding into any instruction of a
sequence that is selected) allows us to produce things like:


cmpb    $0, 52(%rax)
je      LBB4_2

instead of:

movb    52(%rax), %cl
cmpb    $0, %cl
je      LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up
compile time by putting less pressure on the register allocator and 
generating less code.

This was one of the biggest classes of missing load folding.  Implementing
this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm)
line count.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129656 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-17 06:35:44 +00:00
Eli Friedman
2f108f81c1 Remove working entry from README.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129654 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-17 02:36:27 +00:00
Chris Lattner
fff65b354f fix rdar://9289583 - fast isel should handle non-canonical commutative binops
allowing us to fold the immediate into the 'and' in this case:

int test1(int i) {
  return 8&i;
}



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129653 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-17 01:16:47 +00:00
Eli Friedman
e545d38a28 PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext.
Returning a new node makes the code try to replace the old node, which
in the included testcase is killed by CSE.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129650 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-16 23:25:34 +00:00
Evan Cheng
65279cb9bd Fix divmod libcall lowering. Convert to {S|U}DIVREM first and then expand the node to a libcall. rdar://9280991
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129633 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-16 03:08:26 +00:00
Akira Hatanaka
fc693036a4 Re-enable test o32_cc_vararg.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129616 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 22:23:09 +00:00
Cameron Zwarich
0cb11ac32f Add ORR and EOR to the CMP peephole optimizer. It's hard to get isel to generate
a case involving EOR, so I only added a test for ORR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129610 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 21:24:38 +00:00
Rafael Espindola
27c4ba16ae Add this test back for Darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129607 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 21:06:27 +00:00
Cameron Zwarich
b485de5d8c The AND instruction leaves the V flag unmodified, so it falls victim to the same
problem as all of the other instructions we fold with CMPs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129602 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 20:45:00 +00:00
Cameron Zwarich
ca3f6a3925 Add missing register forms of instructions to the ARM CMP-folding code. This
fixes <rdar://problem/9287901>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129599 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 20:28:28 +00:00
Akira Hatanaka
99a2e98edd Add pass that expands pseudo instructions into target instructions after register allocation. Define pseudos that get expanded into mtc1 or mfc1 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129594 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 19:52:08 +00:00
Rafael Espindola
f0adba9a7e Add 129518 back with a fix for when we are producing eh just because of debug info.
Change ELF systems to use CFI for producing the EH tables. This reduces the
size of the clang binary in Debug builds from 690MB to 679MB.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129571 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 15:11:06 +00:00
NAKAMURA Takumi
bcb8c6d09e Revert r129518, "Change ELF systems to use CFI for producing the EH tables. This reduces the"
It broke several builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129557 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 03:35:57 +00:00
Evan Cheng
9eec66e604 Fix another fcopysign lowering bug. If src is f64 and destination is f32, don't
forget to right shift the source by 32 first. rdar://9287902


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129556 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 01:31:00 +00:00
Michael J. Spencer
4babeeeeed Add 3DNow! intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129551 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 00:32:41 +00:00
Evan Cheng
06b2a60ef9 Follow up on r127913. Fix Thumb revsh isel. rdar://9286766
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129548 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-14 23:27:44 +00:00
Rafael Espindola
3dae6e7333 Change ELF systems to use CFI for producing the EH tables. This reduces the
size of the clang binary in Debug builds from 690MB to 679MB.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129518 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-14 15:18:53 +00:00
Andrew Trick
12f0dc6bb5 In the pre-RA scheduler, maintain cmp+br proximity.
This is done by pushing physical register definitions close to their
use, which happens to handle flag definitions if they're not glued to
the branch. This seems to be generally a good thing though, so I
didn't need to add a target hook yet.

The primary motivation is to generate code closer to what people
expect and rule out missed opportunity from enabling macro-op
fusion. As a side benefit, we get several 2-5% gains on x86
benchmarks. There is one regression:
SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is
an independent scheduler bug that will be tracked separately.
See rdar://problem/9283108.

Incidentally, pre-RA scheduling is only half the solution. Fixing the
later passes is tracked by:
<rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump

Fixes:
<rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129508 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-14 05:15:06 +00:00
Bill Wendling
d336de318e As Dan pointed out, movzbl, movsbl, and friends are nicer than their alias
(movzx/movsx) because they give more information. Revert that part of the patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-14 01:46:37 +00:00
Bill Wendling
c6df9883da Have the X86 back-end emit the alias instead of what's being aliased. In most
cases, it's much nicer and more informative reading the alias.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129497 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-14 01:11:51 +00:00
Cameron Zwarich
5af60ce2a8 Fix a typo in an ARM-specific DAG combine. This fixes <rdar://problem/9278274>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129468 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 21:01:19 +00:00
Cameron Zwarich
1335022e19 Fix a regression caused by r102515 where explicit alignment on globals is
ignored. There was a test to catch this, but it was just blindly updated in
a large change. This fixes another part of <rdar://problem/9275290>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129466 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 20:36:04 +00:00
Cameron Zwarich
eb04a33dc7 Fix an obvious problem with an alignment computation. AsmPrinter actually does
the max itself, so it is not easy to write a test case for this, but I added a
test case that would fail if the code in AsmPrinter were removed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129432 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 09:02:43 +00:00
Cameron Zwarich
d8b88d8558 If a global variable has a specified alignment that is less than the preferred
alignment for its type, use the minimum of the specified alignment and the ABI
alignment. This fixes <rdar://problem/9275290>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129428 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 06:03:16 +00:00
Andrew Trick
87896d9368 Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.

Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129421 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 00:38:32 +00:00
Bill Wendling
f93f7b2446 Reapply r129401 with patch for clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129419 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 00:36:11 +00:00
Eric Christopher
d31df87f41 Temporarily revert r129408 to see if it brings the bots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129417 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 00:20:59 +00:00
Eric Christopher
8e98de9979 Fix a bug where we were counting the alias sets as completely used
registers for fast allocation.

Fixes rdar://9207598


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129408 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 23:23:14 +00:00
Bill Wendling
f9b2dc66c8 Revert r129401 for now. Clang is using the old way of doing things.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129403 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 22:59:27 +00:00
Bill Wendling
d5f323d70b Remove the unaligned load intrinsics in favor of using native unaligned loads.
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.

First part of <rdar://problem/8460511>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129401 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 22:46:31 +00:00
Andrew Trick
c558bf3972 Revert 129383. It causes some targets to hit a scheduler assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129385 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 20:14:07 +00:00
Andrew Trick
3eb4319313 PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make these heuristic adjustments to node latency work,
I also needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129383 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 19:54:36 +00:00
Cameron Zwarich
d0aacbcc2e Split a store of a VMOVDRR into two integer stores to avoid mixing NEON and ARM
stores of arguments in the same cache line. This fixes the second half of
<rdar://problem/8674845>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129345 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 02:24:17 +00:00
Wesley Peck
3d820baf19 Add scheduling information for the MBlaze backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129311 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-11 22:31:52 +00:00
Evan Cheng
d67705faaa Look pass copies when determining whether hoisting would end up inserting more copies. rdar://9266679
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129297 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-11 21:09:18 +00:00
Chris Lattner
008c93e884 look for the verboten argument slot access in any order, thanks to Frits
for pointing this out


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129217 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 17:00:34 +00:00
Chris Lattner
0403b473dd Fix a bug where RecursivelyDeleteTriviallyDeadInstructions could
delete the instruction pointed to by CGP's current instruction
iterator, leading to a crash on the testcase.  This fixes PR9578.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129200 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 07:05:44 +00:00
Chris Lattner
b99e000d79 fix two completely broken tests, which were matching due to PR9629.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129195 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 06:34:38 +00:00
Chris Lattner
9f264597d8 remove a bunch of CHECK lines that aren't checking what
they thought they were, because alternation was expanding
wrong in {{}}'s.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129194 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 06:31:06 +00:00
Chris Lattner
b3452ea35c have dag combine zap "store undef", which can be formed during call lowering
with undef arguments.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129185 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 02:32:02 +00:00
Chris Lattner
4ae6a4d696 don't test for codegen of 'store undef'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129184 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 02:31:26 +00:00
Evan Cheng
4da0c7c0c9 Change -arm-trap-func= into a non-arm specific option. Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129152 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-08 21:37:21 +00:00
Evan Cheng
274d8d4eba Add option to emit @llvm.trap as a function call instead of a trap instruction. rdar://9249183.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129107 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 20:31:12 +00:00
Andrew Trick
5469976506 Added a check in the preRA scheduler for potential interference on a
induction variable. The preRA scheduler is unaware of induction vars,
so we look for potential "virtual register cycles" instead.

Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129100 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 19:54:57 +00:00
Akira Hatanaka
9777e7afd4 Fix handling of functions with internal linkage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129099 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 19:51:44 +00:00
Tanya Lattner
0433b21c98 Prevent ARM DAG Combiner from doing an AND or OR combine on an illegal vector type (vectors of size 3). Also included test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129074 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 15:24:20 +00:00
Evan Cheng
2c69f8eec6 Change -arm-divmod-libcall to a target neutral option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129045 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 00:58:44 +00:00
Owen Anderson
df298c9ea6 Teach the ARM peephole optimizer that RSB, RSC, ADC, and SBC can be used for folded comparisons, just like ADD and SUB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129038 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-06 23:35:59 +00:00
Jakob Stoklund Olesen
c5ddb74089 These tests no longer require linear scan because reserved register coalescing is now universal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128936 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 21:40:41 +00:00
Jakob Stoklund Olesen
cfafc54040 Run LiveDebugVariables in RegAllocBasic and RegAllocGreedy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128935 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 21:40:37 +00:00
Jakob Stoklund Olesen
57b0fb7850 Fix one more batch of X86 tests to be register allocation dependent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128919 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 20:20:30 +00:00
Jakob Stoklund Olesen
3520019931 When dead code elimination removes all but one use, try to fold the single def into the remaining use.
Rematerialization can leave single-use loads behind that we might as well fold whenever possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128918 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 20:20:26 +00:00
Johnny Chen
d05a824e1a Fix test-llvm failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128906 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 18:41:40 +00:00
Stuart Hastings
ac42a19217 ARM doesn't support byval yet. XFAIL this test until it does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128891 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 17:16:21 +00:00
Jakob Stoklund Olesen
b793bc1cca Ensure all defs referring to a virtual register are marked dead by addRegisterDead().
There can be multiple defs for a single virtual register when they are defining
sub-registers.

The missing <dead> flag was stopping the inline spiller from eliminating dead
code after rematerialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128888 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 16:53:50 +00:00
Rafael Espindola
8439747236 Print visibility info for external variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128887 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 15:51:32 +00:00
Eric Christopher
4fccc86237 Fix up testcase for previous commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128870 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 00:56:01 +00:00
Jakob Stoklund Olesen
b0e47cdf3d Fix register-dependent X86 tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128867 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 00:32:44 +00:00
Jakob Stoklund Olesen
4662a9f270 Allow coalescing with reserved physregs in certain cases:
When a virtual register has a single value that is defined as a copy of a
reserved register, permit that copy to be joined. These virtual register are
usually copies of the stack pointer:

  %vreg75<def> = COPY %ESP; GR32:%vreg75
  MOV32mr %vreg75, 1, %noreg, 0, %noreg, %vreg74<kill>
  MOV32mi %vreg75, 1, %noreg, 8, %noreg, 0
  MOV32mi %vreg75<kill>, 1, %noreg, 4, %noreg, 0
  CALLpcrel32 ...

Coalescing these virtual registers early decreases register pressure.
Previously, they were coalesced by RALinScan::attemptTrivialCoalescing after
register allocation was completed.

The lower register pressure causes the mcinst-lowering-cmp0.ll test case to fail
because it depends on linear scan spilling a particular register.

I am deleting 2008-08-05-SpillerBug.ll because it is counting the number of
instructions emitted, and its revision history shows the 'correct' count being
edited many times.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128845 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-04 21:00:03 +00:00
Jakob Stoklund Olesen
866bdadd83 Disable the PowerPC/Atomics-64 test.
The code inserted by PPCTargetLowering::EmitInstrWithCustomInserter for ppc64 is
wrong, and I don't know how to fix it. It seems to be using the correct register
classes for pointers, but it inserts all 32-bit instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128835 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-04 17:57:26 +00:00
Jakob Stoklund Olesen
92d098642d Fix PowerPC tests to be register allocator independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128827 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-04 17:07:03 +00:00
Che-Liang Chiou
357be5e4ae ptx: support setp's 4-operand format
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128767 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-02 08:51:39 +00:00
Cameron Zwarich
4071a71112 Do some peephole optimizations to remove pointless VMOVs from Neon to integer
registers that arise from argument shuffling with the soft float ABI. These
instructions are particularly slow on Cortex A8. This fixes one half of
<rdar://problem/8674845>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128759 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-02 02:40:43 +00:00
Jim Grosbach
9a3507f091 LDRD/STRD instructions should print both Rt and Rt2 in the asm string.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128736 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-01 20:26:57 +00:00
Akira Hatanaka
20ada98de8 Add code for analyzing FP branches. Clean up branch Analysis functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128718 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-01 17:39:08 +00:00
Evan Cheng
5b76c63f83 Add test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128707 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-01 06:27:25 +00:00
Evan Cheng
df269b9129 FileCheck'ify test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128706 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-01 03:36:33 +00:00
Jakob Stoklund Olesen
c3178f85b5 Fix Thumb and Thumb2 tests to be register allocator independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128690 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 23:31:50 +00:00
Jakob Stoklund Olesen
1db952d0c6 Provide a legal pointer register class when targeting thumb1.
The LocalStackSlotAllocation pass was creating illegal registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128687 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 23:02:15 +00:00
Jakob Stoklund Olesen
0caa420042 Fix SystemZ tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128686 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 23:02:12 +00:00
Jakob Stoklund Olesen
ca6fd009ad Fix ARM tests to be register allocator independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128680 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 22:14:03 +00:00
Evan Cheng
463d358f1d Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier
accumulator forwarding:
vadd d3, d0, d1
vmul d3, d3, d2
=>
vmul d3, d0, d2
vmla d3, d1, d2


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128665 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 19:38:48 +00:00
Jakob Stoklund Olesen
a6f7499244 Fix Mips, Sparc, and XCore tests that were dependent on register allocation.
Add an extra run with -regalloc=basic to keep them honest.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128654 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 18:42:43 +00:00
Akira Hatanaka
1d6b38d9d3 Added support for FP conditional move instructions and fixed bugs in handling of FP comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128650 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 18:26:17 +00:00
Jakob Stoklund Olesen
280ea1a746 Don't completely eliminate identity copies that also modify super register liveness.
Turn them into noop KILL instructions instead. This lets the scavenger know when
super-registers are killed and defined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128645 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 17:55:25 +00:00
Jakob Stoklund Olesen
8e53aca51a Mark all uses as <undef> when joining a copy.
This way, shrinkToUses() will ignore the instruction that is about to be
deleted, and we avoid leaving invalid live ranges that SplitKit doesn't like.

Fix a misunderstanding in MachineVerifier about <def,undef> operands. The
<undef> flag is valid on def operands where it has the same meaning as <undef>
on a use operand. It only applies to sub-register defines which also read the
full register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128642 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 17:23:25 +00:00
Richard Osborne
e8f3533323 Add XCore intrinsics for initializing / starting / synchronizing threads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128633 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 15:13:13 +00:00
Jakob Stoklund Olesen
312babc93f Pick a conservative register class when creating a small live range for remat.
The rematerialized instruction may require a more constrained register class
than the register being spilled. In the test case, the spilled register has been
inflated to the DPR register class, but we are rematerializing a load of the
ssub_0 sub-register which only exists for DPR_VFP2 registers.

The register class is reinflated after spilling, so the conservative choice is
only temporary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128610 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 03:54:44 +00:00
Evan Cheng
ee2e0e347e Don't try to create zero-sized stack objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128586 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-30 23:44:13 +00:00
Cameron Zwarich
c0e6d780cd Add a ARM-specific SD node for VBSL so that forms with a constant first operand
can be recognized. This fixes <rdar://problem/9183078>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128584 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-30 23:01:21 +00:00
Evan Cheng
92e3916c3b Add intrinsics @llvm.arm.neon.vmulls and @llvm.arm.neon.vmullu.* back. Frontends
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.

Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.

First part of rdar://8832507, rdar://9203134


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128502 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 23:06:19 +00:00
Cameron Zwarich
3007d3331b Add Neon SINT_TO_FP and UINT_TO_FP lowering from v4i16 to v4f32. Fixes
<rdar://problem/8875309> and <rdar://problem/9057191>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128492 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 21:41:55 +00:00
Rafael Espindola
5d71de6014 Reduce test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128445 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 02:18:54 +00:00
Evan Cheng
78fe9ababe Optimizing (zext A + zext B) * C, to (VMULL A, C) + (VMULL B, C) during
isel lowering to fold the zero-extend's and take advantage of no-stall
back to back vmul + vmla:
 vmull q0, d4, d6
 vmlal q0, d5, d6
is faster than
 vaddl q0, d4, d5
 vmovl q1, d6                                                                                                                                                                             
 vmul  q0, q0, q1

This allows us to vmull + vmlal for:
    f = vmull_u8(   vget_high_u8(s), c);
    f = vmlal_u8(f, vget_low_u8(s),  c);

rdar://9197392


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128444 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 01:56:09 +00:00
Bill Wendling
2d930db24f In some cases, the "fail BB dominator" may be null after the BB was split (and
becomes reachable when before it wasn't). Check to make sure that it's not null
before trying to use it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128434 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-28 23:02:18 +00:00
Jakob Stoklund Olesen
adb877d62e Collect and coalesce DBG_VALUE instructions before emitting the function.
Correctly terminate the range of register DBG_VALUEs when the register is
clobbered or when the basic block ends.

The code is now ready to deal with variables that are sometimes in a register
and sometimes on the stack. We just need to teach emitDebugLoc to say 'stack
slot'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128327 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-26 02:19:36 +00:00
Eric Christopher
29aeed1bf8 Fix the bfi handling for or (and a mask) (and b mask). We need the two
masks to match inversely for the code as is to work. For the example given
we actually want:

bfi r0, r2, #1, #1

not #0, however, given the way the pattern is written it's not possible
at the moment.

Fixes rdar://9177502


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-26 01:21:03 +00:00
Jakob Stoklund Olesen
15a3ea0628 Emit less labels for debug info and stop emitting .loc directives for DBG_VALUEs.
The .dot directives don't need labels, that is a leftover from when we created
line number info manually.

Instructions following a DBG_VALUE can share its label since the DBG_VALUE
doesn't produce any code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128284 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-25 17:20:59 +00:00
Devang Patel
439e0c79f5 Move test in x86 specific area.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128245 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-24 22:39:09 +00:00
Devang Patel
23670e5b95 Keep track of directory namd and fIx regression caused by Rafael's patch r119613.
A better approach would be to move source id handling inside MC.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128233 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-24 20:30:50 +00:00
NAKAMURA Takumi
a2e0762fae Target/X86: [PR8777][PR8778] Tweak alloca/chkstk for Windows targets.
FIXME: Some cleanups would be needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128206 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-24 07:07:00 +00:00
Cameron Zwarich
6e8ffc1c4d Do early taildup of ret in CodeGenPrepare for potential tail calls that have a
void return type. This fixes PR9487.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128197 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-24 04:52:10 +00:00
Devang Patel
36dca60f5c Enable GlobalMerge on darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128183 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-23 23:34:19 +00:00
Andrew Trick
f6c39412dd Revert r128175.
I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128181 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-23 23:11:02 +00:00
Evan Cheng
2c33915628 Cmp peephole optimization isn't always safe for signed arithmetics.
int tries = INT_MAX;    
while (tries > 0) {
      tries--;
}

The check should be:
        subs    r4, #1
        cmp     r4, #0
        bgt     LBB0_1

The subs can set the overflow V bit when r4 is INT_MAX+1 (which loop
canonicalization apparently does in this case). cmp #0 would have cleared
it while not changing the N and Z bits. Since BGT is dependent on the V
bit, i.e. (N == V) && !Z, it is not safe to eliminate the cmp #0.

rdar://9172742


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-23 22:52:04 +00:00
Eli Friedman
b141099c14 PR9535: add support for splitting and scalarizing vector ISD::FP_ROUND.
Also cleaning up some duplicated code while I'm here.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128176 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-23 22:18:48 +00:00
Andrew Trick
d8fa01fbd7 Reapply Eli's r127852 now that the pre-RA scheduler can spill EFLAGS.
(target-specific branchless method for double-width relational comparisons on x86)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128175 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-23 22:16:02 +00:00
Jakob Stoklund Olesen
28cf1156c9 Reapply r128045 and r128051 with fixes.
This will extend the ranges of debug info variables in registers until they are
clobbered.

Fix 1: Don't mistake DBG_VALUE instructions referring to incoming arguments on
the stack with DBG_VALUE instructions referring to variables in the frame
pointer. This fixes the gdb test-suite failure.

Fix 2: Don't trace through copies to physical registers setting up call
arguments. These registers are call clobbered, and the source register is more
likely to be a callee-saved register that can be extended through the call
instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128114 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-22 22:33:08 +00:00
Andrew Trick
c1dbd5d9c3 Revert r128045 and r128051, debug info enhancements.
Temporarily reverting these to see if we can get llvm-objdump to link. Hopefully this is not the problem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128097 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-22 19:18:42 +00:00
Che-Liang Chiou
5e0872e099 ptx: add analyze/insert/remove branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128084 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-22 14:12:00 +00:00
Jakob Stoklund Olesen
e17232ee4d Dont emit 'DBG_VALUE %noreg, ...' to terminate user variable ranges.
These ranges get completely jumbled by the post-ra scheduler, and it is not
really reasonable to expect it to make sense of them.

Instead, teach DwarfDebug to notice when user variables in registers are
clobbered, and terminate the ranges there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128045 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-22 00:21:41 +00:00
Dan Gohman
b55d6b6a7e Fix fast-isel address mode folding to avoid folding instructions
outside of the current basic block. This fixes PR9500, rdar://9156159.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128041 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-22 00:04:35 +00:00
Rafael Espindola
7c18fa87a4 Write the section table and the section data in the same order that
gun as does. This makes it a lot easier to compare the output of both
as the addresses are now a lot closer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127972 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-20 18:44:20 +00:00
Daniel Dunbar
7a90e04fc7 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127954 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 21:47:14 +00:00
Evan Cheng
ae16d6b972 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 17:17:39 +00:00
Nadav Rotem
06cc324b9d Add support for legalizing UINT_TO_FP of vectors on platforms which do
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127951 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 13:09:10 +00:00
Andrew Trick
f6325b9700 FileCheckize a test.
(one-by-one until valgrind is happy)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127925 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 00:41:39 +00:00
Evan Cheng
3f30af3f45 Match a few more obvious patterns to revsh. rdar://9147637.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127913 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 21:52:42 +00:00
Eli Friedman
b6192d2a9f Revert r127852; it's apparently causing an ICE on mingw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127909 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 21:12:29 +00:00
Justin Holewinski
8af78c9cf8 PTX: Fix various codegen issues
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
- (most PTX instructions cannot take global variable immediates)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127895 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 19:24:28 +00:00
Che-Liang Chiou
8902ecb682 ptx: fix parameter order that is reversed
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127874 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 11:23:56 +00:00
Che-Liang Chiou
88d3367baa ptx: add unconditional and conditional branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127873 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 11:08:52 +00:00
Eli Friedman
b4b8b0cc90 Add a target-specific branchless method for double-width relational
comparisons on x86.  Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.

This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127852 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 02:34:11 +00:00
Benjamin Kramer
1c10b8de46 BuildUDIV: If the divisor is even we can simplify the fixup of the multiplied value by introducing an early shift.
This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into
	shrl	$2, %edi
	imulq	$613566757, %rdi, %rax
	shrq	$32, %rax
	ret

instead of
	movl    %edi, %eax
	imulq   $613566757, %rax, %rcx
	shrq    $32, %rcx
	subl    %ecx, %eax
	shrl    %eax
	addl    %ecx, %eax
	shrl    $4, %eax

on x86_64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127829 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 20:39:14 +00:00
Richard Osborne
11bd0784d9 Add XCore intrinsic for setpsc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127821 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 18:42:05 +00:00
NAKAMURA Takumi
1aa7f7a997 test/CodeGen/X86/h-registers-1.ll: Add explicit -mtriple=x86_64-linux. It does not need to be checked on x86_64-win32 (aka Win64).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127800 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 04:24:40 +00:00