Commit Graph

1622 Commits

Author SHA1 Message Date
Arnold Schwaighofer
d9316dacf5 ARM NEON: Handle v16i8 and v8i16 reverse shuffles
Lower reverse shuffles to a vrev64 and a vext instruction instead of the default
legalization of storing and loading to the stack. This is important because we
generate reverse shuffles in the loop vectorizer when we reverse store to an
array.

  uint8_t Arr[N];
  for (i = 0; i < N; ++i)
    Arr[N - i - 1] = ...

radar://13171760

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174929 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-12 01:58:32 +00:00
Bob Wilson
8f637adbd3 Revert 172027 and 174336. Remove diagnostics about over-aligned stack objects.
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good.  We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it.  We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead.  <rdar://problem/13127907>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174741 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-08 20:35:15 +00:00
Manman Ren
9c5861fdbd Attempt to recover gdb bot after r174445.
Failure: undefined symbol 'Lline_table_start0'.
Root-cause: we use a symbol subtraction to calculate at_stmt_list, but
the line table entries are not dumped in the assembly.
Fix: use zero instead of a symbol subtraction for Compile Unit 0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174479 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-06 00:59:41 +00:00
Manman Ren
43213cf1ac Dwarf: support for LTO where a single object file can have multiple line tables
We generate one line table for each compilation unit in the object file.
Reviewed by Eric and Kevin.

rdar://problem/13067005


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174445 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 21:52:47 +00:00
Chad Rosier
1e45487dfd [SjLj Prepare] When demoting an invoke instructions to the stack, if the normal
edge is critical, then split it so we can insert the store.
rdar://13126179

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174418 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 18:23:10 +00:00
Logan Chien
b0c899666a Link .ARM.exidx with corresponding text section.
The sh_link in the ELF section header of .ARM.exidx should
be filled with the section index of the corresponding text
section.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174372 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 14:18:59 +00:00
Manman Ren
91b978e157 [Stack Alignment] emit warning instead of a hard error
Per discussion in rdar://13127907, we should emit a hard error only if
people write code where the requested alignment is larger than achievable
and assumes the low bits are zeros. A warning should be good enough when
we are not sure if the source code assumes the low bits are zeros.

rdar://13127907


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174336 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-04 23:45:08 +00:00
Eli Bendersky
0f156af831 Add a special ARM trap encoding for NaCl.
More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html

Patch by JF Bastien



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173943 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 16:30:19 +00:00
Logan Chien
620d5bd8e4 Add missing header and test cases for r173939.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173941 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 15:48:50 +00:00
Tim Northover
0adfdedacb Fix 64-bit atomic operations in Thumb mode.
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173780 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 09:06:13 +00:00
Silviu Baranga
4a9256f265 Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173437 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 10:39:49 +00:00
Jakob Stoklund Olesen
d32eea9636 Remove some register allocation order dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172874 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-19 00:03:32 +00:00
Tim Northover
5f2801bd65 Simplify writing floating types to assembly.
This removes previous special cases for each floating-point type in favour of a
shared codepath.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172189 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11 10:36:13 +00:00
Manman Ren
86441169da Stack Alignment: throw error if we can't satisfy the minimal alignment
requirement when creating stack objects in MachineFrameInfo.

Add CreateStackObjectWithMinAlign to throw error when the minimal alignment
can't be achieved and to clamp the alignment when the preferred alignment
can't be achieved. Same is true for CreateVariableSizedObject.
Will not emit error in CreateSpillStackObject or CreateStackObject.

As long as callers of CreateStackObject do not assume the object will be
aligned at the requested alignment, we should not have miscompile since
later optimizations which look at the object's alignment will have the correct
information.

rdar://12713765


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172027 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 01:10:10 +00:00
Andrew Trick
47579cf390 MIsched: add an ILP window property to machine model.
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.

I converted some in-order scheduling tests to A2. Hal is working on
more test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171946 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 03:36:49 +00:00
Tim Northover
a67352d401 Specify complete triple for fp128 tests.
This avoids FileCheck failing over different comment characters in
assembly (notably powerpc64 on Linux vs Darwin) and should fix David's
build-bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171886 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08 19:36:33 +00:00
Tim Northover
90f011e0ba Allow the asm printer to print fp128 values properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171866 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08 16:56:23 +00:00
Silviu Baranga
e97165901e Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same the trasnformation as before to merge them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171730 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07 12:31:25 +00:00
Bob Wilson
103b4a571e Revert "Adding support for llvm.arm.neon.vaddl[su].* and"
This reverts r170694.  The operations can be represented in IR without
adding any new intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170765 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 21:09:38 +00:00
Renato Golin
332bd79951 Adding support for llvm.arm.neon.vaddl[su].* and
llvm.arm.neon.vsub[su].* intrinsics.

Patch by Pete Couperus <pjcoup@gmail.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170694 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 13:52:11 +00:00
Evan Cheng
733c6b1db1 LLVM sdisel normalize bit extraction of the form:
((x & 0xff00) >> 8) << 2
to
 (x >> 6) & 0x3fc

This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevents the ARM backend from using the bit
extraction instructions. And worse since the mask materialization may require
an addition instruction. This comes up fairly frequently when the result of 
the bit twiddling is used as memory address. e.g.

 = ptr[(x & 0xFF0000) >> 16]

We want to generate:
  ubfx   r3, r1, #16, #8
  ldr.w  r3, [r0, r3, lsl #2]

vs.
  mov.w  r9, #1020
  and.w  r2, r9, r1, lsr #14
  ldr    r2, [r0, r2]

Add a late ARM specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the
'base + offset' address computation; change the mask to one which doesn't have
trailing zeros and enable the use of ubfx.

Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive
as shifter operands are not always free. It's only done for lsh 1 / 2. It's
known to be free on some cpus and they are most common for address
computation.

This is a slight win for blowfish, rijndael, etc.

rdar://12870177


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170581 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 20:16:09 +00:00
Quentin Colombet
b519351b87 Disable ARM partial flag dependency optimization at -Oz
To not over constrain the scheduler for ARM in thumb mode, some optimizations  for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency).

Disables this check when code size is the priority, i.e., code is compiled with -Oz.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170462 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18 22:47:16 +00:00
Andrew Trick
04f52e1300 MISched: add dependence to ExitSU to model live-out latency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170454 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18 20:53:01 +00:00
Evan Cheng
376642ed62 Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 23:21:26 +00:00
Tim Northover
6eb3e87df0 Added Mapping Symbols for ARM ELF
Before this patch, when you objdump an LLVM-compiled file, objdump tried to
decode data-in-code sections as if they were code.  This patch adds the missing
Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).

Patch based on work by Greg Fitzgerald.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169609 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 16:50:23 +00:00
Dmitri Gribenko
00e97c2126 Fix typos in CHECK lines.
Patch by Alexander Zinenko.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 21:24:47 +00:00
Evan Cheng
824ec7d01a Properly fix the tes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169464 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 02:29:29 +00:00
NAKAMURA Takumi
e1ab8e3e73 llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I guess Chad expects fastisel here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169463 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 02:22:58 +00:00
Chad Rosier
c9758b1366 [arm fast-isel] Make the fast-isel implementation of memcpy respect alignment.
rdar://12821569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 01:34:31 +00:00
Evan Cheng
8a7186dbc2 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 01:28:01 +00:00
Evan Cheng
c8e7045c8a ARM custom lower ctpop for vector types. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 22:41:50 +00:00
Bill Wendling
9493dae613 Use the 'count' attribute to calculate the upper bound of an array.
The count attribute is more accurate with regards to the size of an array. It
also obviates the upper bound attribute in the subrange. We can also better
handle an unbound array by setting the count to -1 instead of the lower bound to
1 and upper bound to 0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 21:34:03 +00:00
Bill Wendling
a7645a3c66 Add a 'count' field to the DWARF subrange.
The count field is necessary because there isn't a difference between the 'lo'
and 'hi' attributes for a one-element array and a zero-element array. When the
count is '0', we know that this is a zero-element array. When it's >=1, then
it's a normal constant sized array. When it's -1, then the array is unbounded.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 06:20:49 +00:00
Manman Ren
69261a6442 Stack Alignment: when creating stack objects in MachineFrameInfo, make sure
the alignment is clamped to TargetFrameLowering.getStackAlignment if the target
does not support stack realignment or the option "realign-stack" is off.

This will cause miscompile if the address is treated as aligned and add is
replaced with or in DAGCombine.

Added a bool StackRealignable to TargetFrameLowering to check whether stack
realignment is implemented for the target. Also added a bool RealignOption
to MachineFrameInfo to check whether the option "realign-stack" is on.

rdar://12713765


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169197 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 00:52:33 +00:00
Jakob Stoklund Olesen
8c3dccde92 Simplify REG_SEQUENCE lowering.
The TwoAddressInstructionPass takes the machine code out of SSA form by
expanding REG_SEQUENCE instructions into copies. It is no longer
necessary to rewrite the registers used by a REG_SEQUENCE instruction
because the new coalescer algorithm can do it now.

REG_SEQUENCE is just converted to a sequence of sub-register copies now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:06:44 +00:00
Sebastian Pop
cb4953089b Codegen failure for vmull with small vectors
Codegen was failing with an assertion because of unexpected vector
operands when legalizing the selection DAG for a MUL instruction.

The asserting code was legalizing multiplies for vectors of size 128
bits. It uses a custom lowering to try and detect cases where it can
use a VMULL instruction instead of a VMOVL + VMUL.  The code was
looking for input operands to the MUL that had been sign or zero
extended. If it found the extended operands it would drop the
sign/zero extension and use the original vector size as input to a
VMULL instruction.

The code assumed that the original input vector was 64 bits so that
after dropping the extension it would fit directly into a D register
and could be used as an operand of a VMULL instruction. The input
code that trigger the failure used a vector of <4 x i8> that was
sign extended to <4 x i32>. It was not safe to drop the sign
extension in this case because the original vector is only 32 bits
wide. The fix is to insert a sign extension for the vector to reach
the required 64 bit size. In this particular example, the vector would
need to be sign extented to a <4 x i16>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169024 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 19:08:04 +00:00
Bill Wendling
7360116048 Handle the situation where CodeGenPrepare removes a reference to a BB that has
the last invoke instruction in the function. This also removes the last landing
pad in an function. This is fine, but with SjLj EH code, we've already placed a
bunch of code in the 'entry' block, which expects the landing pad to stick
around.

When we get to the situation where CGP has removed the last landing pad, go
ahead and nuke the SjLj instructions from the 'entry' block.
<rdar://problem/12721258>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168930 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 19:38:06 +00:00
Silviu Baranga
35b3df6e31 Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168886 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 14:41:25 +00:00
Jakob Stoklund Olesen
89bea17af2 Avoid rewriting instructions twice.
This could cause miscompilations in targets where sub-register
composition is not always idempotent (ARM).

<rdar://problem/12758887>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168837 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 00:26:11 +00:00
Benjamin Kramer
350c00843b ARM: Implement CanLowerReturn so large vectors get expanded into sret.
Fixes 14337.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168809 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 20:55:10 +00:00
Chad Rosier
92a6e532b8 Add -verify-machineinstrs to these fast-isel test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168723 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 20:49:56 +00:00
Manman Ren
39834da697 CSE: allow PerformTrivialCoalescing to check copies across basic block
boundaries.

Given the following case:
BB0
  %vreg1<def> = SUBrr %vreg0, %vreg7
  %vreg2<def> = COPY %vreg7
BB1
  %vreg10<def> = SUBrr %vreg0, %vreg2
We should be able to CSE between SUBrr in BB0 and SUBrr in BB1.

rdar://12462006


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168717 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 18:58:41 +00:00
Ulrich Weigand
dba37a3c43 Never use .lcomm on platforms where it does not accept an alignment
argument.  Instead, use a pair of .local and .comm directives.

This avoids spurious differences between binaries built by the
integrated assembler vs. those built by the external assembler,
since the external assembler may impose alignment requirements
on .lcomm symbols where the integrated assembler does not.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 16:11:16 +00:00
Chad Rosier
277068fe40 Extend test case for r168657.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168658 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 01:10:48 +00:00
Tim Northover
310f248c22 Fix physical register liveness calculations:
+ Take account of clobbers
+ Give outputs priority over inputs since they happen later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168360 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-20 09:56:11 +00:00
Anton Korobeynikov
2386fc8daa Factor out type info emission into separate routine.
It turned out that ARM wants different layout of type infos.
This is yet another patch in attempt to fix PR7187 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-19 21:06:26 +00:00
Eli Friedman
43147afd71 Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168240 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 01:52:46 +00:00
Chad Rosier
0a63b6ac79 [fast-isel] Add the -verify-machineinstrs to these test cases. The remaining
test cases require fixes to fast-isel before the verifier can be enabled.
Part of rdar://12594152

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168233 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 00:42:06 +00:00
Weiming Zhao
e56764bad1 Remove hard coded registers in ARM ldrexd and strexd instructions
This patch replaces the hard coded GPR pair [R0, R1] of
Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with
even/odd GPRPair reg class.
Similar to the lowering of atomic_64 operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:55:34 +00:00
Anton Korobeynikov
b1a392e7c5 Make sure FABS on v2f32 and v4f32 is legal on ARM NEON
This fixes PR14359


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:15:20 +00:00
Eli Friedman
846ce8ea67 Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing
case to vector legalization so this actually works.

Patch by Pete Couperus.  Fixes PR12540.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168107 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 22:44:27 +00:00
Nadav Rotem
50b66387e3 The code pattern "imm0_255_neg" is used for checking if an immediate value is a small negative number.
This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of
a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag.

rdar://12028498



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167963 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 19:39:15 +00:00
Anton Korobeynikov
062a6c8380 Fix really stupid ARM EHABI info generation bug: we should not emit
eh table and handler data if there are no landing pads in the function.
Patch by Logan Chien with some cleanups from me.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167945 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 19:13:30 +00:00
Anton Korobeynikov
25efd6d556 Use TARGET2 relocation for TType references on ARM.
Do some cleanup of the code while here.

Inspired by patch by Logan Chien!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167904 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 01:47:00 +00:00
Andrew Trick
f546ac5f9b Cleanup the main RegisterCoalescer loop.
Block priorities still apply outside loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167793 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 00:34:44 +00:00
Andrew Trick
ae692f2bae misched: Infrastructure for weak DAG edges.
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 19:28:57 +00:00
Evan Cheng
b341fac05a Disable the Thumb no-return call optimization:
mov lr, pc
b.w _foo

The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.

rdar://12663632


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 02:09:05 +00:00
Amara Emerson
214fd3d244 Recommit modified r167540.
Improve ARM build attribute emission for architectures types.
This also changes the default architecture emitted for a generic CPU to "v7".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167574 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-08 09:51:45 +00:00
Quentin Colombet
43934aee71 Vext Lowering was missing opportunities
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 21:32:17 +00:00
Quentin Colombet
9a419f656e Change ForceSizeOpt attribute into MinSize attribute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 16:32:52 +00:00
Jakob Stoklund Olesen
573303e62a Completely disallow partial copies in adjustCopiesBackFrom().
Partial copies can show up even when CoalescerPair.isPartial() returns
false. For example:

   %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31

Such a partial-partial copy is not good enough for the transformation
adjustCopiesBackFrom() needs to do.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:51:52 +00:00
Quentin Colombet
80acd97266 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 01:10:17 +00:00
Jakob Stoklund Olesen
17f42e02a1 Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers."
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-26 23:39:46 +00:00
Evan Cheng
d258eb3ec5 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 19:53:01 +00:00
Bill Wendling
8f47fc8f00 When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-22 23:30:04 +00:00
Shuxin Yang
970755e519 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 20:11:16 +00:00
Stepan Dyatkovskiy
0d3c8d5d16 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 08:23:06 +00:00
Jakob Stoklund Olesen
cdcdfd2cab Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"
A fix for PR14098, including the test case is in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:55 +00:00
Rafael Espindola
6f7cccd2e2 Switch back to the old coalescer for now to fix the 32 bit bit
llvm+clang+compiler-rt bootstrap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:34:06 +00:00
Stepan Dyatkovskiy
b52ba9f8a8 Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 07:16:47 +00:00
Jim Grosbach
64ba635209 ARM: v1i64 and v2i64 VBSL intrinsic support.
rdar://12502028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165981 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 21:23:40 +00:00
Silviu Baranga
bb1078ea13 Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 09:41:32 +00:00
Jakob Stoklund Olesen
d86296a4ae Drop <def,dead> flags when merging into an unused lane.
The new coalescer can merge a dead def into an unused lane of an
otherwise live vector register.

Clear the <dead> flag when that happens since the flag refers to the
full virtual register which is still live after the partial dead def.

This fixes PR14079.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165877 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 17:26:47 +00:00
Jakob Stoklund Olesen
af89690760 Allow for loops in LiveIntervals::pruneValue().
It is possible that the live range of the value being pruned loops back
into the kill MBB where the search started. When that happens, make sure
that the beginning of KillMBB is also pruned.

Instead of starting a DFS at KillMBB and skipping the root of the
search, start a DFS at each KillMBB successor, and allow the search to
loop back to KillMBB.

This fixes PR14078.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165872 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 16:15:31 +00:00
Manman Ren
e6c3cc8dc5 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:39:43 +00:00
Jim Grosbach
4346fa9437 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 22:59:21 +00:00
Evan Cheng
d36696c4e0 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 01:15:47 +00:00
Evan Cheng
6b61491de3 Add isel patterns for v2f32 / v4f32 neon.vbsl intrinsics. rdar://12471808
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165673 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 23:06:34 +00:00
Stepan Dyatkovskiy
2c2cb3c09f Fix for LDRB instruction:
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.

7 ops is needed, but SDNode with only 6 is created.

In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.

The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165617 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy
661afe75e8 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 11:37:36 +00:00
Jim Grosbach
837c28a840 ARM: locate user-defined text sections next to default text.
Make sure functions located in user specified text sections (via the
section attribute) are located together with the default text sections.
Otherwise, for large object files, the relocations for call instructions
are more likely to be out of range. This becomes even more likely in the
presence of LTO.

rdar://12402636

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165254 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-04 21:33:24 +00:00
Silviu Baranga
541a858f1a Fixed a bug in the ExecutionDependencyFix pass that caused dependencies to not propagate through implicit defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165102 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 08:29:36 +00:00
Jakob Stoklund Olesen
27cb347d0e Make sure the whole live range is covered when values are pruned twice.
JoinVals::pruneValues() calls LIS->pruneValue() to avoid conflicts when
overlapping two different values. This produces a set of live range end
points that are used to reconstruct the live range (with SSA update)
after joining the two registers.

When a value is pruned twice, the set of end points was insufficient:

  v1 = DEF
  v1 = REPLACE1
  v1 = REPLACE2
  KILL v1

The end point at KILL would only reconstruct the live range from
REPLACE2 to KILL, leaving the range REPLACE1-REPLACE2 dead.

Add REPLACE2 as an end point in this case so the full live range is
reconstructed.

This fixes PR13999.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 21:46:39 +00:00
Bob Wilson
eb1641d54a Add LLVM support for Swift.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164899 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-29 21:43:49 +00:00
Bob Wilson
154418cdd8 Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-29 21:27:31 +00:00
Jakob Stoklund Olesen
5cf178f281 Enable the new coalescer algorithm by default.
The new coalescer is better at merging values into unused vector lanes,
improving NEON code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164794 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-27 21:06:02 +00:00
Jush Lu
8f50647662 [arm-fast-isel] Add support for ELF PIC.
This is a preliminary step towards ELF support; currently ARMFastISel hasn't
been used for ELF object files yet.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164759 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-27 05:21:41 +00:00
NAKAMURA Takumi
9f4df20fe0 ARM/atomicrmw_minmax.ll: Fix RUN line.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 10:12:20 +00:00
James Molloy
d6d10ae151 Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164685 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 09:48:32 +00:00
Bill Wendling
f18eb5887f Generate an error message instead of asserting or segfaulting when we have a
scalar-to-vector conversion that we cannot handle. For instance, when an invalid
constraint is used in an inline asm statement.
<rdar://problem/12284092>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164662 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 06:16:18 +00:00
Bill Wendling
1293130f4f Generate an error message instead of asserting or segfaulting when we have a
scalar-to-vector conversion that we cannot handle. For instance, when an invalid
constraint is used in an inline asm statement.
<rdar://problem/12284092>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164657 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 04:04:19 +00:00
Chad Rosier
e5e674ba11 [fast-isel] Fallback to SelectionDAG isel if we require strict alignment for
non-aligned i32 loads/stores.
rdar://12304911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164381 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 16:58:35 +00:00
NAKAMURA Takumi
df03a28990 llvm/test/CodeGen/ARM/fast-isel.ll: Fix possible typos, s/@unaligned_i16_store/@unaligned_i16_load/g.
I guess this had apparently passed in +Asserts possibly due to verborsity.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164350 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 01:15:05 +00:00
Chad Rosier
3ca380d8d7 Testcase does not need to be this strict.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164347 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 00:47:08 +00:00
Chad Rosier
ba6aec2cf3 Add newline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 00:43:18 +00:00
Chad Rosier
d70c98e884 [fast-isel] Fallback to SelectionDAG isel if we require strict alignment for
non-halfword-aligned i16 loads/stores.
rdar://12304911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164345 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 00:41:42 +00:00
Jim Grosbach
ced674e470 ARM: Use a dedicated intrinsic for vector bitwise select.
The expression based expansion too often results in IR level optimizations
splitting the intermediate values into separate basic blocks, preventing
the formation of the VBSL instruction as the code author intended. In
particular, LICM would often hoist part of the computation out of a loop.

rdar://11011471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164340 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 00:18:20 +00:00
Jakob Stoklund Olesen
e6e2d8cd90 Ignore PHI-defs for -new-coalescer interference checks.
A PHI can't create interference on its own. If two live ranges interfere
at a PHI, they must also interfere when leaving one of the PHI
predecessors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164330 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 23:08:42 +00:00
Evan Cheng
2dad6b501b Try to make these tests more portable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164320 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 21:35:21 +00:00
Jakob Stoklund Olesen
d40d4c34f7 Resolve conflicts involving dead vector lanes for -new-coalescer.
A common coalescing conflict in vector code is lane insertion:

  %dst = FOO
  %src = BAR
  %dst:ssub0 = COPY %src

The live range of %src interferes with the ssub0 lane of %dst, but that
lane is never read after %src would have clobbered it. That makes it
safe to merge the live ranges and eliminate the COPY:

  %dst = FOO
  %dst:ssub0 = BAR

This patch teaches the new coalescer to resolve conflicts where dead
vector lanes would be clobbered, at least as long as the clobbered
vector lanes don't escape the basic block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164250 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 21:29:18 +00:00