Commit Graph

1587 Commits

Author SHA1 Message Date
Arnold Schwaighofer
21c0aa74bd ARM NEON: Fix v2f32 float intrinsics
Mark them as expand, they are not legal as our backend does not match them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176410 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-02 19:38:33 +00:00
Chad Rosier
7590022f40 Generate an error message instead of asserting or segfaulting when we can't
handle indirect register inputs.
rdar://13322011

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176367 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-01 19:12:05 +00:00
Chad Rosier
b8f307b2d6 Add support for using non-pic code for arm and thumb1 when emitting the sjlj
dispatch code.  As far as I can tell the thumb2 code is behaving as expected.
I was able to compile and run the associated test case for both arm and thumb1.
rdar://13066352


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176363 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-01 18:30:38 +00:00
Jim Grosbach
b302a4e6b5 ARM: FMA is legal only if VFP4 is available.
rdar://13306723

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176212 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-27 21:31:12 +00:00
Manman Ren
5e5974f51a SelectionDAG: If llvm.donothing has a landingpad, we should clear
CurrentCallSite to avoid an assertion failure:
assert(MMI.getCurrentCallSite() == 0 && "Overlapping call sites!");

rdar://problem/13228754


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176154 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-27 02:11:57 +00:00
Kristof Beyls
29e05fe7a8 Make ARMAsmPrinter generate the correct alignment specifier syntax in instructions.
The Printer will now print instructions with the correct alignment specifier syntax, like
    vld1.8  {d16}, [r0:64]



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175884 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-22 10:01:33 +00:00
Arnold Schwaighofer
c46e2df74c DAGCombiner: Fold pointless truncate, bitcast, buildvector series
(2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y)))
can be folded into a (2xi32) (buildvector i32 a, i32 b).

Such a DAG would cause uneccessary vdup instructions followed by vmovn
instructions.

We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in
the vectorized version of the code below.

double A[N];
double B[N];

void test_double_compare_to_double() {
  int i;
  for(i=0;i<N;i++)
    A[i] = (double)(A[i] < B[i]);
}

radar://13191881

Fixes bug 15283.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175670 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-20 21:33:32 +00:00
Arnold Schwaighofer
2e750c12e9 ARM NEON: Merge a f32 bitcast of a v2i32 extractelt
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers.

The patch fixes this by using a COPY_TO_REGCLASS and a EXTRACT_SUBREG to extract
the element from the vector instead.

radar://13191881

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175520 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-19 15:27:05 +00:00
Chad Rosier
69c65b0d93 Comment out the rdar number.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175460 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-18 21:59:15 +00:00
Chad Rosier
848c25ddfa [fast-isel] Remove an invalid assert.
If the memcpy has an odd length with an alignment of 2, this would incorrectly
assert on the last 1 byte copy.
rdar://13202135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175459 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-18 21:46:28 +00:00
Weiming Zhao
7248451c43 Re-apply r175088 for bug fix 13622: Add paired register support for
inline asm with 64-bit data on ARM

Update test case to use -mtriple=arm-linux-gnueabi


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175186 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-14 18:10:21 +00:00
Kristof Beyls
b1d081230e Make ARMAsmParser accept the correct alignment specifier syntax in instructions.
The parser will now accept instructions with alignment specifiers written like
    vld1.8  {d16}, [r0:64]
, while also still accepting the incorrect syntax
    vld1.8  {d16}, [r0, :64]



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175164 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-14 14:46:12 +00:00
Weiming Zhao
c0c2816fb3 temporarily revert the patch due to some conflicts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175107 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-13 23:24:40 +00:00
Weiming Zhao
3019fbbe6a Bug fix 13622: Add paired register support for inline asm with 64-bit data on ARM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175088 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-13 21:43:02 +00:00
David Peixotto
af7c042af1 PR14992 - Tablegen incorrectly converts ARM tLDMIA_UPD pseudo to tLDMIA
Fixed bug in tablegen conversion when source pseudo instruction has
a different number of arguments than the destination instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175066 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-13 19:21:47 +00:00
Arnold Schwaighofer
d9316dacf5 ARM NEON: Handle v16i8 and v8i16 reverse shuffles
Lower reverse shuffles to a vrev64 and a vext instruction instead of the default
legalization of storing and loading to the stack. This is important because we
generate reverse shuffles in the loop vectorizer when we reverse store to an
array.

  uint8_t Arr[N];
  for (i = 0; i < N; ++i)
    Arr[N - i - 1] = ...

radar://13171760

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174929 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-12 01:58:32 +00:00
Bob Wilson
8f637adbd3 Revert 172027 and 174336. Remove diagnostics about over-aligned stack objects.
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good.  We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it.  We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead.  <rdar://problem/13127907>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174741 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-08 20:35:15 +00:00
Manman Ren
9c5861fdbd Attempt to recover gdb bot after r174445.
Failure: undefined symbol 'Lline_table_start0'.
Root-cause: we use a symbol subtraction to calculate at_stmt_list, but
the line table entries are not dumped in the assembly.
Fix: use zero instead of a symbol subtraction for Compile Unit 0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174479 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-06 00:59:41 +00:00
Manman Ren
43213cf1ac Dwarf: support for LTO where a single object file can have multiple line tables
We generate one line table for each compilation unit in the object file.
Reviewed by Eric and Kevin.

rdar://problem/13067005


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174445 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 21:52:47 +00:00
Chad Rosier
1e45487dfd [SjLj Prepare] When demoting an invoke instructions to the stack, if the normal
edge is critical, then split it so we can insert the store.
rdar://13126179

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174418 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 18:23:10 +00:00
Logan Chien
b0c899666a Link .ARM.exidx with corresponding text section.
The sh_link in the ELF section header of .ARM.exidx should
be filled with the section index of the corresponding text
section.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174372 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 14:18:59 +00:00
Manman Ren
91b978e157 [Stack Alignment] emit warning instead of a hard error
Per discussion in rdar://13127907, we should emit a hard error only if
people write code where the requested alignment is larger than achievable
and assumes the low bits are zeros. A warning should be good enough when
we are not sure if the source code assumes the low bits are zeros.

rdar://13127907


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174336 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-04 23:45:08 +00:00
Eli Bendersky
0f156af831 Add a special ARM trap encoding for NaCl.
More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html

Patch by JF Bastien



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173943 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 16:30:19 +00:00
Logan Chien
620d5bd8e4 Add missing header and test cases for r173939.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173941 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 15:48:50 +00:00
Tim Northover
0adfdedacb Fix 64-bit atomic operations in Thumb mode.
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173780 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 09:06:13 +00:00
Silviu Baranga
4a9256f265 Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173437 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 10:39:49 +00:00
Jakob Stoklund Olesen
d32eea9636 Remove some register allocation order dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172874 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-19 00:03:32 +00:00
Tim Northover
5f2801bd65 Simplify writing floating types to assembly.
This removes previous special cases for each floating-point type in favour of a
shared codepath.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172189 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11 10:36:13 +00:00
Manman Ren
86441169da Stack Alignment: throw error if we can't satisfy the minimal alignment
requirement when creating stack objects in MachineFrameInfo.

Add CreateStackObjectWithMinAlign to throw error when the minimal alignment
can't be achieved and to clamp the alignment when the preferred alignment
can't be achieved. Same is true for CreateVariableSizedObject.
Will not emit error in CreateSpillStackObject or CreateStackObject.

As long as callers of CreateStackObject do not assume the object will be
aligned at the requested alignment, we should not have miscompile since
later optimizations which look at the object's alignment will have the correct
information.

rdar://12713765


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172027 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 01:10:10 +00:00
Andrew Trick
47579cf390 MIsched: add an ILP window property to machine model.
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.

I converted some in-order scheduling tests to A2. Hal is working on
more test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171946 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 03:36:49 +00:00
Tim Northover
a67352d401 Specify complete triple for fp128 tests.
This avoids FileCheck failing over different comment characters in
assembly (notably powerpc64 on Linux vs Darwin) and should fix David's
build-bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171886 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08 19:36:33 +00:00
Tim Northover
90f011e0ba Allow the asm printer to print fp128 values properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171866 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08 16:56:23 +00:00
Silviu Baranga
e97165901e Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same the trasnformation as before to merge them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171730 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07 12:31:25 +00:00
Bob Wilson
103b4a571e Revert "Adding support for llvm.arm.neon.vaddl[su].* and"
This reverts r170694.  The operations can be represented in IR without
adding any new intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170765 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 21:09:38 +00:00
Renato Golin
332bd79951 Adding support for llvm.arm.neon.vaddl[su].* and
llvm.arm.neon.vsub[su].* intrinsics.

Patch by Pete Couperus <pjcoup@gmail.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170694 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 13:52:11 +00:00
Evan Cheng
733c6b1db1 LLVM sdisel normalize bit extraction of the form:
((x & 0xff00) >> 8) << 2
to
 (x >> 6) & 0x3fc

This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevents the ARM backend from using the bit
extraction instructions. And worse since the mask materialization may require
an addition instruction. This comes up fairly frequently when the result of 
the bit twiddling is used as memory address. e.g.

 = ptr[(x & 0xFF0000) >> 16]

We want to generate:
  ubfx   r3, r1, #16, #8
  ldr.w  r3, [r0, r3, lsl #2]

vs.
  mov.w  r9, #1020
  and.w  r2, r9, r1, lsr #14
  ldr    r2, [r0, r2]

Add a late ARM specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the
'base + offset' address computation; change the mask to one which doesn't have
trailing zeros and enable the use of ubfx.

Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive
as shifter operands are not always free. It's only done for lsh 1 / 2. It's
known to be free on some cpus and they are most common for address
computation.

This is a slight win for blowfish, rijndael, etc.

rdar://12870177


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170581 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 20:16:09 +00:00
Quentin Colombet
b519351b87 Disable ARM partial flag dependency optimization at -Oz
To not over constrain the scheduler for ARM in thumb mode, some optimizations  for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency).

Disables this check when code size is the priority, i.e., code is compiled with -Oz.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170462 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18 22:47:16 +00:00
Andrew Trick
04f52e1300 MISched: add dependence to ExitSU to model live-out latency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170454 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18 20:53:01 +00:00
Evan Cheng
376642ed62 Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 23:21:26 +00:00
Tim Northover
6eb3e87df0 Added Mapping Symbols for ARM ELF
Before this patch, when you objdump an LLVM-compiled file, objdump tried to
decode data-in-code sections as if they were code.  This patch adds the missing
Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).

Patch based on work by Greg Fitzgerald.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169609 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 16:50:23 +00:00
Dmitri Gribenko
00e97c2126 Fix typos in CHECK lines.
Patch by Alexander Zinenko.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 21:24:47 +00:00
Evan Cheng
824ec7d01a Properly fix the tes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169464 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 02:29:29 +00:00
NAKAMURA Takumi
e1ab8e3e73 llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I guess Chad expects fastisel here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169463 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 02:22:58 +00:00
Chad Rosier
c9758b1366 [arm fast-isel] Make the fast-isel implementation of memcpy respect alignment.
rdar://12821569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 01:34:31 +00:00
Evan Cheng
8a7186dbc2 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 01:28:01 +00:00
Evan Cheng
c8e7045c8a ARM custom lower ctpop for vector types. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 22:41:50 +00:00
Bill Wendling
9493dae613 Use the 'count' attribute to calculate the upper bound of an array.
The count attribute is more accurate with regards to the size of an array. It
also obviates the upper bound attribute in the subrange. We can also better
handle an unbound array by setting the count to -1 instead of the lower bound to
1 and upper bound to 0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 21:34:03 +00:00
Bill Wendling
a7645a3c66 Add a 'count' field to the DWARF subrange.
The count field is necessary because there isn't a difference between the 'lo'
and 'hi' attributes for a one-element array and a zero-element array. When the
count is '0', we know that this is a zero-element array. When it's >=1, then
it's a normal constant sized array. When it's -1, then the array is unbounded.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 06:20:49 +00:00
Manman Ren
69261a6442 Stack Alignment: when creating stack objects in MachineFrameInfo, make sure
the alignment is clamped to TargetFrameLowering.getStackAlignment if the target
does not support stack realignment or the option "realign-stack" is off.

This will cause miscompile if the address is treated as aligned and add is
replaced with or in DAGCombine.

Added a bool StackRealignable to TargetFrameLowering to check whether stack
realignment is implemented for the target. Also added a bool RealignOption
to MachineFrameInfo to check whether the option "realign-stack" is on.

rdar://12713765


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169197 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 00:52:33 +00:00
Jakob Stoklund Olesen
8c3dccde92 Simplify REG_SEQUENCE lowering.
The TwoAddressInstructionPass takes the machine code out of SSA form by
expanding REG_SEQUENCE instructions into copies. It is no longer
necessary to rewrite the registers used by a REG_SEQUENCE instruction
because the new coalescer algorithm can do it now.

REG_SEQUENCE is just converted to a sequence of sub-register copies now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:06:44 +00:00