Commit Graph

606 Commits

Author SHA1 Message Date
Tanya Lattner
db28247522 Handle perfect shuffle case that generates a vrev for vectors of floats.
Add test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131582 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18 21:44:54 +00:00
Evan Cheng
b936e3006f Revise r131553. Just use the type of the input node and forgo the bitcast. rdar://9449159.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131555 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18 18:59:17 +00:00
Evan Cheng
d48fda46f5 Fix an ARMTargetLowering::LowerSELECT bug: legalized result must have same type as input. Sorry test cases only trigger when dag combine is disabled. rdar://9449178
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131553 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18 18:47:27 +00:00
Tanya Lattner
2a8eb722c7 In r131488 I misunderstood how VREV works. It splits the vector in half and splits each half. Therefore, the real problem was that we were using a VREV64 for a 4xi16, when we should have been using a VREV32.
Updated test case and reverted change to the PerfectShuffle Table.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131529 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18 06:42:21 +00:00
Cameron Zwarich
141ec63962 Fix typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131519 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18 02:29:50 +00:00
Cameron Zwarich
7d336c0c68 Fix more of PR8825 by correctly using rGPR registers when lowering atomic
compare-and-swap intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131518 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-18 02:20:07 +00:00
Bill Wendling
61512ba251 Give the 'eh.sjlj.dispatchsetup' intrinsic call the value coming from the setjmp
intrinsic call. This prevents it from being reordered so that it appears
*before* the setjmp intrinsic (thus making it completely useless).
<rdar://problem/9409683>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131174 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-11 01:11:55 +00:00
Eli Friedman
fc5d305597 Make the logic for determining function alignment more explicit. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131012 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-06 20:34:06 +00:00
Bob Wilson
e1a56ae747 Temporarily disable use of divmod compiler-rt functions for iOS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130766 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-03 17:33:22 +00:00
Dan Gohman
cca82149ad Add an unfolded offset field to LSR's Formula record. This is used to
model constants which can be added to base registers via add-immediate
instructions which don't require an additional register to materialize
the immediate.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130743 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-03 00:46:49 +00:00
Eric Christopher
5ac179ccd2 80-col.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130558 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-29 23:12:01 +00:00
Jim Grosbach
f7da8821b4 ARM and Thumb2 support for atomic MIN/MAX/UMIN/UMAX loads.
rdar://9326019


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130234 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-26 19:44:18 +00:00
Andrew Trick
1c3af779fc Thumb2 and ARM add/subtract with carry fixes.
Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the
assembly printer correctly prints the 's' suffix.

Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.

Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
Fixes ARM SBC lowering to check for live carry (potential bug).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 03:55:32 +00:00
Evan Cheng
c8578948c9 Remove -use-divmod-libcall. Let targets opt in when they are available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129884 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 22:20:12 +00:00
Stuart Hastings
e341e8ce1a Excise unintended hunk in 129858. <rdar://problem/7662569>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129862 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 18:09:26 +00:00
Stuart Hastings
c73158730d ARM byval support. Will be enabled by another patch to the FE. <rdar://problem/7662569>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129858 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-20 16:47:52 +00:00
Eric Christopher
2cc4013853 Remove some duplicate op action entries and reorganize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129781 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 18:49:19 +00:00
Chris Lattner
7a2bdde0a0 Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129558 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 05:18:47 +00:00
Evan Cheng
9eec66e604 Fix another fcopysign lowering bug. If src is f64 and destination is f32, don't
forget to right shift the source by 32 first. rdar://9287902


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129556 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 01:31:00 +00:00
Cameron Zwarich
5af60ce2a8 Fix a typo in an ARM-specific DAG combine. This fixes <rdar://problem/9278274>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129468 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 21:01:19 +00:00
Cameron Zwarich
d0aacbcc2e Split a store of a VMOVDRR into two integer stores to avoid mixing NEON and ARM
stores of arguments in the same cache line. This fixes the second half of
<rdar://problem/8674845>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129345 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-12 02:24:17 +00:00
Evan Cheng
4da0c7c0c9 Change -arm-trap-func= into a non-arm specific option. Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129152 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-08 21:37:21 +00:00
Evan Cheng
274d8d4eba Add option to emit @llvm.trap as a function call instead of a trap instruction. rdar://9249183.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129107 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 20:31:12 +00:00
Tanya Lattner
0433b21c98 Prevent ARM DAG Combiner from doing an AND or OR combine on an illegal vector type (vectors of size 3). Also included test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129074 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 15:24:20 +00:00
Evan Cheng
2c69f8eec6 Change -arm-divmod-libcall to a target neutral option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129045 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-07 00:58:44 +00:00
Owen Anderson
b48c791515 Reapply r128946 (pseudoization of various instructions), and fix the extra imp-def of CPSR it was adding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128965 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 23:55:28 +00:00
Owen Anderson
493cba1b32 Revert r128946 while I figure out why it broke the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128951 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 23:03:06 +00:00
Owen Anderson
76634dfabb Give RSBS and RSCS the pseudo treatment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128946 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 22:42:54 +00:00
Owen Anderson
7670601313 Fix bugs in the pseuo-ization of ADCS/SBCS pointed out by Jim, as well as doing the expansion earlier (using a custom inserter) to allow for the chance of predicating these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128940 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 21:48:57 +00:00
Bill Wendling
f05b1dcf87 Revamp the SjLj "dispatch setup" intrinsic.
It needed to be moved closer to the setjmp statement, because the code directly
after the setjmp needs to know about values that are on the stack. Also, the
'bitcast' of the function context was causing a dead load. This wouldn't be too
horrible, except that at -O0 it wasn't optimized out, and because it wasn't
using the correct base pointer (if there is a VLA), it would try to access a
value from a garbage address.
<rdar://problem/9130540>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128873 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-05 01:37:43 +00:00
Cameron Zwarich
4071a71112 Do some peephole optimizations to remove pointless VMOVs from Neon to integer
registers that arise from argument shuffling with the soft float ABI. These
instructions are particularly slow on Cortex A8. This fixes one half of
<rdar://problem/8674845>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128759 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-02 02:40:43 +00:00
Evan Cheng
8e23e815ad Issue libcalls __udivmod*i4 / __divmod*i4 for div / rem pairs.
rdar://8911343


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128696 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-01 00:42:02 +00:00
Evan Cheng
463d358f1d Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier
accumulator forwarding:
vadd d3, d0, d1
vmul d3, d3, d2
=>
vmul d3, d0, d2
vmla d3, d1, d2


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128665 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 19:38:48 +00:00
Evan Cheng
ee2e0e347e Don't try to create zero-sized stack objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128586 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-30 23:44:13 +00:00
Cameron Zwarich
c0e6d780cd Add a ARM-specific SD node for VBSL so that forms with a constant first operand
can be recognized. This fixes <rdar://problem/9183078>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128584 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-30 23:01:21 +00:00
Evan Cheng
92e3916c3b Add intrinsics @llvm.arm.neon.vmulls and @llvm.arm.neon.vmullu.* back. Frontends
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.

Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.

First part of rdar://8832507, rdar://9203134


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128502 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 23:06:19 +00:00
Cameron Zwarich
3007d3331b Add Neon SINT_TO_FP and UINT_TO_FP lowering from v4i16 to v4f32. Fixes
<rdar://problem/8875309> and <rdar://problem/9057191>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128492 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 21:41:55 +00:00
Evan Cheng
78fe9ababe Optimizing (zext A + zext B) * C, to (VMULL A, C) + (VMULL B, C) during
isel lowering to fold the zero-extend's and take advantage of no-stall
back to back vmul + vmla:
 vmull q0, d4, d6
 vmlal q0, d5, d6
is faster than
 vaddl q0, d4, d5
 vmovl q1, d6                                                                                                                                                                             
 vmul  q0, q0, q1

This allows us to vmull + vmlal for:
    f = vmull_u8(   vget_high_u8(s), c);
    f = vmlal_u8(f, vget_low_u8(s),  c);

rdar://9197392


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128444 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-29 01:56:09 +00:00
Eric Christopher
29aeed1bf8 Fix the bfi handling for or (and a mask) (and b mask). We need the two
masks to match inversely for the code as is to work. For the example given
we actually want:

bfi r0, r2, #1, #1

not #0, however, given the way the pattern is written it's not possible
at the moment.

Fixes rdar://9177502


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-26 01:21:03 +00:00
Evan Cheng
485fafc840 Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127981 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-21 01:19:09 +00:00
Daniel Dunbar
7a90e04fc7 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127954 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 21:47:14 +00:00
Evan Cheng
ae16d6b972 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 17:17:39 +00:00
Bill Wendling
0d4c9d94f6 The VTBL (and VTBX) instructions are rather permissive concerning the masks they
accept. If a value in the mask is out of range, it uses the value 0, for VTBL,
or leaves the value unchanged, for VTBX.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127700 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-15 21:15:20 +00:00
Bill Wendling
a24cb40be2 Some minor cleanups based on feedback.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127694 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-15 20:47:26 +00:00
Bill Wendling
69a05a7b92 Generate a VTBL instruction instead of a series of loads and stores when we
can. As Nate pointed out, VTBL isn't super performant, but it *has* to be better
than this:

_shuf:
@ BB#0:       @ %entry
  push        {r4, r7, lr}
  add         r7, sp, #4
  sub         sp, #12
  mov         r4, sp
  bic         r4, r4, #7
  mov         sp, r4
  mov         r2, sp
  vmov        d16, r0, r1
  orr         r0, r2, #6
  orr         r3, r2, #7
  vst1.8      {d16[0]}, [r3]
  vst1.8      {d16[5]}, [r0]
  subs        r4, r7, #4
  orr         r0, r2, #5
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #4
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #3
  vst1.8      {d16[0]}, [r0]
  orr         r0, r2, #2
  vst1.8      {d16[2]}, [r0]
  orr         r0, r2, #1
  vst1.8      {d16[1]}, [r0]
  vst1.8      {d16[3]}, [r2]
  vldr.64     d16, [sp]
  vmov        r0, r1, d16
  mov         sp, r4
  pop         {r4, r7, pc}

The "illegal" testcase in vext.ll is no longer illegal.
<rdar://problem/9078775>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127630 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-14 23:02:38 +00:00
Evan Cheng
21a6179c9d Indentation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127595 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-14 18:02:30 +00:00
Bob Wilson
79f56c9618 Fix a compiler crash where a Glue value had multiple uses. Radar 9049552.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127198 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-08 01:17:20 +00:00
Bob Wilson
1b772f9962 Fix comment typos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127197 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-08 01:17:16 +00:00
Cameron Zwarich
be2119e8e2 Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127175 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-07 21:56:36 +00:00
Bob Wilson
4faa0e1952 Remove unused conditional negate operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127090 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-05 16:54:31 +00:00