Commit Graph

10788 Commits

Author SHA1 Message Date
Juergen Ributzka
e431884ed7 [FastISel][X86] Add support for cvttss2si/cvttsd2si intrinsics.
This adds support for the cvttss2si/cvttsd2si intrinsics. Preceding
insertelement instructions are folded into the conversion instruction (if
possible).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 02:21:58 +00:00
Juergen Ributzka
7f8d138f50 [FastISel][X86] - Add branch weights
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210863 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 00:45:11 +00:00
Juergen Ributzka
4eddf94a14 [FastISel][X86] Add MachineMemOperand to load/store instructions.
This commit adds MachineMemOperands to load and store instructions. This allows
the peephole optimizer to fold load instructions. Unfortunatelly the peephole
optimizer currently doesn't run at -O0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210858 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 23:27:57 +00:00
Juergen Ributzka
cc738f8367 Update test case to use "not" instead of "XFAIL".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210829 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 21:17:40 +00:00
Matt Arsenault
00c3986254 R600: Mostly remove remaining AMDIL intrinsics.
Delete all unused ones, and add new AMDGPU named intrinsics for
the ones that are. Handle the old AMDIL names for comptability (although
remove their GCCBuiltin names) and add tests since there weren't any
for these before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 21:15:44 +00:00
Juergen Ributzka
c7147a3b6d [FastISel][X86] Argument lowering test case
This test case is supposed to xfail, because we do not handle structs or byval
arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210816 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 20:34:09 +00:00
Juergen Ributzka
3f2b28dcaf [FastIsel][X86] Add support for lowering the first 8 floating-point arguments.
Recommit with fixed argument attribute checking code, which is required to bail
out of all the cases we don't handle yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210815 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 20:12:34 +00:00
Saleem Abdulrasool
3455c40f91 CodeGen: enable mov.w/mov.t pairs with minsize for WoA
Windows on ARM uses COFF/PE which is intrinsically position independent.  For
the case of 32-bit immediates, use a pair-wise relocation as otherwise we may
exceed the range of operators.  This fixes a code generation crash when using
-Oz when targeting Windows on ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210814 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 20:06:33 +00:00
Juergen Ributzka
a15b05e1aa Revert "[FastIsel][X86] Add support for lowering the first 8 floating-point arguments."
Reverting it because it breaks several tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210810 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 19:21:43 +00:00
Tom Stellard
82a51defb6 Revert "SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors"
This reverts commit r210540, adds a testcase for the regression it
caused, and marks the R600 test it was supposed to fix as XFAIL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210792 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 16:04:47 +00:00
James Molloy
5eba90a861 Disable the load/store optimization pass for Thumb-1.
Moritz's changes have improved codegen a lot, but further testing showed significant correctness problems. Disable by default until these have been worked out.

Patch by Moritz Roth!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 15:18:33 +00:00
Daniel Sanders
1f4c755c2c [mips][mips64r6] bc1[tf] are not available on MIPS32r6/MIPS64r6
Summary:
Also tightened up the acceptable condition operand for these instructions
on MIPS-I to MIPS-III. Support for $fcc[1-7] was added in MIPS-IV. Prior
to that only $fcc0 is acceptable.

We currently don't optimize (BEQZ (NOT $a), $target) and similar. It's
probably best to do this in InstCombine.

Depends on D4111

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210787 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 15:00:17 +00:00
Daniel Sanders
28002c2f82 [mips][mips64r6] [sl][duw]xc1 are not available on MIPS32r6/MIPS64r6
Summary:
Folded mips64-fp-indexed-ls.ll into fp-indexed-ls.ll. To do so, the zext's in
mips64-fp-indexed-ls.ll were changed to implicit sign extensions (performed
by getelementptr). This does not affect the purpose of the test.

Depends on D4004

Reviewers: zoran.jovanovic, jkolek, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4110

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 14:19:28 +00:00
Daniel Sanders
8007133f3e [mips][mips64r6] c.cond.fmt, mov[fntz], and mov[fntz].[ds] are not available on MIPS32r6/MIPS64r6
Summary:
c.cond.fmt has been replaced by cmp.cond.fmt. Where c.cond.fmt wrote to
dedicated condition registers, cmp.cond.fmt writes 1 or 0 to normal FGR's
(like the GPR comparisons).

mov[fntz] have been replaced by seleqz and selnez. These instructions
conditionally zero a register based on a bool in a GPR. The results can
then be or'd together to act as a select without, for example, requiring a third
register read port.

mov[fntz].[ds] have been replaced with sel.[ds]

MIPS64r6 currently generates unnecessary sign-extensions for most selects.
This is because the result of a SETCC is currently an i32. Bits 32-63 are
undefined in i32 and the behaviour of seleqz/selnez would otherwise depend
on undefined bits. Later, we will fix this by making the result of SETCC an
i64 on MIPS64 targets.

Depends on D3958

Reviewers: jkolek, vmedic, zoran.jovanovic

Reviewed By: vmedic, zoran.jovanovic

Differential Revision: http://reviews.llvm.org/D4003

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 13:39:06 +00:00
Daniel Sanders
14e97c4f51 [mips] Use MTHC1 when it is available (MIPS32r2 and later) for both FP32 and FP64
Summary:
To make this work for both AFGR64 and FGR64 register sets, I've had to make the
instruction definition consistent with the white lie (that it reads the lower
32-bits of the register) when they are generated by expandBuildPairF64().

Corrected the definition of hasMips32r2() and hasMips64r2() to include
MIPS32r6 and MIPS64r6.

Depends on D3956

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 11:55:58 +00:00
Daniel Sanders
a61aa38ee1 [mips][mips64r6] madd.[ds], msub.[ds], nmadd.[ds], and nmsub.[ds] are not available on MIPS32r6/MIPS64r6
Summary:
This patch updates both the assembler and the code generator.

MIPS32r6/MIPS64r6 replaces them with maddf.[ds] and msubf.[ds] which are fused
multiply-add/sub operations. We don't emit these yet, this patch only prevents the removed instructions from being emitted.

Depends on D3955

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3956

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210763 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 11:04:18 +00:00
Daniel Sanders
d94bc707c4 [mips][mips64r6] madd/maddu/msub/msubu are not available on MIPS32r6/MIPS64r6
Summary:
This patch disables madd/maddu/msub/msubu in both the assembler and code
generator.

Depends on D3896

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3955

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210762 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 10:54:16 +00:00
Andrea Di Biagio
bf4e625cf1 [X86] Teach how to combine AVX and AVX2 horizontal binop on packed 256-bit vectors.
This patch adds target combine rules to match:
 - [AVX] Horizontal add/sub of packed single/double precision floating point
   values from 256-bit vectors;
 - [AVX2] Horizontal add/sub of packed integer values from 256-bit vectors.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210761 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 10:53:48 +00:00
Daniel Sanders
38b2a0bfdd [mips][mips64r6] Replace m[tf]hi, m[tf]lo, mult, multu, dmult, dmultu, div, ddiv, divu, ddivu for MIPS32r6/MIPS64.
Summary:
The accumulator-based (HI/LO) multiplies and divides from earlier ISA's have
been removed and replaced with GPR-based equivalents. For example:
  div $1, $2
  mflo $3
is now:
  div $3, $1, $2

This patch disables the accumulator-based multiplies and divides for
MIPS32r6/MIPS64r6 and uses the GPR-based equivalents instead.

Renamed expandPseudoDiv to insertDivByZeroTrap to better describe the
behaviour of the function.

MipsDelaySlotFiller now invalidates the liveness information when moving
instructions to the delay slot. Without this, divrem.ll will abort since
%GP ends up used before it is defined.

Reviewers: vmedic, zoran.jovanovic, jkolek

Reviewed By: jkolek

Differential Revision: http://reviews.llvm.org/D3896

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210760 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 10:44:10 +00:00
Matt Arsenault
0b87955888 R600/SI: Use a register set to -1 for data0 on ds_inc*/ds_dec*
There is not such thing as a 0-data ds instruction, and the data
operand needs to be a vgpr set to something meaningful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210756 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 08:21:54 +00:00
Juergen Ributzka
729fc8facc [FastISel][x86] Add testcase for r210719.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210746 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 03:54:05 +00:00
Juergen Ributzka
f66c14f656 [x86] Improve frameaddress test from r210709.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210743 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 03:29:29 +00:00
Juergen Ributzka
2c9a12f081 [FastISel] Add support for the stackmap intrinsic.
This implements target-independent FastISel lowering for the stackmap intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210742 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 03:29:26 +00:00
Juergen Ributzka
02503401b4 [FastISel][X86] Add support for the sqrt intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210720 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 23:11:02 +00:00
Juergen Ributzka
a2d36a20fa [FastISel][X86] Add support for the frameaddress intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210709 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 21:44:44 +00:00
Chad Rosier
2e4cd27799 [AArch64] Basic Sched Model for Cortex-A57.
Patch by Dave Estes<cestes@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D4008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210705 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 21:06:56 +00:00
Matt Arsenault
7fa80b45eb R600/SI: Fix bitcast between v2i32 and f64
This is the same problem fixed in r210664 for more types.

The test passes without this fix. For some reason
I'm only hitting this when creating selects lowered
to v2i32 selects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210692 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 19:31:13 +00:00
Matt Arsenault
c9dbd0da7a R600/SI: Add common 64-bit LDS atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:54 +00:00
Matt Arsenault
6b19a3a474 R600/SI: Add 32-bit LDS atomic cmpxchg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210678 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:48 +00:00
Matt Arsenault
a396a70c1d R600/SI: Use LDS atomic inc / dec
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:45 +00:00
Matt Arsenault
2da1a85cbb R600/SI: Add other LDS atomic operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210676 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:42 +00:00
Matt Arsenault
4a19dd468d R600/SI: Fix backwards names for local atomic instructions.
The manual lists them as *_RTN_U32, not *_U32_RTN, which is more
consistent with how every other sized instruction is named.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210674 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:37 +00:00
Matt Arsenault
b97095b94f R600/SI: Refactor local atomics.
Use patterns that will also match the immediate offset to
match the normal read / writes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:34 +00:00
Matt Arsenault
8a9df8f92c R600/SI: Use v_cvt_f32_ubyte* instructions
This eliminates extra extract instructions when loading an i8 vector to
a float vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210666 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 17:50:44 +00:00
Matt Arsenault
a2dca4cc04 R600/SI: Fix selection failure on scalar_to_vector
There seem to be only 2 places that produce these,
and it's kind of tricky to hit them.

Also fixes failure to bitcast between i64 and v2f32,
although this for some reason wasn't actually broken in the
simple bitcast testcase, but did in the scalar_to_vector one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210664 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 17:40:32 +00:00
Daniel Sanders
0ee5398b7f [mips][mips64r6] Improve tests affected by the changes to multiplies and divides
Summary:
MIPS32r6/MIPS64r6 support has not been added yet.

inlineasm-cnstrnt-reg.ll:
  Explicitly specify the CPU since it will not work on MIPS32r6/MIPS64r6
  when -integrated-as is the default. We can't change the mnemonic since the
  LO register is an implicit def of mtlo and MIPS32r6/MIPS64r6 has no
  instructions that use LO.

2008-08-01-AsmInline.ll:
  Explicitly specify the CPU since MIPS32r6/MIPS64r6 will correctly emit
  different code and this is a regression test.

mips64instrs.ll and mips64muldiv.ll
  Check registers and the way the multiply is used in m1

divrem.ll
  Check registers and use multiple filecheck prefixes to limit redundancy

Reviewers: vmedic, jkolek, zoran.jovanovic, matheusalmeida

Reviewed By: matheusalmeida

Subscribers: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210656 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 15:48:00 +00:00
Cameron McInally
998d8f50a7 Add AVX512 masked leadz instrinsic support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 12:54:45 +00:00
Andrea Di Biagio
a069e64112 [X86] Refactor the logic to select horizontal adds/subs to a helper function.
This patch moves part of the logic implemented by the target specific
combine rules added at r210477 to a separate helper function.
This should make easier to add more rules for matching AVX/AVX2 horizontal
adds/subs.

This patch also fixes a problem caused by a wrong check performed on indices
of extract_vector_elt dag nodes in input to the scalar adds/subs.

New tests have been added to verify that we correctly check indices of
extract_vector_elt dag nodes when selecting a horizontal operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 07:57:50 +00:00
Jiangning Liu
f847ccb87a Global merge for global symbols.
This commit is to improve global merge pass and support global symbol merge.
The global symbol merge is not enabled by default. For aarch64, we need some
more back-end fix to make it really benifit ADRP CSE.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 06:44:53 +00:00
Juergen Ributzka
0adbcf3ba9 [FastISel][X86] Extend support for {s|u}{add|sub|mul}.with.overflow intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 23:52:44 +00:00
Matt Arsenault
1f4772305a R600: Use BCNT_INT for evergreen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210569 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:18:28 +00:00
Matt Arsenault
69891c0115 R600/SI: Implement i64 ctpop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210568 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:18:24 +00:00
Matt Arsenault
ee9772d9dd R600/SI: Use bcnt instruction for ctpop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:18:21 +00:00
Matt Arsenault
bfd00e21b7 R600: Handle fcopysign
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:00:20 +00:00
Matt Arsenault
0ba78a9121 R600/SI: Handle sign_extend and zero_extend to i64 with patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210563 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 18:54:59 +00:00
Reed Kotler
805c9e4943 Do Materialize Floating Point in Mips Fast-Isel
Summary:
Implement materialize of floating point literals in Mips Fast-Isel

Reopened version of D3659

Test Plan: simplestorefp1.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4071

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210546 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:45:44 +00:00
Andrea Di Biagio
c0edcf7de8 [X86] Improved target combine rules for selecting horizontal add/sub.
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for 
the VEX.256 encoded versions of horizontal add/sub.

For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.

Added a new test case into haddsub-2.ll.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:42:57 +00:00
Adam Nemet
8dea1c4167 [X86] AVX512: Add vmovntdqa
Along with the corresponding intrinsic and tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210543 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:39:53 +00:00
Renato Golin
2d89932fb2 Fix a bug in the Thumb1 ARM Load/Store optimizer
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.

This patch fixes PR19972. Patch by Moritz Roth.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210542 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:39:21 +00:00
Tom Stellard
102d0f3e3f SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC
The SelectionDAG bad a special case for ISD::SELECT_CC, where it would
allow targets to specify:

setOperationAction(ISD::SELECT_CC, MVT::Other, Expand);

to indicate that they wanted to expand ISD::SELECT_CC for all types.
This wasn't applied correctly everywhere, and it makes writing new
DAG patterns with ISD::SELECT_CC difficult.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210541 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:01:29 +00:00