32871 Commits

Author SHA1 Message Date
Bob Wilson
5820b51546 Fix incorrect immediate size for AddrModeT2_i8s4 in rewriteT2FrameIndex.
The natural way to handle this addressing mode would be to say that it has
8 bits and gets scaled by 4, but since the MC layer is expecting the scaling
to be already reflected in the immediate value, we have been setting the
Scale to 1. That's fine, but then NumBits needs to be adjusted to reflect
the effective increase in the range of the immediate. That adjustment was
missing.

The consequence is that the register scavenger can fail.
The estimateRSStackSizeLimit() function in ARMFrameLowering.cpp correctly
assumes that the AddrModeT2_i8s4 address mode can handle scaled offsets up to
1020. Under just the right circumstances, we fail to reserve space for the
scavenger because it thinks that nothing will be needed. However, the overly
pessimistic behavior in rewriteT2FrameIndex causes some frame indexes to be
out of range and require scavenged registers, and so the scavenger asserts.

Unfortunately I have not been able to come up with a testcase for this. I
can only reproduce it on an internal branch where the frame layout and
register allocation is slightly different than trunk. We really need a
way to serialize MachineInstr-level IR to write reasonable tests for things
like this.

rdar://problem/19909005

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230233 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 16:57:19 +00:00
Bruno Cardoso Lopes
77d2363908 [X86][MMX] Add MMX instructions to foldable tables
Teach the peephole optimizer to work with MMX instructions by adding
entries into the foldable tables. This covers folding opportunities not
handled during isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230226 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 15:23:22 +00:00
Bruno Cardoso Lopes
c606f3a3cb [X86][MMX] Support folding loads in psll, psrl and psra intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230225 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 15:23:14 +00:00
Elena Demikhovsky
fdafc8fd5e AVX-512: recommitted 229837 + bugfix + test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 15:12:31 +00:00
Elena Demikhovsky
d8e5adcd92 restructured X86 scalar unary operation templates
I made the templates general, no need to define pattern separately for each instruction/intrinsic.
Now only need to add r_Int pattern for AVX.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230221 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 14:14:02 +00:00
NAKAMURA Takumi
d40ce7ac2b Fix a warning on HexagonMCCodeEmitter::MCII. [-Wunused-private-field]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230170 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-22 09:58:29 +00:00
Craig Topper
f9c1605d56 [X86] Add some missing redundant MMX and SSE encodings for disassembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230165 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-22 07:50:41 +00:00
Matt Arsenault
29f97a6c46 R600/SI: Use v_madmk_f32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230149 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 21:29:10 +00:00
Matt Arsenault
c490f78e53 R600/SI: Try to use v_madak_f32
This is a code size optimization when the constant
only has one use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230148 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 21:29:07 +00:00
Matt Arsenault
9036390498 R600/SI: Don't crash when getting immediate operand size
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230147 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 21:29:04 +00:00
Matt Arsenault
dc9d5dcdd7 R600/SI: Fix mad*k definitions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230146 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 21:29:00 +00:00
Benjamin Kramer
2b17108064 Remove dead prototype.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 14:35:00 +00:00
Benjamin Kramer
edf99a5e3f X86: Remove custom lowering of SIGN_EXTEND_INREG
This was just replicating logic from the legalizer. Covered by existing
tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230136 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 14:31:29 +00:00
Eric Christopher
9494699d5e Remove obsolete comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230134 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 08:48:23 +00:00
Eric Christopher
113747defd Have the MipsAsmPrinter fp stub emission code take a custom
MCSubtargetInfo as the MachineFunction has gone away and we need
to emit code at the module level.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230133 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 08:48:22 +00:00
Eric Christopher
68992caa2e Turn an if+llvm_unreachable into an assert and reword comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230132 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 08:32:38 +00:00
Eric Christopher
3a389c6950 Endianness can be gotten from the DataLayout which we already
have. Also, the subtarget is invalid at this point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230131 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 08:32:22 +00:00
David Majnemer
164db1c6b9 X86: Call __main using the SelectionDAG
Synthesizing a call directly using the MI layer would confuse the frame
lowering code.  This is problematic as frame lowering is highly
sensitive the particularities of calls, etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 05:49:45 +00:00
Tim Northover
ca7e0787f0 CodeGen: convert CCState interface to using ArrayRefs
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.

There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230118 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 02:11:17 +00:00
David Majnemer
e95985d3a0 Win64: Stack alignment constraints aren't applied during SET_FPREG
Stack realignment occurs after the prolog, not during, for Win64.
Because of this, don't factor in the maximum stack alignment when
establishing a frame pointer.

This fixes PR22572.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230113 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 01:04:47 +00:00
Reid Kleckner
4b91be0289 X86: Remove pre-2010 dead code in mergeSPUpdatesDown
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 22:13:25 +00:00
Simon Pilgrim
0de2c870d8 LowerScalarImmediateShift - Merged v16i8 and v32i8 shift lowering. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 22:13:03 +00:00
Matt Arsenault
16fc5e9c0f R600/SI: Remove v_sub_f64 pseudo
The expansion code does the same thing. Since
the operands were not defined with the correct
types, this has the side effect of fixing operand
folding since the expanded pseudo would never use
SGPRs or inline immediates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230072 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 22:10:45 +00:00
Matt Arsenault
bbb748eece R600: Use new fmad node.
This enables a few useful combines that used to only
use fma.

Also since v_mad_f32 apparently does not support denormals,
disable the existing cases that are custom handled if they are
requested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230071 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 22:10:41 +00:00
Jozef Kolek
b2e79a8e69 Reversed revision 229706. The reason is regression, which is caused by the
usage of instruction ADDU16 by CodeGen. For this instruction an improper
register is allocated, i.e. the register that is not from register set defined
for the instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230053 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 20:26:52 +00:00
Eric Christopher
d53224014d Fix an asan use-after-free bug introduced by the asm printer
changes to remove non-Function based subtargets out of the asm
printer. For module level emission we'll need to construct up
an MCSubtargetInfo so that we can encode instructions for
emission.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230050 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 19:54:07 +00:00
Andrea Di Biagio
3583d23018 [X86][FastIsel] Teach how to select float-half conversion intrinsics.
This patch teaches X86FastISel how to select intrinsic 'convert_from_fp16' and
intrinsic 'convert_to_fp16'.
If the target has F16C, we can select VCVTPS2PHrr for a float-half conversion,
and VCVTPH2PSrr for a half-float conversion.

Differential Revision: http://reviews.llvm.org/D7673


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230043 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 19:37:14 +00:00
Eric Christopher
dd38f4e94d Remove a use of the Subtarget in the darwin ppc asm printer.
EmitFunctionStubs is called from doFinalization and so can't
depend on the Subtarget existing. It's also irrelevant as
we know we're darwin since we're in the darwin asm printer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230039 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 18:53:42 +00:00
Eric Christopher
6de800e056 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230037 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 18:44:15 +00:00
Sanjay Patel
74e8bf678a canonicalize a v2f64 blendi of 2 registers
This canonicalization step saves us 3 pattern matching possibilities * 4 math ops
for scalar FP math that uses xmm regs. The backend can re-commute the operands
post-instruction-selection if that makes register allocation better.

The tests in llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll cover this scenario already,
so there are no new tests with this patch.

Differential Revision: http://reviews.llvm.org/D7777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 16:55:27 +00:00
Kit Barton
3e00ca983c I incorrectly marked the VORC instruction as isCommutable when I added it.
This fix removes the VORC instruction definition from the isCommutable block.

Phabricator review: http://reviews.llvm.org/D7772


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230020 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 15:54:58 +00:00
Chandler Carruth
634fc5f26b [x86] Switching the shuffle equivalence test to a variadic template was
the wrong answer. We also got initializer lists which are *way* cleaner
for this kind of thing. Let's use those and make this a normal, boring
functionn accepting ArrayRef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 10:47:28 +00:00
Eric Christopher
05e2b94f35 Fix wording and grammar in Mips subtarget options.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230001 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:42:34 +00:00
Eric Christopher
d8210e33d4 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230000 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:39:06 +00:00
Eric Christopher
f179b3f1d9 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229999 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:24:37 +00:00
Eric Christopher
3ce9f152e4 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229998 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:24:34 +00:00
Eric Christopher
b661ab1cbd Save the MachineFunction in startFunction so that we can use it for
lookups of the subtarget later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229996 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:01:55 +00:00
Eric Christopher
7b0c988b90 Use the cached subtarget from the MachineFunction rather than
doing a lookup on the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:01:52 +00:00
Eric Christopher
c9d0715997 Make the TargetMachine::getSubtarget that takes a Function argument
take a reference to match the getSubtargetImpl that takes a Function
argument.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229994 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 07:32:59 +00:00
Nick Lewycky
12cbedbaee Fix build in release mode, -Wunused-variable on this lambda function used only in an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229977 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 07:16:17 +00:00
David Blaikie
7be2b85e1e Fix -Wunused-variable warning in non-asserts build, and optimize a little bit while I'm here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229970 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 06:28:38 +00:00
Hal Finkel
2c5f9584ba [PowerPC] Loop Data Prefetching for the BG/Q
The IBM BG/Q supercomputer's A2 cores have a hardware prefetching unit, the
L1P, but it does not prefetch directly into the A2's L1 cache. Instead, it
prefetches into its own L1P buffer, and the latency to access that buffer is
significantly higher than that to the L1 cache (although smaller than the
latency to the L2 cache). As a result, especially when multiple hardware
threads are not actively busy, explicitly prefetching data into the L1 cache is
advantageous.

I've been using this pass out-of-tree for data prefetching on the BG/Q for well
over a year, and it has worked quite well. It is enabled by default only for
the BG/Q, but can be enabled for other cores as well via a command-line option.

Eventually, we might want to add some TTI interfaces and move this into
Transforms/Scalar (there is nothing particularly target dependent about it,
although only machines like the BG/Q will benefit from its simplistic
strategy).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229966 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 05:08:21 +00:00
Chandler Carruth
efbbaefea5 [x86] Remove the old vector shuffle lowering code and its flag.
The new shuffle lowering has been the default for some time. I've
enabled the new legality testing by default with no really blocking
regressions. I've fuzz tested this very heavily (many millions of fuzz
test cases have passed at this point). And this cleans up a ton of code.
=]

Thanks again to the many folks that helped with this transition. There
was a lot of work by others that went into the new shuffle lowering to
make it really excellent.

In case you aren't using a diff algorithm that can handle this:
  X86ISelLowering.cpp: 22 insertions(+), 2940 deletions(-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229964 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 04:25:04 +00:00
Chandler Carruth
07ef8904ad [x86] Now that the new vector shuffle legality is enabled and everything
is going well, remove the flag and the code for the old legality tests.

This is the first step toward removing the entire old vector shuffle
lowering. *Much* more code to delete coming up next.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229963 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 03:59:35 +00:00
Chandler Carruth
38749b8e07 [x86] Make the new vector shuffle legality test on by default, which
reflects the fact that the x86 backend can in fact lower any shuffle you
want it to with reasonably high code quality.

My recent work on the new vector shuffle has made this regress *very*
little. The diff in the test cases makes me very, very happy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229958 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 03:05:47 +00:00
Eric Christopher
8c4bb575e1 Revert "AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics."
The instructions were being generated on architectures that don't support avx512.

This reverts commit r229837.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229942 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 00:45:28 +00:00
Eric Christopher
74678a1ed1 Add a license header to the AVX512 file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229941 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 00:36:53 +00:00
Ahmed Bougacha
5898fc70ec [ARM] Re-re-apply VLD1/VST1 base-update combine.
This re-applies r223862, r224198, r224203, and r224754, which were
reverted in r228129 because they exposed Clang misalignment problems
when self-hosting.

The combine caused the crashes because we turned ISD::LOAD/STORE nodes
to ARMISD::VLD1/VST1_UPD nodes.  When selecting addressing modes, we
were very lax for the former, and only emitted the alignment operand
(as in "[r1:128]") when it was larger than the standard alignment of
the memory type.

However, for ARMISD nodes, we just used the MMO alignment, no matter
what.  In our case, we turned ISD nodes to ARMISD nodes, and this
caused the alignment operands to start being emitted.

And that's how we exposed alignment problems that were ignored before
(but I believe would have been caught with SCTRL.A==1?).

To fix this, we can just mirror the hack done for ISD nodes:  only
take into account the MMO alignment when the access is overaligned.

Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

rdar://19717869, rdar://14062261.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229932 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:52:41 +00:00
Ahmed Bougacha
7f47189b51 [ARM] Minor cleanup to CombineBaseUpdate. NFC.
In preparation for a future patch:
- rename isLoad to isLoadOp: the former is confusing, and can be taken
  to refer to the fact that the node is an ISD::LOAD.  (it isn't, yet.)
- change formatting here and there.
- add some comments.
- const-ify bools.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229929 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:30:37 +00:00
Ahmed Bougacha
953c5c9458 [CodeGen] Use ArrayRef instead of std::vector&. NFC.
The former lets us use SmallVectors.  Do so in ARM and AArch64.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229925 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:13:10 +00:00