13507 Commits

Author SHA1 Message Date
Elena Demikhovsky
9cc691fa05 AVX-512, X86: Added lowering for shift operations for SKX.
The other changes in the LowerShift() are not functional,
just to make the code more convenient.
So, the functional changes for SKX only.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 13:25:46 +00:00
John Brawn
e38f45effc [ARM] Use AEABI aligned function variants
AEABI defines aligned variants of memcpy etc. that can be faster than
the default version due to not having to do alignment checks. When
emitting target code for these functions make use of these aligned
variants if possible. Also convert memset to memclr if possible.

Differential Revision: http://reviews.llvm.org/D8060


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 13:13:38 +00:00
Igor Laevsky
00666a17ff Reverse ordering of base and derived pointer during safepoint lowering.
According to the documentation in StackMap section for the safepoint we should have:
"The first Location in each pair describes the base pointer for the object. The second is the derived pointer actually being relocated."
But before this change we emitted them in reverse order - derived pointer first, base pointer second.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 13:12:14 +00:00
Vasileios Kalintiris
98ed1175d9 [mips][FastISel] Handle calls with non legal types i8 and i16.
Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel.

Based on a patch by Reed Kotler.

Test Plan:
"Make check" test forthcoming.
Test-suite passes at O0/O2 and with mips32 r1/r2

Reviewers: rkotler, dsanders

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D6770

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237121 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 12:29:17 +00:00
Vasileios Kalintiris
d8b1e16033 [mips][FastISel] Simplify callabi.ll by using multiple check prefixes.
Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9635

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237119 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 12:17:11 +00:00
Vasileios Kalintiris
4ffee4b853 [mips][FastISel] Allow computation of addresses from constant expressions.
Summary:
Try to compute addresses when the offset from a memory location is a constant
expression.

Based on a patch by Reed Kotler.

Test Plan:
Passes test-suite for -O0/O2 and mips 32 r1/r2

Reviewers: rkotler, dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D6767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 12:08:31 +00:00
Elena Demikhovsky
cfff317af7 AVX-512: select operation for i1 vectors
like: select i1 %cond, <16 x i1> %a, <16 x i1> %b.
I added pseudo-CMOV patterns to resolve the "select".
Added tests for KNL and SKX.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237106 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 09:36:52 +00:00
Michael Kuperstein
9cf6c24660 [X86] DAGCombine should not assume arbitrary vector types are simple
The X86-specific DAGCombine for stores should not assume vector types are always simple.
This fixes PR23476.

Differential Revision: http://reviews.llvm.org/D9659

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237097 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 07:33:07 +00:00
Eric Christopher
0552d51c45 Migrate existing backends that care about software floating point
to use the information in the module rather than TargetOptions.

We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.

For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.

Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 01:26:05 +00:00
Andrew Kaylor
e04fe6c135 [WinEH] Handle nested landing pads that return directly to the parent function.
Differential Revision: http://reviews.llvm.org/D9684



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237063 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 23:06:02 +00:00
Andrew Kaylor
aeeca679f7 [WinEH] Update exception numbering to give handlers their own base state.
Differential Revision: http://reviews.llvm.org/D9512



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237014 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 19:41:19 +00:00
Pirama Arumuga Nainar
c9f2596abc [X86] Updates to X86 backend for f16 promotion
Summary:
r235215 adds support for f16 to be considered as a load/store type and
promote f16 operations to f32.

This patch has miscellaneous fixes for the X86 backend so all f16
operations are handled:
1. Set loadextaction for f16 vectors to expand.
2. Handle FP_EXTEND in a switch statement when handling v2f32
3. Do not fold (FP_TO_SINT (load f16)) into FP_TO_INT*_IN_MEM or
(store (SINT_TO_FP )) to a FILD.

Tests included.

Reviewers: ab, srhines, delena

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 17:14:39 +00:00
Elena Demikhovsky
63df7cd4ea AVX-512: Changed CC parameter in "cmp" intrinsic
from i8 to i32 according to the Intel Spec

by Igor Breger (igor.breger@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236979 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 09:03:14 +00:00
Elena Demikhovsky
8189eb4d7e AVX-512: Added SKX instructions and intrinsics:
{add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae

By Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236971 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-11 06:05:05 +00:00
Elena Demikhovsky
c7c44fa75e AVX-512: fixed UINT_TO_FP operation for 512-bit types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236955 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-10 14:23:52 +00:00
Elena Demikhovsky
59a0fe6e3f AVX-512: fixed a bug in i1 vectors lowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236947 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-10 10:33:32 +00:00
NAKAMURA Takumi
31e094b55d llvm/test/CodeGen/AArch64/tailcall_misched_graph.ll: s/REQUIRE/REQUIRES/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236928 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-09 05:59:00 +00:00
James Y Knight
01d1787830 Fix MergeConsecutiveStore for non-byte-sized memory accesses.
The bug showed up as a compile-time assertion failure:
  Assertion `NumBits >= MIN_INT_BITS && "bitwidth too small"' failed
when building msan tests on x86-64.

Prior to r236850, this bug was masked due to a bogus alignment check,
which also accidentally rejected non-byte-sized accesses. Afterwards,
an invalid ElementSizeBytes == 0 got further into the function, and
triggered the assertion failure.

It would probably be a good idea to allow it to handle merging stores
of unusual widths as well, but for now, to un-break it, I'm just
making the minimal fix.

Differential Revision: http://reviews.llvm.org/D9626

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236927 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-09 03:13:37 +00:00
Pete Cooper
f5b930b2e2 [Fast-ISel] Don't mark the first use of a remat constant as killed.
When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to
mark the vreg containing 1000 as killed.  Given that we go bottom up in fast-isel, a later
use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill.

However, rematerialised constant expressions aren't generated bottom up.  The local value save area
grows downwards.  This means that if you remat 2 constant expressions which both use 1000 then the
first will kill it, then the second, which is *lower* in the BB will read a killed register.

This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset.

Note that this commit only makes kill flag generation conservative.  There's nothing else obviously wrong with
the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions.

However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236922 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-09 00:51:03 +00:00
Arnold Schwaighofer
75e36e847e ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.

Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.

Fix for PR23459.

rdar://20740035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236916 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 23:52:00 +00:00
Hans Wennborg
262697d9d8 Switch lowering: cluster adjacent fall-through cases even at -O0
It's cheap to do, and codegen is much faster if cases can be merged
into clusters.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236905 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 21:23:39 +00:00
Pete Cooper
f9f04c25b2 [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.
When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register.  This is done via updateValueMap,.

However, its possible that the source register we are rewriting *to* to also have uses.  If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails.

This code checks for the case where the to register is also used, and if so it clears all kill on the from register.  This is conservative, but better that always clearing kills on the from register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236897 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 20:46:54 +00:00
Brendon Cahoon
74b576041a [Hexagon] Generate more hardware loops
Refactored parts of the hardware loop pass to generate
more. Also, added more tests.

Differential Revision: http://reviews.llvm.org/D9568


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236896 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 20:18:21 +00:00
Pete Cooper
c2347d5cf1 [X86] Fast-ISel was incorrectly always killing the source of a truncate.
A trunc from i32 to i1 on x86_64 generates an instruction such as

%vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9

However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one.
Otherwise, we are killing the source of the truncate which could be used later in the program.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236890 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 18:29:42 +00:00
Pat Gavlin
5c7f7462e4 Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware.
This changes the shape of the statepoint intrinsic from:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args)

to:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args)

This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back.

In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation.

Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops.

Differential Revision: http://reviews.llvm.org/D9501

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236888 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 18:07:42 +00:00
Pete Cooper
d90099d36c Clear kill flags on all used registers when sinking instructions.
The test here was sinking the AND here to a lower BB:

	%vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8
	TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8

which meant that vreg8 was read after it was killed.

This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236886 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 17:54:32 +00:00
Brendon Cahoon
7fd56b1e4a [Hexagon] Update AnalyzeBranch, etc target hooks
Improved the AnalyzeBranch, InsertBranch, and RemoveBranch
functions in order to handle more of our branch instructions.
This requires changes to analyzeCompare and PredicateInstructions.
Specifically, we've added support for new value compare jumps,
improved handling of endloop, added more compare instructions,
and improved support for predicate instructions.

Differential Revision: http://reviews.llvm.org/D9559


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236876 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 16:16:29 +00:00
Andrea Di Biagio
405e5f276b [X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when decoding a PSHUFB mask.
The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes
where the mask node is a load from constant pool, and the constant pool node
is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching
it how to also look through X86ISD::WrapperRIP.

This helps function combineX86ShufflesRecusively to combine more shuffle
sequences containing PSHUFB nodes if we are in RIPRel PIC mode.

Before this change, llc (with -relocation-model=pic -march=x86-64) was unable
to decode a pshufb where the mask was loaded from a constant pool. For example,
the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its
operand, so instead of generating a single 'movaps' the backend always
generated a sub-optimal 'movdqa + pshufb' sequence.

Added test x86-fold-pshufb.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236863 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 15:11:07 +00:00
James Y Knight
ff3820a182 Fix test added in r236850 for OSX builders.
Need to specify triple so that llvm emits the asm syntax that the
test expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 14:04:54 +00:00
James Y Knight
e9359f427e Fix alignment checks in MergeConsecutiveStores.
1) check whether the alignment of the memory is sufficient for the
*merged* store or load to be efficient.

Not doing so can result in some ridiculously poor code generation, if
merging creates a vector operation which must be aligned but isn't.

2) DON'T check that the alignment of each load/store is equal. If
you're merging 2 4-byte stores, the first *might* have 8-byte
alignment, but the second certainly will have 4-byte alignment. We do
want to allow those to be merged.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236850 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 13:47:01 +00:00
Vasileios Kalintiris
ca33d72658 [mips] Emit the .insn directive for empty basic blocks.
Summary:
In microMIPS, labels need to know whether they are on code or data. This is
indicated with STO_MIPS_MICROMIPS and can be inferred by being followed
by instructions. For empty basic blocks, we can ensure this by emitting the
.insn directive after the label.

Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the
exception handling regression tests under: SingleSource/Regression/C++/EH

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9530

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236815 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 09:10:15 +00:00
Eric Christopher
7a5d9c0611 Now that we have a soft-float attribute, use it instead of the
hard coded command line option for the Mips soft float tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236801 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 00:57:22 +00:00
Pete Cooper
e4eff4b231 Clear kill flags in tail duplication.
If we duplicate an instruction then we must also clear kill flags on any uses we rewrite.
Otherwise we might be killing a register which was used in other BBs.

For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated.

	%vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10
	%vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10

	The copy here is inserted after the add and so needs vreg10 to be live.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236782 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 21:48:26 +00:00
Pete Cooper
d887b7ef2f [AArch64] Fix sext/zext folding in address arithmetic.
We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there.

Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236764 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 19:21:36 +00:00
Nemanja Ivanovic
308873bcb8 Add VSX Scalar loads and stores to the PPC back end
This patch corresponds to review:
http://reviews.llvm.org/D9440

It adds a new register class to the PPC back end to contain single precision
values in VSX registers. Additionally, it adds scalar loads and stores for
VSX registers.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236755 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 18:24:05 +00:00
Diego Novillo
aa46024ea3 Fix information loss in branch probability computation.
Summary:
This addresses PR 22718. When branch weights are too large, they were
being clamped to the range [1, MaxWeightForBB]. But this clamping is
only applied to edges that go outside the range, so it distorts the
relative branch probabilities.

This patch changes the weight calculation to scale every branch so the
relative probabilities are preserved. The scaling is done differently
now. First, all the branch weights are added up, and if the sum exceeds
32 bits, it computes an integer scale to bring all the weights within
the range.

The patch fixes an existing test that had slightly wrong branch
probabilities due to the previous clamping. It now gets branch weights
scaled accordingly.

Reviewers: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236750 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 17:22:06 +00:00
Sanjay Patel
39cf555429 [x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507)
Finish the job that was abandoned in D6958 following the refactoring in
http://reviews.llvm.org/rL230221:

1. Uncomment the intrinsic def for the AVX r_Int instruction.
2. Add missing r_Int entries to the load folding tables; there are already
   tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I
   haven't added any more in this patch.
3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ).

So instead of this:

  movaps	%xmm0, %xmm1
  rcpss	%xmm1, %xmm1
  movss	%xmm1, %xmm0

We should now get:

  rcpss	%xmm0, %xmm0

And instead of this:

  vsqrtss	%xmm0, %xmm0, %xmm1
  vblendps	$1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3]

We should now get:

  vsqrtss	%xmm0, %xmm0, %xmm0


Differential Revision: http://reviews.llvm.org/D9504



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236740 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 15:48:53 +00:00
Elena Demikhovsky
d08d0340e5 AVX-512: Added all forms of FP compare instructions for KNL and SKX.
Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236714 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 11:24:42 +00:00
NAKAMURA Takumi
a3ded6b432 llvm/test/CodeGen/X86/llc-override-mcpu-mattr.ll: Tweak not to be affected by x64 Calling Convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236710 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 10:18:28 +00:00
Akira Hatanaka
e6f0494cd8 Let llc and opt override "-target-cpu" and "-target-features" via command line
options.

This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't
override function attributes "-target-cpu" and "-target-features" in the IR.

Differential Revision: http://reviews.llvm.org/D9537


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 23:54:14 +00:00
Pete Cooper
28b0dda32e Handle dead defs in the if converter.
We had code such as this:
  r2 = ...
  t2Bcc

label1:
  ldr ... r2

label2;
  return r2<dead, def>

The if converter was transforming this to
   r2<def> = ...
   return [pred] r2<dead,def>
   ldr <r2, kill>
   return

which fails the machine verifier because the ldr now reads from a dead def.

The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list.  The caller then clears the dead flag from the def is the value is live.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 22:51:04 +00:00
Pete Cooper
537ff782aa Fix incorrect kill flags in fastisel.
If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use.

Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236650 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 22:09:29 +00:00
Pete Cooper
0040d179d2 [x86] Fix register class of folded load index reg.
When folding a load in to another instruction, we need to fix the class of the index register
Otherwise, it could be something like GR64 not GR64_NOSP and would fail the machine verifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236644 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 21:37:19 +00:00
Tim Northover
84b8c10729 CodeGen: move over-zealous assert into actual if statement.
It's quite possible to encounter an insertvalue instruction that's more deeply
nested than the value we're looking for, but when that happens we really
mustn't compare beyond the end of the index array.

Since I couldn't see any guarantees about what comparisons std::equal makes, we
probably need to directly check the size beforehand. In practice, I suspect
most std::equal implementations would probably bail early, which would be OK.
But just in case...

rdar://20834485

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236635 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 20:07:38 +00:00
Duncan P. N. Exon Smith
7838051bda DwarfDebug: Emit number of bytes in .debug_loc entry directly
Emit the number of bytes in a `.debug_loc` entry directly.  The old code
created temp labels (expensive), emitted the difference between them,
and then emitted one on each side of the relevant bytes.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`
(the optimized version of ld64's `-save-temps` when linking the
`verify-uselistorder` executable in an LTO bootstrap).  I've hacked
`MCContext::Allocate()` to just call `malloc()` instead of using the
`BumpPtrAllocator` so that the heap profile is easier to read.  As far
as peak memory is concerned, `MCContext::Allocate()` is equivalent to a
leak, since it only gets freed at process teardown.

In my heap profile, this patch drops memory usage of
`DwarfDebug::emitDebugLoc()` from 132.56 MB (11.4%) down to 29.86 MB
(2.7%) at peak memory.  Some of that must be noise from `SmallVector`
(or other) allocations -- peak memory only dropped from 1160 MB down to
1100 MB -- but this nevertheless shaves 5% off the top.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 19:11:20 +00:00
Pawel Bylica
5304314702 Readd the regression test from r236584. Calling convention fixed to linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 16:43:21 +00:00
Pete Cooper
99413f0d40 [ARM] Fast-Isel was incorrectly selecting <2 x double> adds.
With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add.

However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it."

This commit disables SelectBinaryFPOp for any vector types.  There's already a FIXME to try handle neon.  Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236609 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 16:39:17 +00:00
Bill Schmidt
982f60be44 [PPC64LE] Adjust vector splats during VSX swap optimization
The initial code drop for VSX swap optimization permitted the
optimization only when all operations in a web of related computation
are lane-insensitive.  For some lane-sensitive operations, we can
still permit the optimization provided that we make adjustments to
those operations.  This patch adds special handling for vector splats
so that their presence doesn't kill the optimization.

Vector splats are lane-sensitive since they identify by number a
vector element to be used as the source of a splat.  When swap
optimizations take place, the desired vector element will move to the
opposite doubleword of the quadword vector.  We thus replace the index
I by (I + N/2) % N, where N is the number of elements in the vector.

A new test case is added to test that swap optimization succeeds when
vector splats are present, and that the proper input element is used
as the source of the splat.

An ancillary change removes SH_BUILDVEC as one of the kinds of special
handling that may be required by VSX swap optimization.  From
experience with GCC, I had expected to need some modifications for
vector build operations, but I did not find that to be the case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236606 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 15:40:46 +00:00
Artyom Skrobov
78fc2103c9 [ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236590 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 11:44:10 +00:00
Pawel Bylica
5fa37a7f2d Revert regression test from r236584.
Temporary remove a regression test added in r236584. It fails on Windows.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236586 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 10:41:46 +00:00