6365 Commits

Author SHA1 Message Date
Pat Gavlin
5c7f7462e4 Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware.
This changes the shape of the statepoint intrinsic from:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args)

to:

  @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args)

This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back.

In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation.

Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops.

Differential Revision: http://reviews.llvm.org/D9501

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236888 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 18:07:42 +00:00
Andrea Di Biagio
405e5f276b [X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when decoding a PSHUFB mask.
The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes
where the mask node is a load from constant pool, and the constant pool node
is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching
it how to also look through X86ISD::WrapperRIP.

This helps function combineX86ShufflesRecusively to combine more shuffle
sequences containing PSHUFB nodes if we are in RIPRel PIC mode.

Before this change, llc (with -relocation-model=pic -march=x86-64) was unable
to decode a pshufb where the mask was loaded from a constant pool. For example,
the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its
operand, so instead of generating a single 'movaps' the backend always
generated a sub-optimal 'movdqa + pshufb' sequence.

Added test x86-fold-pshufb.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236863 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 15:11:07 +00:00
James Y Knight
e9359f427e Fix alignment checks in MergeConsecutiveStores.
1) check whether the alignment of the memory is sufficient for the
*merged* store or load to be efficient.

Not doing so can result in some ridiculously poor code generation, if
merging creates a vector operation which must be aligned but isn't.

2) DON'T check that the alignment of each load/store is equal. If
you're merging 2 4-byte stores, the first *might* have 8-byte
alignment, but the second certainly will have 4-byte alignment. We do
want to allow those to be merged.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236850 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 13:47:01 +00:00
Diego Novillo
aa46024ea3 Fix information loss in branch probability computation.
Summary:
This addresses PR 22718. When branch weights are too large, they were
being clamped to the range [1, MaxWeightForBB]. But this clamping is
only applied to edges that go outside the range, so it distorts the
relative branch probabilities.

This patch changes the weight calculation to scale every branch so the
relative probabilities are preserved. The scaling is done differently
now. First, all the branch weights are added up, and if the sum exceeds
32 bits, it computes an integer scale to bring all the weights within
the range.

The patch fixes an existing test that had slightly wrong branch
probabilities due to the previous clamping. It now gets branch weights
scaled accordingly.

Reviewers: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236750 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 17:22:06 +00:00
Sanjay Patel
39cf555429 [x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507)
Finish the job that was abandoned in D6958 following the refactoring in
http://reviews.llvm.org/rL230221:

1. Uncomment the intrinsic def for the AVX r_Int instruction.
2. Add missing r_Int entries to the load folding tables; there are already
   tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I
   haven't added any more in this patch.
3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ).

So instead of this:

  movaps	%xmm0, %xmm1
  rcpss	%xmm1, %xmm1
  movss	%xmm1, %xmm0

We should now get:

  rcpss	%xmm0, %xmm0

And instead of this:

  vsqrtss	%xmm0, %xmm0, %xmm1
  vblendps	$1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3]

We should now get:

  vsqrtss	%xmm0, %xmm0, %xmm0


Differential Revision: http://reviews.llvm.org/D9504



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236740 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 15:48:53 +00:00
Elena Demikhovsky
d08d0340e5 AVX-512: Added all forms of FP compare instructions for KNL and SKX.
Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236714 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 11:24:42 +00:00
NAKAMURA Takumi
a3ded6b432 llvm/test/CodeGen/X86/llc-override-mcpu-mattr.ll: Tweak not to be affected by x64 Calling Convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236710 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 10:18:28 +00:00
Akira Hatanaka
e6f0494cd8 Let llc and opt override "-target-cpu" and "-target-features" via command line
options.

This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't
override function attributes "-target-cpu" and "-target-features" in the IR.

Differential Revision: http://reviews.llvm.org/D9537


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 23:54:14 +00:00
Pete Cooper
0040d179d2 [x86] Fix register class of folded load index reg.
When folding a load in to another instruction, we need to fix the class of the index register
Otherwise, it could be something like GR64 not GR64_NOSP and would fail the machine verifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236644 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 21:37:19 +00:00
Duncan P. N. Exon Smith
7838051bda DwarfDebug: Emit number of bytes in .debug_loc entry directly
Emit the number of bytes in a `.debug_loc` entry directly.  The old code
created temp labels (expensive), emitted the difference between them,
and then emitted one on each side of the relevant bytes.

(I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`
(the optimized version of ld64's `-save-temps` when linking the
`verify-uselistorder` executable in an LTO bootstrap).  I've hacked
`MCContext::Allocate()` to just call `malloc()` instead of using the
`BumpPtrAllocator` so that the heap profile is easier to read.  As far
as peak memory is concerned, `MCContext::Allocate()` is equivalent to a
leak, since it only gets freed at process teardown.

In my heap profile, this patch drops memory usage of
`DwarfDebug::emitDebugLoc()` from 132.56 MB (11.4%) down to 29.86 MB
(2.7%) at peak memory.  Some of that must be noise from `SmallVector`
(or other) allocations -- peak memory only dropped from 1160 MB down to
1100 MB -- but this nevertheless shaves 5% off the top.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 19:11:20 +00:00
Pawel Bylica
5304314702 Readd the regression test from r236584. Calling convention fixed to linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 16:43:21 +00:00
Pawel Bylica
5fa37a7f2d Revert regression test from r236584.
Temporary remove a regression test added in r236584. It fails on Windows.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236586 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 10:41:46 +00:00
Pawel Bylica
c0b7f693ce SelectionDAG: Handle out-of-bounds index in extract vector element
Summary: This patch correctly handles undef case of EXTRACT_VECTOR_ELT node where the element index is constant and not less than vector size.

Test Plan:
CodeGen for X86 test included.
Also one incorrect regression test fixed.

Reviewers: qcolombet, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D9250

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236584 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 10:19:14 +00:00
Sanjoy Das
5a8a366ddf [Statepoints] Remove broken test case.
statepoint-indirect-return.ll breaks on linux systems.  Delete the test
case to make the bots green while I figure out what the right fix is.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236568 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 02:51:46 +00:00
Sanjoy Das
d77522093e [StatepointLowering] Don't create temporary instructions. NFCI.
Summary:
Instead of creating a temporary call instruction and lowering that, use
SelectionDAGBuilder::lowerCallOperands.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9480

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236563 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 02:36:20 +00:00
Pete Cooper
9fb69672d6 [X86 fast-isel] Constrain the index reg class to not include SP.
The index reg on instructions with complex address modes is a GPR64_NOSP.  Constrain it to appease the machine verifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236557 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 23:41:53 +00:00
Reid Kleckner
4def1cbf5d Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236360.

This change exposed a bug in WinEHPrepare by opting win32 code into EH
preparation. We already knew that WinEHPrepare has bugs, and is the
status quo for x64, so I don't think that's a reason to hold off on this
change. I disabled exceptions in the sanitizer tests in r236505 and an
earlier revision.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236508 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 17:44:16 +00:00
Reid Kleckner
c5315e5d05 [X86] Fix assertion while DAG combining offsets and ExternalSymbols
ExternalSymbol nodes do not contain offsets, unlike GlobalValue nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236471 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 23:22:36 +00:00
Sanjay Patel
9cf57f7a88 zap windows line endings; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 21:27:27 +00:00
Elena Demikhovsky
fac3fb4244 AVX-512: added a test for encoding
by Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 12:59:15 +00:00
Elena Demikhovsky
125a76c502 AVX-512: added calling convention for i1 vectors in 32-bit mode.
Fixed some bugs in extend/truncate for AVX-512 target.
Removed VBROADCASTM (masked broadcast) node, since it is not used any more.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236420 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 12:40:50 +00:00
Elena Demikhovsky
70a6f4522a AVX-512: added integer "add" and "sub" instructions with saturation for SKX
with intrinsics and tests

by Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236418 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 12:35:55 +00:00
Elena Demikhovsky
869807297d AVX-512: Added VPACK* instructions forms for KNL and SKX
and their intrinsics
by Asaf Badouh (asaf.badouh@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236414 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 09:14:02 +00:00
Elena Demikhovsky
2d05c885ff Masked gather and scatter intrinsics - enabled codegen for KNL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-03 07:12:25 +00:00
Simon Pilgrim
d85813d9a5 [DAGCombiner] Enabled vector float/double -> int constant folding
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-02 13:04:07 +00:00
Simon Pilgrim
b5adf7c5f3 Line ending fix
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-02 11:50:47 +00:00
Simon Pilgrim
4f871770ff [SSE] Added vector int (i32 and i64) -> float/double conversion tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236385 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-02 11:42:47 +00:00
Simon Pilgrim
d087fe8e0b [SSE] Added vector float/double -> i32 and i64 conversion tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236384 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-02 11:18:47 +00:00
Eric Christopher
e243e06cbb Rework test to use FileCheck by making sure we have no xmm registers
with numbers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236373 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-02 01:06:17 +00:00
Reid Kleckner
039d60c254 Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236359. Things are still broken despite testing. :(

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236360 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 22:50:14 +00:00
Reid Kleckner
2701a7ff17 Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236340.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236359 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 22:40:25 +00:00
Reid Kleckner
053f7d148e Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236339, it breaks the win32 clang-cl self-host.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236340 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 20:14:04 +00:00
Reid Kleckner
018ed7b68b [WinEH] Add an EH registration and state insertion pass for 32-bit x86
This pass is responsible for constructing the EH registration object
that gets linked into fs:00, which is all it does in this change. In the
future, it will also insert stores to update the EH state number.

I considered keeping this functionality in WinEHPrepare, but it's pretty
separable and X86 specific. It has conceptually very little to do with
the task of WinEHPrepare, which is currently outlining.  WinEHPrepare is
also in theory useful on ARM, but this logic is pretty x86 specific.

Reviewers: andrew.w.kaylor, majnemer

Differential Revision: http://reviews.llvm.org/D9422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236339 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 20:04:54 +00:00
Simon Pilgrim
509fb2c84c [SelectionDAG] Unary vector constant folding integer legality fixes
This patch fixes issues with vector constant folding not correctly handling scalar input operands if they require implicit truncation - this was tested with llvm-stress as recommended by Patrik H Hagglund.

The patch ensures that integer input scalars from a build vector are correctly truncated before folding, and that constant integer scalar results are promoted to a legal type before inclusion in the new folded build vector.

I have added another crash test case and also a test for UINT_TO_FP / SINT_TO_FP using an non-truncated scalar input, which was failing before this patch.

Differential Revision: http://reviews.llvm.org/D9282

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236308 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 08:20:04 +00:00
Reid Kleckner
7a1b190bcd [X86] Use 4 byte preferred aggregate alignment on Win32
This helps reduce the frequency of stack realignment prologues in 32-bit
X86 Windows code. Before this change and the corresponding clang change,
we would take the max of the type preferred alignment and the explicit
alignment on the alloca.

If you don't override aggregate alignment in datalayout, you get a
default of 8. This dates back to 2007 / r34356, and changing it seems
prohibitively difficult at this point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 22:11:59 +00:00
Andrea Di Biagio
cdc4c42bac Fix comment in test. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236262 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 21:22:28 +00:00
Andrea Di Biagio
3b15669938 Fix for PR23103. Correctly propagate the 'IsUndef' flag to the register operands of a commuted instruction.
Revision 220239 exposed a latent bug in method
'TargetInstrInfo::commuteInstruction'. When commuting the operands of a machine
instruction, method 'commuteInstruction' didn't correctly propagate the
'IsUndef' flag to the register operands of the new (commuted) instruction.

Before this patch, the following instruction:
  %vreg4<def> = VADDSDrr  %vreg14, %vreg5<undef>; FR64:%vreg4,%vreg14,%vreg5

was wrongly converted by method 'commuteInstruction' into:
  %vreg4<def> = VADDSDrr  %vreg5, %vreg14<undef>; FR64:%vreg4,%vreg5,%vreg14

The correct instruction should have been:
  %vreg4<def> = VADDSDrr  %vreg5<undef>, %vreg14; FR64:%vreg4,%vreg5,%vreg14

This patch fixes the problem in method 'TargetInstrInfo::commuteInstruction'.
When swapping the operands of a machine instruction, we now make sure that
'IsUndef' flags are correctly set.
Added test case 'pr23103.ll'.

Differential Revision: http://reviews.llvm.org/D9406


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 21:03:29 +00:00
Pete Cooper
1870668beb Don't rewrite jumps to empty BBs to landing pads.
In the test case here, the 'unreachable' BB was removed by BranchFolding because its empty.

It then rewrote the jump from 'entry' to jump to its fallthrough, which was a landing pad.

This results in 'entry' jumping to 2 different landing pads, which fails the machine verifier.

rdar://problem/20750162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236248 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 18:58:23 +00:00
Simon Pilgrim
c8ee30be4f [SSE] Fix for MUL v16i8 on pre-SSE41 targets (PR23369).
Sign extension of i8 to i16 was placing the unpacked bytes in the lower byte instead of the upper byte.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236209 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 08:23:16 +00:00
Owen Anderson
36a398fe70 Semantically revert r236031, which is not a good idea for in-order targets.
At the least it should be guarded by some kind of target hook.
It also introduced catastrophic compile time and code quality
regressions on some out of tree targets (test case still being
reduced/sanitized).

Sanjay agreed with reverting this patch until these issues can be
resolved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236199 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 04:06:32 +00:00
Hans Wennborg
49baa9f896 Switch lowering: use profile info to build weight-balanced binary search trees
This will cause hot nodes to appear closer to the root.

The literature says building the tree like this makes it a near-optimal (in
terms of search time given key frequencies) binary search tree. In LLVM's case,
we can do up to 3 comparisons in each leaf node, so it might be better to opt
for lower tree height in some cases; that's something to look into in the
future.

Differential Revision: http://reviews.llvm.org/D9318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236192 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 00:57:37 +00:00
Pete Cooper
224f06e5dd Change x86 CMOVE_F to read it source, not write it.
This was breaking sqlite with the machine verifier because operand 0 was a def according to tablegen, but didn't have the 'isDef' flag set.

Looking at the ISA, its clear that this operand is a source as writing to st(0) is implicit.  So move the operand to the correct place in the td file.

rdar://problem/20751584

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 23:51:33 +00:00
Reid Kleckner
86b699c278 [X86] Avoid mangling frameescape labels
x86 Windows uses the '_' prefix for all global symbols, and this was
mistakenly being applied to frameescape labels, which are not externally
visible global symbols. They use the private global prefix 'L'.

The *right* way to fix this is probably to stop masquerading this label
as an ExternalSymbol and create a new SDNode type. These labels are not
"external", and we know they will be resolved by assembly time. Having a
custom SDNode type would allow us to do better X86 address mode
matching, so it's probably worth doing eventually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236123 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:46:01 +00:00
Duncan P. N. Exon Smith
e56023a059 IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:38:44 +00:00
Sanjay Patel
959b276771 transform fadd chains to increase parallelism
This is a compromise: with this simple patch, we should always handle a chain of exactly 3
operations optimally, but we're not generating the optimal balanced binary tree for a longer
sequence.

In general, this transform will reduce the dependency chain for a sequence of instructions
using N operands from a worst case N-1 dependent operations to N/2 dependent operations. 
The optimal balanced binary tree would reduce the chain to log2(N).

The trade-off for not dealing with longer sequences is: (1) we have less complexity in the
compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we
don't need to worry about the increased register pressure required to parallelize longer
sequences. It also seems unlikely that we would ever encounter really long strings of
dependent ops like that in the wild, but I'm not sure how to verify that speculation.
FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math
and this patch.

We can extend this patch to cover other associative operations such as fmul, fmax, fmin, 
integer add, integer mul.

This is a partial fix for:
https://llvm.org/bugs/show_bug.cgi?id=17305

and if extended:
https://llvm.org/bugs/show_bug.cgi?id=21768
https://llvm.org/bugs/show_bug.cgi?id=23116

The issue also came up in:
http://reviews.llvm.org/D8941

Differential Revision: http://reviews.llvm.org/D9232



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236031 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 21:03:22 +00:00
Elena Demikhovsky
83259d70bb Fixed crash of variable shift inst on AVX2
https://llvm.org/bugs/show_bug.cgi?id=22955



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235993 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 14:46:35 +00:00
Elena Demikhovsky
44a0c9071a AVX-512: Added "pandn" intrinsics set
by Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235971 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 08:12:42 +00:00
Hans Wennborg
b176a4f2e4 Switch lowering: Take branch weight into account when ordering for fall-through
Previously, the code would try to put a fall-through case last,
even if that meant moving a case with much higher branch weight
further down the chain.

Ordering by branch weight is most important, putting a fall-through
block last is secondary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235942 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 23:35:22 +00:00
Hans Wennborg
84145dcd08 Switch lowering: order bit tests by branch weight.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235912 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 20:21:17 +00:00
Elena Demikhovsky
f8ae1af2e1 AVX-512: added calling conventions for i1 vectors.
Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235889 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 15:11:19 +00:00