Commit Graph

6711 Commits

Hal Finkel
8e1d151abe [DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities)
As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA,
we need to extend the incoming operands so that the resulting node will really
be legal. This is currently enabled only for PowerPC, and it happens to work
there regardless, but this should fix the functionality for any other target
that wishes to use it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225492 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 01:29:29 +00:00
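
For illustration, the fold being fixed amounts to the following minimal sketch (not the actual DAGCombiner code; the helper name is invented and the matching/legality checks are omitted):

    // Sketch only: fold  fadd (fpext (fmul x, y)), z  -->  fma (fpext x), (fpext y), z
    // The point of the fix above: x and y must be re-extended to VT so that the
    // resulting ISD::FMA node has operands of a single (legal) type.
    static SDValue buildFMAFromExtendedMul(SelectionDAG &DAG, SDNode *FAdd, EVT VT) {
      SDValue Ext = FAdd->getOperand(0);   // fpext (fmul x, y)
      SDValue Z   = FAdd->getOperand(1);
      SDValue Mul = Ext.getOperand(0);     // fmul x, y  (narrower type)
      SDLoc DL(FAdd);
      SDValue X = DAG.getNode(ISD::FP_EXTEND, DL, VT, Mul.getOperand(0));
      SDValue Y = DAG.getNode(ISD::FP_EXTEND, DL, VT, Mul.getOperand(1));
      return DAG.getNode(ISD::FMA, DL, VT, X, Y, Z);
    }
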
Hal Finkel
40ddb2ce8f Partial fix to r225380 (More FMA folding opportunities)
As pointed out by Aditya (and Owen), there are two things wrong with this code.
First, it adds patterns which elide FP extends when forming FMAs, and that might
not be profitable on all targets (it belongs behind the pre-existing
aggressive-FMA-formation flag). This is fixed by this change.

Second, the resulting nodes might have operands of different types (the
extensions need to be re-added). That will be fixed in the follow-up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225485 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 00:45:54 +00:00
Elena Demikhovsky
6e8b53da17 Masked Load/Store - fixed a bug in type legalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225441 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 12:29:19 +00:00
Ahmed Bougacha
7fac1d945f [SelectionDAG] Allow targets to specify legality of extloads' result
type (in addition to the memory type).

The *LoadExt* legalization handling used to only have one type, the
memory type.  This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.

However, this isn't always the case.  For instance, on X86, with AVX,
this is legal:
    v4i32 load, zext from v4i8
but this isn't:
    v4i64 load, zext from v4i8
even though v4i64 is (arguably) a legal type even without AVX2.

Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.

Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.

Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior.  The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)

No functional change intended.

Differential Revision: http://reviews.llvm.org/D6532


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 00:51:32 +00:00
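
For illustration, the new (extension kind, result type, memory type) keyed API looks roughly like this fragment from a hypothetical target constructor (the concrete types and action are placeholders, not taken from any in-tree target):

    // Sketch only: mark zero-extending vector loads as Expand for every
    // (result type, memory type) pair this target cannot select directly.
    for (MVT ValVT : {MVT::v4i32, MVT::v4i64})
      for (MVT MemVT : {MVT::v4i8, MVT::v4i16})
        setLoadExtAction(ISD::ZEXTLOAD, ValVT, MemVT, Expand);
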
Olivier Sallenave
033a537a84 More FMA folding opportunities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:54:17 +00:00
Olivier Sallenave
bc2572cac5 Test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225368 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:45:17 +00:00
Philip Reames
28fa9e1e9f Introduce an example statepoint GC strategy
This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generating correct stackmaps, that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.)

Most of the validation logic added here as proof of concept will soon move in to the Verifier.  That move is dependent on http://reviews.llvm.org/D6811

There was discussion in the review thread about addrspace(1) being reserved for something.  I'm going to follow up on a separate llvmdev thread.  If needed, I'll update all the code at once.

Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others.

Differential Revision: http://reviews.llvm.org/D6808



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225365 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:07:50 +00:00
Mehdi Amini
35300ff218 SelectionDAGBuilder: move constant initialization out of loop
No semantic change intended.

Reviewers: resistor

Differential Revision: http://reviews.llvm.org/D6834



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225278 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 18:20:04 +00:00
Craig Topper
9bf73516cb Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225160 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 10:15:49 +00:00
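
The two idioms mentioned look like this in a small, invented example (llvm_unreachable comes from llvm/Support/ErrorHandling.h; the switch itself is illustrative only):

    #include "llvm/Support/ErrorHandling.h"
    #include <cassert>

    static int lowerOp(int Opcode, bool IsLegal) {
      assert(IsLegal && "op must be legalized first");  // condition folded into the assert
      switch (Opcode) {
      case 0: return 1;
      case 1: return 2;
      default:
        // Previously written as: assert(false && "unknown opcode");
        llvm_unreachable("unknown opcode");
      }
    }
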
Alexey Samsonov
c0319dd9c2 Revert "merge consecutive stores of extracted vector elements"
This reverts commit r224611. This change causes crashes
in X86 DAG->DAG Instruction Selection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225031 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 00:40:28 +00:00
Elena Demikhovsky
b31322328a Masked Load/Store - Changed the order of parameters in intrinsics.
No functional changes.
The documentation is coming.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224829 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-25 07:49:20 +00:00
Mehdi Amini
8548c2453f Always assert in DAGCombine and not only when -debug is enabled
Right now, DAG Combine checks the validity of the returned type
only when -debug is given on the command line. However, the test
cases used for validation usually do not use -debug.
An assert build should always check this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-23 18:59:02 +00:00
Michael Kuperstein
1f0ddef593 [DagCombine] Improve DAGCombiner BUILD_VECTOR when it has two sources of elements
This partially fixes PR21943.

For AVX, we go from:

vmovq   (%rsi), %xmm0
vmovq   (%rdi), %xmm1
vpermilps       $-27, %xmm1, %xmm2 ## xmm2 = xmm1[1,1,2,3]
vinsertps       $16, %xmm2, %xmm1, %xmm1 ## xmm1 = xmm1[0],xmm2[0],xmm1[2,3]
vinsertps       $32, %xmm0, %xmm1, %xmm1 ## xmm1 = xmm1[0,1],xmm0[0],xmm1[3]
vpermilps       $-27, %xmm0, %xmm0 ## xmm0 = xmm0[1,1,2,3]
vinsertps       $48, %xmm0, %xmm1, %xmm0 ## xmm0 = xmm1[0,1,2],xmm0[0]

To the expected:

vmovq   (%rdi), %xmm0
vmovhpd (%rsi), %xmm0, %xmm0
retq

Fixing this for AVX2 is still open.

Differential Revision: http://reviews.llvm.org/D6749

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224759 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-23 08:59:45 +00:00
Matt Arsenault
d796cf2e01 Enable (sext x) == C --> x == (trunc C) combine
Extend the existing code which handles this for zext. This makes the combine
more useful for targets with ZeroOrNegativeOneBooleanContent and obsoletes a
custom combine that SI uses for i1 setcc (sext(i1), 0, setne), since the
constant will now be shrunk to i1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224691 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-21 16:48:42 +00:00
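
A worked scalar example of why the constant can be shrunk (plain C++, not DAG code; valid here because sign-extending trunc(C) gives back C):

    #include <cassert>
    #include <cstdint>

    int main() {
      for (int X = 0; X <= 1; ++X) {          // both possible i1 values
        int32_t Sext   = X ? -1 : 0;          // sext i1 -> i32
        bool    Wide   = (Sext == -1);        // (sext x) == C
        bool    Narrow = (X == (-1 & 1));     // x == (trunc C), i.e. x == 1
        assert(Wide == Narrow);
      }
      return 0;
    }
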
Sanjay Patel
3c3cd10928 merge consecutive stores of extracted vector elements
Add a path to DAGCombiner::MergeConsecutiveStores() 
to combine multiple scalar stores when the store operands
are extracted vector elements. This is a partial fix for
PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ).

For the new test case, codegen improves from:

   vmovss  %xmm0, (%rdi)
   vextractps      $1, %xmm0, 4(%rdi)
   vextractps      $2, %xmm0, 8(%rdi)
   vextractps      $3, %xmm0, 12(%rdi)
   vextractf128    $1, %ymm0, %xmm0
   vmovss  %xmm0, 16(%rdi)
   vextractps      $1, %xmm0, 20(%rdi)
   vextractps      $2, %xmm0, 24(%rdi)
   vextractps      $3, %xmm0, 28(%rdi)
   vzeroupper
   retq

To:

   vmovups	%ymm0, (%rdi)
   vzeroupper
   retq

Patch reviewed by Nadav Rotem.

Differential Revision: http://reviews.llvm.org/D6698



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 20:23:41 +00:00
Michael Kuperstein
fd350586f5 [DAGCombine] Slightly improve lowering of BUILD_VECTOR into a shuffle.
This handles the case of a BUILD_VECTOR being constructed out of elements extracted from a vector twice the size of the result vector. Previously this was always scalarized. Now, we try to construct a shuffle node that feeds on extract_subvectors.

This fixes PR15872 and provides a partial fix for PR21711.

Differential Revision: http://reviews.llvm.org/D6678

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224429 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 12:32:17 +00:00
Hans Wennborg
74b5b195fd SelectionDAG switch lowering: use 'unsigned' to count destination popularity
SwitchInst::getNumCases() returns unsigned, so using uint64_t to count cases
seems unnecessary.

Also fix a missing CHECK in the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224393 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 23:41:59 +00:00
Sanjay Patel
128b2c383c merge consecutive loads that are offset from a base address
SelectionDAG::isConsecutiveLoad() was not detecting consecutive loads
when the first load was offset from a base address. 

This patch recognizes that pattern and subtracts the offset before comparing
the second load to see if it is consecutive.

The codegen change in the new test case improves from:

vmovsd	32(%rdi), %xmm0
vmovsd	48(%rdi), %xmm1 
vmovhpd	56(%rdi), %xmm1, %xmm1
vmovhpd	40(%rdi), %xmm0, %xmm0
vinsertf128	$1, %xmm1, %ymm0, %ymm0

To:

vmovups	32(%rdi), %ymm0

An existing test case is also improved from:

vmovsd	(%rdi), %xmm0
vmovsd	16(%rdi), %xmm1
vmovsd	24(%rdi), %xmm2
vunpcklpd	%xmm2, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
vmovhpd	8(%rdi), %xmm1, %xmm3

To:

vmovsd	(%rdi), %xmm0
vmovsd	16(%rdi), %xmm1
vmovhpd	24(%rdi), %xmm0, %xmm0
vmovhpd	8(%rdi), %xmm1, %xmm1

This patch fixes PR21771 ( http://llvm.org/bugs/show_bug.cgi?id=21771 ).

Differential Revision: http://reviews.llvm.org/D6642



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224379 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 21:57:18 +00:00
Michael Ilseman
9ecdca9115 Silence more static analyzer warnings.
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 18:48:43 +00:00
Matt Arsenault
6e6318f148 Add target hook for whether it is profitable to reduce load widths
Add an option to disable the optimization that shrinks loads of truncated
larger types into smaller-type loads. On SI this optimization prevents using
scalar load instructions in some cases, since there are no scalar extloads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224084 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 00:00:24 +00:00
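
A hedged sketch of how a target might use such a hook, assuming it has roughly this shape (the class name and body are illustrative, not the actual SI code):

    // Sketch only: opt out of shrinking truncated loads on a target that has
    // no scalar extload instructions.
    bool MyTargetLowering::shouldReduceLoadWidth(SDNode *Load,
                                                 ISD::LoadExtType ExtTy,
                                                 EVT NewVT) const {
      return false;  // keep the original, wider load
    }
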
Duncan P. N. Exon Smith
dad20b2ae2 IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532.  Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.

I have a follow-up patch prepared for `clang`.  If this breaks other
sub-projects, I apologize in advance :(.  Help me compile it on Darwin and
I'll try to fix it.  FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.

This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.

Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`.  It is distinct from
    the `Value` class hierarchy.  It is typeless -- i.e., instances do
    *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
    replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph
    construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the
    result of `MDNode::getTemporary()` -- it maintains a side map of its
    uses and can RAUW itself.  Once the forward declarations are fully
    resolved RAUW support is dropped on the ground.  This means that
    uniquing collisions on changing operands cause nodes to become
    "distinct".  (This already happened fairly commonly, whenever an
    operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles,
    you need to call `MDNode::resolveCycles()` on each node (or on a
    top-level node that somehow references all of the nodes).  Also,
    don't do that.  Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known
    to be, e.g., `ConstantInt`, takes three steps: first, cast from
    `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
    third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
    metadata schema owners transition away from using `Constant`s when
    the type isn't important (and they don't care about referring to
    `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst`
    namespace that matches semantics with the old code, in order to
    avoid adding the error-prone three-step equivalent to every call
    site.  If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`.  This is a
    subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a
    `LocalAsMetadata`, which is a bridged form of non-`Constant` values
    like `Argument` and `Instruction`.  It can also refer to any other
    `Metadata` subclass.

(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223802 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 18:38:53 +00:00
Owen Anderson
59bf8e81f3 Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223760 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 06:50:39 +00:00
Justin Bogner
70b0751080 InstrProf: An intrinsic and lowering for instrumentation based profiling
Introduce the ``llvm.instrprof_increment`` intrinsic and the
``-instrprof`` pass. These provide the infrastructure for writing
counters for profiling, as in clang's ``-fprofile-instr-generate``.

The implementation of the instrprof pass is ported directly out of the
CodeGenPGO classes in clang, and with the followup in clang that rips
that code out to use these new intrinsics this ends up being NFC.

Doing the instrumentation this way opens some doors in terms of
improving the counter performance. For example, this will make it
simple to experiment with alternate lowering strategies, and allows us
to try handling profiling specially in some optimizations if we want
to.

Finally, this drastically simplifies the frontend and puts all of the
lowering logic in one place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223672 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-08 18:02:35 +00:00
Hans Wennborg
a421ac689f SelectionDAG switch lowering: Replace unreachable default with most popular case.
This can significantly reduce the size of the switch, allowing for more
efficient lowering.

I also worked with the idea of exploiting unreachable defaults by
omitting the range check for jump tables, but always ended up with a
non-negligible binary size increase. It might be worth looking into some more.

SimplifyCFG currently does this transformation, but I'm working towards changing
that so we can optimize harder based on unreachable defaults.

Differential Revision: http://reviews.llvm.org/D6510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223566 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 01:28:50 +00:00
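
The source pattern this targets looks like the following (names invented): when the default is provably unreachable, lowering can treat the most frequent case as the implicit fallback, so that case needs no comparison of its own.

    enum Kind { A, B, C };

    int classify(Kind K) {
      switch (K) {             // covers every Kind value
      case A: return 10;
      case B: return 20;       // suppose B is by far the most common case
      case C: return 30;
      }
      __builtin_unreachable(); // the default is unreachable
    }
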
Simon Pilgrim
94590ca4cf [InstCombine] Minor optimization for bswap with binary ops
Added instcombine optimizations for BSWAP with AND/OR/XOR ops:

OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) )
OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) )

Since it's just a one-liner, I've also added BSWAP to the DAGCombiner equivalent as well:

fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y))

Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone.

Differential Revision: http://reviews.llvm.org/D6407



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:44:01 +00:00
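
A quick check of the identities behind the fold, in plain C++ using the GCC/Clang __builtin_bswap32 builtin (the values are arbitrary):

    #include <cassert>
    #include <cstdint>

    int main() {
      uint32_t X = 0x12345678u, Y = 0x0A0B0C0Du, C = 0x000000FFu;
      // OP( BSWAP(x), BSWAP(y) ) == BSWAP( OP(x, y) )  for AND/OR/XOR
      assert((__builtin_bswap32(X) & __builtin_bswap32(Y)) == __builtin_bswap32(X & Y));
      // OP( BSWAP(x), CONSTANT ) == BSWAP( OP(x, BSWAP(CONSTANT)) )
      assert((__builtin_bswap32(X) | C) == __builtin_bswap32(X | __builtin_bswap32(C)));
      return 0;
    }
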
Elena Demikhovsky
73ae1df82c Masked Load / Store Intrinsics - the CodeGen part.
I'm recommiting the codegen part of the patch.
The vectorizer part will be send to review again.

Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223348 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:40:44 +00:00
Hal Finkel
1855b261db [PowerPC] Implement readcyclecounter for PPC32
We've long supported readcyclecounter on PPC64, but it is easier there (the
read of the 64-bit time-base register can be accomplished via a single
instruction). This now provides an implementation for PPC32 as well. On PPC32,
the time-base register is still 64 bits, but can only be read 32 bits at a time
via two separate SPRs. The ISA manual explains how to do this properly (it
involves re-reading the upper bits and looping if the counter has wrapped while
being read).

This requires PPC to implement a custom integer splitting legalization for the
READCYCLECOUNTER node, turning it into a target-specific SDAG node, which then
gets turned into a pseudo-instruction, which is then expanded to the necessary
sequence (which has three SPR reads, the comparison and the branch).

Thanks to Paul Hargrove for pointing out to me that this was still unimplemented.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 22:01:00 +00:00
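
The read-upper / read-lower / re-read-upper loop described above, as a C++ sketch (readTBU/readTBL stand in for the two SPR reads and are not real LLVM or PPC APIs):

    #include <cstdint>

    extern uint32_t readTBU();   // placeholder for reading the upper time-base SPR
    extern uint32_t readTBL();   // placeholder for reading the lower time-base SPR

    uint64_t readTimeBase32() {
      uint32_t Hi, Lo, Hi2;
      do {
        Hi  = readTBU();         // read upper 32 bits
        Lo  = readTBL();         // read lower 32 bits
        Hi2 = readTBU();         // re-read upper 32 bits
      } while (Hi != Hi2);       // retry if the counter wrapped mid-read
      return (uint64_t(Hi) << 32) | Lo;
    }
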
Philip Reames
301256d436 Restructure some assertion checking based on post commit feedback by Aaron and Tom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223150 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 21:01:48 +00:00
Philip Reames
5eccf7b3df Appease a build bot complaining about an unused variable that's used in an assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223142 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 19:28:57 +00:00
Philip Reames
d021bb8003 [Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder
This is the third patch in a small series.  It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085).  The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them.  

With this change, gc.statepoints should be functionally complete.  The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now.

I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated.  The statepoint lowering code is split into its own files and anyone not working on the statepoint support itself should be able to ignore it.

During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straightforward, and matches closely the implementations of the patchpoint intrinsics.  Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints.  Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack.  The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases.

In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator.  In principle, we shouldn't need to eagerly spill at all.  The register allocator should do any spilling required and the statepoint should simply record that fact.  Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stopgap measure.

Reviewed by: atrick, ributzka





git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223137 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-02 18:50:36 +00:00
Hans Wennborg
c9605b6067 Revert r223049, r223050 and r223051 while investigating test failures.
I didn't foresee this affecting the Clang test suite :/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223054 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 17:36:43 +00:00
Hans Wennborg
e6f4d335d7 SelectionDAG switch lowering: Replace unreachable default with most popular case.
This can significantly reduce the size of the switch, allowing for more
efficient lowering.

I also worked with the idea of exploiting unreachable defaults by
omitting the range check for jump tables, but always ended up with a
non-negligible binary size increase. It might be worth looking into some more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223049 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 17:08:32 +00:00
Akira Hatanaka
ca780b4578 [stack protector] Set edge weights for newly created basic blocks.
This commit fixes a bug in stack protector pass where edge weights were not set
when new basic blocks were added to lists of successor basic blocks.

Differential Revision: http://reviews.llvm.org/D5766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222987 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-01 04:27:03 +00:00
Hans Wennborg
809b955259 Switch lowering: reformat some for loops etc. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222962 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-29 21:24:12 +00:00
Hans Wennborg
5c6c5e23bd Switch lowering: Fix broken 'Figure out which block is next' code
This doesn't seem to have worked in a long time, but other optimizations
would clean it up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222961 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-29 21:17:05 +00:00
Duncan P. N. Exon Smith
54786a0936 Revert "Masked Vector Load and Store Intrinsics."
This reverts commit r222632 (and follow-up r222636), which caused a host
of LNT failures on an internal bot.  I'll respond to the commit on the
list with a reproduction of one of the failures.

Conflicts:
	lib/Target/X86/X86TargetTransformInfo.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222936 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-28 21:29:14 +00:00
Elena Demikhovsky
30f747478a Converted back to Unix format (after my last commit 222632)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222636 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-23 15:21:53 +00:00
Elena Demikhovsky
ae1ae2c3a1 Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-23 08:07:43 +00:00
Sanjay Patel
e8f6119214 Don't repeat class/function/variable names in comments. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 18:58:38 +00:00
Sanjay Patel
d1510d968c Less space; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222546 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 18:05:59 +00:00
Andrea Di Biagio
607099b697 [DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero.
Before this patch, the DAGCombiner only tried to convert build_vector dag nodes
into shuffles if all operands were either extract_vector_elt or undef.

This patch improves that logic and teaches the DAGCombiner how to deal with
build_vector dag nodes where one or more operands are zero. A build_vector
dag node with some zero operands is turned into a shuffle only if the resulting
shuffle mask is legal for the target.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 14:32:06 +00:00
Andrea Di Biagio
04fab61b35 [DAG] Refactor the shuffle combining logic in DAGCombiner. NFC.
This patch simplifies the logic that combines a pair of shuffle nodes into
a single shuffle if there is a legal mask. Also added comments to better
describe the algorithm. No functional change intended.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222522 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 11:33:07 +00:00
Hao Liu
09ad94decb DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor into FMULs by the reciprocal.
E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip)

A hook is added to allow the target to control whether it needs to do such a combine.

Reviewed in http://reviews.llvm.org/D6334


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 06:39:58 +00:00
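
In source terms the rewrite is the following (plain C++; it is only valid under relaxed floating-point assumptions, which is why the target hook gates it):

    // Before: two divisions by the same divisor.
    void scale2(double A, double B, double D, double *RA, double *RB) {
      *RA = A / D;
      *RB = B / D;
    }

    // After the combine (conceptually): one division, two multiplications.
    void scale2Recip(double A, double B, double D, double *RA, double *RB) {
      double Recip = 1.0 / D;   // computed once
      *RA = A * Recip;
      *RB = B * Recip;
    }
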
Simon Pilgrim
a6943fff90 [X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2
This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain.

I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the PSLLDQ/PSRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr.

Differential Revision: http://reviews.llvm.org/D5699



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222340 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 10:06:49 +00:00
David Blaikie
5401ba7099 Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This led to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222334 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 07:49:26 +00:00
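
The std-style return value that insert now converges on can be seen with std::set, the model the commit cites (SetVector, SmallSet, and SmallPtrSet now behave analogously):

    #include <cassert>
    #include <set>

    int main() {
      std::set<int> S;
      auto R1 = S.insert(42);    // pair<iterator, bool>
      assert(R1.second);         // true: newly inserted
      auto R2 = S.insert(42);
      assert(!R2.second);        // false: already present
      assert(*R2.first == 42);   // iterator points at the existing element
      return 0;
    }
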
Owen Anderson
b39e517168 Fix an incorrect chain operand when expanding INSERT_VECTOR operations through the stack.
Patch by Daniil Troshkov!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222254 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 20:50:19 +00:00
Oliver Stannard
d9d2703b71 Fix optimisations of SELECT_CC which assumed result is boolean
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.

This is a follow-up to D6210/r221693.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222123 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 10:49:31 +00:00
Craig Topper
a5babc8a31 Move register class name strings to a single array in MCRegisterInfo to reduce static table size and number of relocation entries.
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 05:50:14 +00:00
Craig Topper
cac2f22b32 Replace a couple asserts with static_asserts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222114 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 00:26:50 +00:00
Craig Topper
56391ddf5d Convert some EVTs to MVTs where only a SimpleValueType is needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222109 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-16 21:17:18 +00:00