Commit Graph

73258 Commits

Author SHA1 Message Date
David Blaikie
2c453a0c03 DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot.
No functional change. Pre-emptive refactoring before I start pushing
some of this subprogram creation down into DWARFCompileUnit so I can
build different subprograms in the skeleton unit from the dwo unit for
adding -gmlt-like data to the skeleton.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218713 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 22:32:49 +00:00
Jingyue Wu
9cd9e4bb2c [SimplifyCFG] threshold for folding branches with common destination
Summary:
This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.

The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:

  __global__ void foo(int a, int b, int c, int d, int e, int n,
                      const int *input, int *output) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
    *output = sum;
  }

The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "(i | c) ^ d > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:

  sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];

Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.

We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!

Test Plan: Added one test case to check the threshold is in effect

Reviewers: nadav, eliben, meheff, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D5529

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218711 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 22:23:38 +00:00
David Blaikie
76ff19ffa7 Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil.
r218129 omits DW_TAG_subprograms which have no inlined subroutines when
emitting -gmlt data. This makes -gmlt very low cost for -O0 builds.

Darwin's dsymutil reasonably considers a CU empty if it has no
subprograms (which occurs with the above optimization in -O0 programs
without any force_inline function calls) and drops the line table, CU,
and everything in this situation, making backtraces impossible.

Until dsymutil is modified to account for this, disable this
optimization on Darwin to preserve the desired functionality.
(see r218545, which should be reverted after this patch, for other
discussion/details)

Footnote:
In the long term, it doesn't look like this scheme (of simplified debug
info to describe inlining to enable backtracing) is tenable, it is far
too size inefficient for optimized code (the DW_TAG_inlined_subprograms,
even once compressed, are nearly twice as large as the line table
itself (also compressed)) and we'll be considering things like Cary's
two level line table proposal to encode all this information directly in
the line table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218702 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 21:28:32 +00:00
Sanjay Patel
73a335f7f6 Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 20:44:23 +00:00
Sanjay Patel
cafc85bf1e Split the estimate() interface into separate functions for each type. NFC.
It was hacky to use an opcode as a switch because it won't always match
(rsqrte != sqrte), and it looks like we'll need to add more special casing
per arch than I had hoped for. Eg, x86 will prefer a different NR estimate
implementation. ARM will want to use it's 'step' instructions. There also
don't appear to be any new estimate instructions in any arch in a long,
long time. Altivec vloge and vexpte may have been the first and last in
that field...



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 20:28:48 +00:00
Juergen Ributzka
9952c922c2 Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).

Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:59:35 +00:00
Matt Arsenault
28233d3a63 R600/SI: Fix printing of clamp and omod
No tests for omod since nothing uses it yet, but
this should get rid of the remaining annoying trailing
zeros after some instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218692 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:49:48 +00:00
Matt Arsenault
532a5c7dc6 R600/SI: Update VOP3b to not include obsolete operands
abs / neg are now part of the srcN_modifiers operands

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218691 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:49:43 +00:00
Bradley Smith
95b3e168c5 Extend C disassembler API to allow specifying target features
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218682 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 16:31:40 +00:00
Reed Kotler
8a6f79e58d Add numeric extend, trunctate to mips fast-isel
Summary:
 Add numeric extend, trunctate to mips fast-isel

 Reactivates D4827



Test Plan:
fpext.ll
loadstoreconv.ll

Reviewers: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D5251

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218681 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 16:30:13 +00:00
Tom Coxon
8a23890385 [AArch64] Remove unnecessary whitespace. (Test commit)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 16:23:16 +00:00
Andrea Di Biagio
9e6df85d39 [DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle.
Currently, the DAG Combiner only tries to convert type-legal build_vector nodes
into shuffles. This patch simply moves the logic that checks if a
build_vector has a legal value type up before we even start analyzing the
operands. This allows to early exit immediately from method
'visitBUILD_VECTOR' if the node type is known to be illegal.

No functional change intended.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 15:30:22 +00:00
Alex Lorenz
38c59de6b1 llvm-cov: Use the number of executed functions for the function coverage metric.
This commit fixes llvm-cov's function coverage metric by using the number of executed functions instead of the number of fully covered functions.

Differential Revision: http://reviews.llvm.org/D5196


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218672 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 12:45:13 +00:00
Lorenzo Martignoni
f49592dddc Introduce support for custom wrappers for vararg functions.
Differential Revision: http://reviews.llvm.org/D5412



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218671 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 12:33:16 +00:00
Robert Khasanov
8acdc5232d [AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VCMPGT{BWDQ}.
Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218670 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 12:15:52 +00:00
Robert Khasanov
175ff01f0f [AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ}
Fixed lowering of this intrinsics in case when mask is v2i1 and v4i1.
Now cmp intrinsics lower in the following way:
 (i8 (int_x86_avx512_mask_pcmpeq_q_128
             (v2i64 %a), (v2i64 %b), (i8 %mask))) ->
 (i8 (bitcast
   (v8i1 (insert_subvector undef,
           (v2i1 (and (PCMPEQM %a, %b),
                      (extract_subvector
                         (v8i1 (bitcast %mask)), 0))), 0))))


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:41:54 +00:00
Robert Khasanov
cfa5724d50 [AVX512] Added intrinsics for VPCMPEQB and VPCMPEQW.
Added new operand type for intrinsics (IIT_V64)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218668 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:32:22 +00:00
Robert Khasanov
58da66b2bf [AVX512] Enabled intrinsics for VPCMPEQD and VPCMPEQQ.
Added CMP_MASK intrinsic type


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218667 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:19:50 +00:00
Job Noorman
deb16c9eac Make sure aggregates are properly alligned on MSP430.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:15:44 +00:00
Chad Rosier
ecea7ba518 [IndVarSimplify] Widen loop unsigned compares.
This patch extends r217953 to handle unsigned comparison.
Phabricator revision: http://reviews.llvm.org/D5526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 03:17:42 +00:00
Chandler Carruth
4abb04a65c [x86] Revert r218588, r218589, and r218600. These patches were pursuing
a flawed direction and causing miscompiles. Read on for details.

Fundamentally, the premise of this patch series was to map
VECTOR_SHUFFLE DAG nodes into VSELECT DAG nodes for all blends because
we are going to *have* to lower to VSELECT nodes for some blends to
trigger the instruction selection patterns of variable blend
instructions. This doesn't actually work out so well.

In order to match performance with the existing VECTOR_SHUFFLE
lowering code, we would need to re-slice the blend in order to fit it
into either the integer or floating point blends available on the ISA.
When coming from VECTOR_SHUFFLE (or other vNi1 style VSELECT sources)
this works well because the X86 backend ensures that these types of
operands to VSELECT get sign extended into '-1' and '0' for true and
false, allowing us to re-slice the bits in whatever granularity without
changing semantics.

However, if the VSELECT condition comes from some other source, for
example code lowering vector comparisons, it will likely only have the
required bit set -- the high bit. We can't blindly slice up this style
of VSELECT. Reid found some code using Halide that triggers this and I'm
hopeful to eventually get a test case, but I don't need it to understand
why this is A Bad Idea.

There is another aspect that makes this approach flawed. When in
VECTOR_SHUFFLE form, we have very distilled information that represents
the *constant* blend mask. Converting back to a VSELECT form actually
can lose this information, and so I think now that it is better to treat
this as VECTOR_SHUFFLE until the very last moment and only use VSELECT
nodes for instruction selection purposes.

My plan is to:
1) Clean up and formalize the target pre-legalization DAG combine that
   converts a VSELECT with a constant condition operand into
   a VECTOR_SHUFFLE.
2) Remove any fancy lowering from VSELECT during *legalization* relying
   entirely on the DAG combine to catch cases where we can match to an
   immediate-controlled blend instruction.

One additional step that I'm not planning on but would be interested in
others' opinions on: we could add an X86ISD::VSELECT or X86ISD::BLENDV
which encodes a fully legalized VSELECT node. Then it would be easy to
write isel patterns only in terms of this to ensure VECTOR_SHUFFLE
legalization only ever forms the fully legalized construct and we can't
cycle between it and VSELECT combining.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218658 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 02:52:28 +00:00
Matt Arsenault
108eb07f61 Fix missing C++ mode comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218654 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 01:05:27 +00:00
Juergen Ributzka
a0af4b0271 [FastISel][AArch64] Fold sign-/zero-extends into the load instruction.
The sign-/zero-extension of the loaded value can be performed by the memory
instruction for free. If the result of the load has only one use and the use is
a sign-/zero-extend, then we emit the proper load instruction. The extend is
only a register copy and will be optimized away later on.

Other instructions that consume the sign-/zero-extended value are also made
aware of this fact, so they don't fold the extend too.

This fixes rdar://problem/18495928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 00:49:58 +00:00
Juergen Ributzka
e8a9706eda [FastISel][AArch64] Factor out scale factor calculation. NFC.
Factor out the code that determines the implicit scale factor of memory
operations for a given value type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 00:49:54 +00:00
Eric Christopher
11bea1bff3 Simplify conditional.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 23:31:13 +00:00
Adam Nemet
e3d2fcce41 [AVX512] Use X86VectorVTInfo in the masking helper classes and the FMAs
No functionality change.

Makes the code more compact (see the FMA part).

This needs a new type attribute MemOpFrag in X86VectorVTInfo.  For now I only
defined this in the simple cases.  See the commment before the attribute.

Diff of X86.td.expanded before and after is empty except for the appearance of
the new attribute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 22:54:41 +00:00
Hans Wennborg
4edcbaec90 WinCOFFObjectWriter: optimize the string table for common suffices
This is a follow-up from r207670 which did the same for ELF.

Differential Revision: http://reviews.llvm.org/D5530

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218636 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 22:43:20 +00:00
Eric Christopher
6a2169eb6f Add soft-float to the key for the subtarget lookup in the TargetMachine
map, this makes sure that we can compile the same code for two different
ABIs (hard and soft float) in the same module.

Update one testcase accordingly (and fix some confusing naming) and
add a new testcase as well with the ordering swapped which would
highlight the problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:57:54 +00:00
Eric Christopher
200f3764bf Fix spelling and reflow comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218631 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:57:52 +00:00
Dave Estes
f657899174 [AArch64] Refines the Cortex-A57 Machine Model
Primarily refines all of the instructions with accurate latency
and micro-op information. Refinements largely focus on the NEON
instructions.

Additionally, a few advanced features are modeled, including
forwarding for MAC instructions and hazards for floating point SQRT
and DIV.

Lastly, the issue-width is reduced to three so that the scheduler
will better accommodate the narrower decode and dispatch width.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:27:36 +00:00
David Blaikie
03b4667e14 Unit test r218187, changing RTDyldMemoryManager::getSymbolAddress's behavior favor mangled lookup over unmangled lookup.
The contract of this function seems problematic (fallback in either
direction seems like it could produce bugs in one client or another),
but here's some tests for its current behavior, at least. See the
commit/review thread of r218187 for more discussion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218626 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:25:13 +00:00
Matt Arsenault
0db97bb9b4 Fix include order
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 15:53:15 +00:00
Matt Arsenault
1bcadc9b5c R600/SI: Fix hardcoded values for modifiers.
Move enums to SIDefines.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 15:50:26 +00:00
Matt Arsenault
49cbc1891b R600/SI: Also fix fsub + fadd a, a to mad combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218609 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 14:59:38 +00:00
Matt Arsenault
a5f45d5444 R600/SI: Fix using mad with multiplies by 2
These turn into fadds, so combine them into the target
mad node.

fadd (fadd (a, a), b) -> mad 2.0, a, b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218608 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 14:59:34 +00:00
Chad Rosier
ea64dce261 [AArch64] Improve cost model to handle sdiv by a pow-of-two.
This patch improves the target-specific cost model to better handle signed
division by a power of two. The immediate result is that this enables the SLP
vectorizer to do a better job.

http://reviews.llvm.org/D5469
PR20714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218607 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 13:59:31 +00:00
Frederic Riss
97552d7c1c Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single DWARFUnitSection.
There will be multiple TypeUnits in an unlinked object that will be extracted
from different sections. Now that we have DWARFUnitSection that is supposed
to represent an input section, we need a DWARFUnitSection<TypeUnit> per
input .debug_types section.

Once this is done, the interface is homogenous and we can move the Section
parsing code into DWARFUnitSection.

This is a respin of r218513 that got reverted because it broke some builders.
This new version features an explicit move constructor for the DWARFUnitSection
class to workaround compilers unable to generate correct C++11 default
constructors.

Reviewers: samsonov, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218606 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 13:56:39 +00:00
Kevin Qin
dbaeb6e7cb Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling will create a prologue to execute the extra
iterations which is can't divided by the unroll factor. It
generates an if-then-else sequence to jump into a factor -1
times unrolled loop body, like

    extraiters = tripcount % loopfactor
    if (extraiters == 0) jump Loop:
    if (extraiters == loopfactor) jump L1
    if (extraiters == loopfactor-1) jump L2
    ...
    L1:  LoopBody;
    L2:  LoopBody;
    ...
    if tripcount < loopfactor jump End
    Loop:
    ...
    End:

It means if the unroll factor is 4, the loop body will be 7
times unrolled, 3 are in loop prologue, and 4 are in the loop.
This commit is to use a loop to execute the extra iterations
in prologue, like

        extraiters = tripcount % loopfactor
        if (extraiters == 0) jump Loop:
        else jump Prol
 Prol:  LoopBody;
        extraiters -= 1                 // Omitted if unroll factor is 2.
        if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
        if (tripcount < loopfactor) jump End
 Loop:
 ...
 End:

Then when unroll factor is 4, the loop body will be copied by
only 5 times, 1 in the prologue loop, 4 in the original loop.
And if the unroll factor is 2, new loop won't be created, just
as the original solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218604 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 11:15:00 +00:00
Oliver Stannard
017c6111a8 [Thumb2] ldrexd and strexd are not defined on v7M
The Thumb2 ldrexd and strexd instructions are not defined for
M-class architectures.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218603 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 10:57:29 +00:00
Chandler Carruth
8ac2f142a8 [x86] Make the new vector shuffle lowering lower blends as VSELECT
nodes, and rely exclusively on its logic. This removes a ton of
duplication from the blend lowering and centralizes it in one place.

One downside is that it requires a bunch of hacks to make this work with
the current legalization framework. We have to manually speculate one
aspect of legalizing VSELECT nodes to get everything to work nicely
because the existing legalization framework isn't *actually* bottom-up.

The other grossness is that we somewhat duplicate the analysis of
constant blends. I'm on the fence here. If reviewers thing this would
look better with VSELECT when it has constant operands dumping over tho
VECTOR_SHUFFLE, we could go that way. But it would be a substantial
change because currently all of the actual blend instructions are
matched via patterns in the TD files based around VSELECT nodes (despite
them not being perfect fits for that). Suggestions welcome, but at least
this removes the rampant duplication in the backend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218600 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 09:57:07 +00:00
Jyoti Allur
bc88cfc351 Remove dead code from DIBuilder
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218593 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 06:32:54 +00:00
Chandler Carruth
d23f1883d3 [x86] Delete a bunch of really bad and totally unnecessary code in the
X86 target-specific DAG combining that tried to convert VSELECT nodes
into VECTOR_SHUFFLE nodes that it "knew" would lower into
immediate-controlled blend nodes.

Turns out, we have perfectly good lowering of all these VSELECT nodes,
and indeed that lowering already knows how to handle lowering through
BLENDI to immediate-controlled blend nodes. The code just wasn't getting
used much because this thing forced the world to go through the vector
shuffle lowering. Yuck.

This also exposes that I was too aggressive in avoiding domain crossing
in v218588 with that lowering -- when the other option is to expand into
two 128-bit vectors, it is worth domain crossing. Restore that behavior
now that we have nice tests covering it.

The test updates here fall into two camps. One is where previously we
ended up with an unsigned encoding of the blend operand and now we get
a signed encoding. In most of those places there were elaborate comments
explaining exactly what these operands really mean. Rather than that,
just switch these tests to use the nicely decoded comments that make it
obvious that the final shuffle matches.

The other updates are just removing pointless domain crossing by
blending integers with PBLENDW rather than BLENDPS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218589 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 02:01:20 +00:00
Chandler Carruth
3589550b3e [x86] Refactor all of the VSELECT-as-blend lowering code to avoid domain
crossing and generally work more like the blend emission code in the new
vector shuffle lowering.

My goal is to have the new vector shuffle lowering just produce VSELECT
nodes that are either matched here to BLENDI or are legal and matched in
the .td files to specific blend instructions. That seems much cleaner as
there are other ways to produce a VSELECT anyways. =]

No *observable* functionality changed yet, mostly because this code
appears to be near-dead. The behavior of this lowering routine did
change though. This code being mostly dead and untestable will change
with my next commit which will also point some new tests at it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 01:32:54 +00:00
Chandler Carruth
b3cf6a65d6 [x86] Improve naming and comments for VSELECT lowering.
No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218586 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 00:51:58 +00:00
Chandler Carruth
8e93ce1780 [x86] Add the dispatch skeleton to the new vector shuffle lowering for
AVX-512.

There is no interesting logic yet. Everything ends up eventually
delegating to the generic code to split the vector and shuffle the
halves. Interestingly, that logic does a significantly better job of
lowering all of these types than the generic vector expansion code does.
Mostly, it lets most of the cases fall back to nice AVX2 code rather
than all the way back to SSE code paths.

Step 2 of basic AVX-512 support in the new vector shuffle lowering. Next
up will be to incrementally add direct support for the basic instruction
set to each type (adding tests first).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218585 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 00:37:27 +00:00
Chandler Carruth
3bc1ba672c [x86] Make the split-and-lower routine fully generic by relaxing the
assertion, making the name generic, and improving the documentation.

Step 1 in adding very primitive support for AVX-512. No functionality
changed yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218584 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 00:21:49 +00:00
Chandler Carruth
b61dfec824 [x86] Teach the new vector shuffle lowering to fall back on AVX-512
vectors.

Someone will need to build the AVX512 lowering, which should follow
AVX1 and AVX2 *very* closely for AVX512F and AVX512BW resp. I've added
a dummy test which is a port of the v8f32 and v8i32 tests from AVX and
AVX2 to v8f64 and v8i64 tests for AVX512F and AVX512BW. Hopefully this
is enough information for someone to implement proper lowering here. If
not, I'll be happy to help, but right now the AVX-512 support isn't
a priority for me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218583 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 23:53:10 +00:00
Chandler Carruth
4f4280469c [x86] Fix the new vector shuffle lowering's use of VSELECT for AVX2
lowerings.

This was hopelessly broken. First, the x86 backend wants '-1' to be the
element value representing true in a boolean vector, and second the
operand order for VSELECT is backwards from the actual x86 instructions.
To make matters worse, the backend is just using '-1' as the true value
to get the high bit to be set. It doesn't actually symbolically map the
'-1' to anything. But on x86 this isn't quite how it works: there *only*
the high bit is relevant. As a consequence weird non-'-1' values like
0x80 actually "work" once you flip the operands to be backwards.

Anyways, thanks to Hal for helping me sort out what these *should* be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 23:23:55 +00:00
Matt Arsenault
0df40b4969 Add MachineOperand::ChangeToFPImmediate and setFPImm
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218579 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 19:24:59 +00:00
Chandler Carruth
3f40848670 [x86] Fix a really silly bug that I introduced fixing another bug in the
new vector shuffle target DAG combines -- it helps to actually test for
the value you want rather than just using an integer in a boolean
context.

Have I mentioned that I loathe implicit conversions recently? :: sigh ::

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218576 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 06:11:04 +00:00
Chandler Carruth
21b69296fb [x86] Fix yet another bug in the new vector shuffle lowering's handling
of widening masks.

We can't widen a zeroing mask unless both elements that would be merged
are either zeroed or undef. This is the only way to widen a mask if it
has a zeroed element.

Also clean up the code here by ordering the checks in a more logical way
and by using the symoblic values for undef and zero. I'm actually torn
on using the symbolic values because the existing code is littered with
the assumption that -1 is undef, and moreover that entries '< 0' are the
special entries. While that works with the values given to these
constants, using the symbolic constants actually makes it a bit more
opaque why this is the case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218575 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 03:30:25 +00:00
Hans Wennborg
e05d3b921f WinCOFFObjectWriter.cpp: make write_uint32_le more efficient
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218574 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 00:22:27 +00:00
James Molloy
aada52189e [AArch64] Redundant store instructions should be removed as dead code
If there is a store followed by a store with the same value to the same location, then the store is dead/noop. It can be removed.

This problem is found in spec2006-197.parser.

For example,
  stur    w10, [x11, #-4]
  stur    w10, [x11, #-4]
Then one of the two stur instructions can be removed.

Patch by David Xu!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218569 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 17:02:54 +00:00
Yaron Keren
af34c3a995 Fix llvm::huge_valf multiple initializations with Visual C++.
llvm::huge_valf is defined in a header file, so it is initialized
multiple times in every compiled unit upon program startup.

With non-VC compilers huge_valf is set to a HUGE_VALF which the
compiler can probably optimize out.

With VC numeric_limits<float>::infinity() does not return a number
but a runtime structure member which therotically may change 
between calls so the compiler does not optimize out the 
initialization and it happens many times. It can be easily seen by 
placing a breakpoint on the initialization line.

This patch moves llvm::huge_valf initialization to a source file
instead of the header.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 14:41:29 +00:00
Chandler Carruth
b66b0cf2eb [x86] Fix yet another issue with widening vector shuffle elements.
I spotted this by inspection when debugging something else, so I have no
test case what-so-ever, and am not even sure it is possible to
realistically trigger the bug. But this is what was intended here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218565 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 08:40:33 +00:00
Chandler Carruth
72c3b07dfd [x86] Fix terrible bugs everywhere in the new vector shuffle lowering
and in the target shuffle combining when trying to widen vector
elements.

Previously only one of these was correct, and we didn't correctly
propagate zeroing target shuffle masks (which have a different sentinel
value from undef in non- target shuffle masks now). This isn't just
a missed optimization, this caused us to drop zeroing shuffles on the
floor and miscompile code. The added test case is one example of that.

There are other fixes to the test suite as a consequence of this as well
as restoring the undef elements in some of the masks that were lost when
I brought sanity to the actual *value* of the undef and zero sentinels.

I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW
combining code, but that code really needs to go. It was a nice initial
attempt, but it isn't very principled and the recursive shuffle combiner
is much more powerful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218562 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 04:42:44 +00:00
Chandler Carruth
8470b5b812 [x86] Flip the sentinel values used in the target shuffle mask decoding
to significantly more sane sentinels. Notably, everywhere else in the
backend's representation of shuffles uses '-1' to represent undef. The
target shuffle masks really shouldn't diverge from that, especially as
in a few places they are manipulated by shared code.

This causes us to lose some undef lanes in various test masks. I want to
get these back, but technically it isn't invalid and there are a *lot*
of bugs here so I want to try to establish a saner baseline for fixing
some of the bugs by aligning the specific senitnel values used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218561 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 04:42:39 +00:00
Sanjay Patel
676af35b38 Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2).
This is purely refactoring. No functional changes intended. PowerPC is the only target
that is currently using this interface.

The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this:

z = y / sqrt(x)

into:

z = y * rsqrte(x)

And:

z = y / x

into:

z = y * rcpe(x)

using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .

There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction
along with the number of refinement steps needed to make the estimate usable.

Differential Revision: http://reviews.llvm.org/D5484



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218553 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 23:01:47 +00:00
David Majnemer
01ea611601 Object: BSS/virtual sections don't have contents
Users of getSectionContents shouldn't try to pass in BSS or virtual
sections.  In all instances, this is a bug in the code calling this
routine.

N.B. Some COFF implementations (like CL) will mark their BSS sections as
taking space on disk.  This would confuse COFFObjectFile into thinking
the section is larger than the file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218549 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 22:32:16 +00:00
Yaron Keren
a51dbbd394 clang-format of ChangeStdinToBinary & ChangeStdoutToBinary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218547 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 22:27:11 +00:00
Chandler Carruth
0a31a52b91 [x86] Fix a moderately terrifying bug in the new 128-bit shuffle logic
that managed to elude all of my fuzz testing historically. =/

Something changed to allow this code path to actually be exercised and
it was doing bad things. It is especially heavily exercised by the
patterns that emerge when doing AVX shuffles that end up lowered through
the 128-bit code path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218540 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 20:41:45 +00:00
Chad Rosier
4150a8de76 [IndVar] Don't widen loop compare unless IV user is sign extended.
PR21030

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218539 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 20:05:35 +00:00
Matt Arsenault
07b7c98d61 R600/SI: Use break instead of continue
If an instruction doesn't have src1, it doesn't have src2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:14 +00:00
Matt Arsenault
88416c337b R600/SI: Add a note about the order of the operands to div_scale
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218534 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:09 +00:00
Matt Arsenault
508b8db287 R600/SI: Move finding SGPR operand to move to separate function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218533 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:06 +00:00
Matt Arsenault
d991d2217b R600/SI Allow same SGPR to be used for multiple operands
Instead of moving the first SGPR that is different than the first,
legalize the operand that requires the fewest moves if one
SGPR is used for multiple operands.

This saves extra moves and is also required for some instructions
which require that the same operand be used for multiple operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:03 +00:00
Matt Arsenault
aed12d4bad R600/SI: Partially move operand legalization to post-isel hook.
Disable the SGPR usage restriction parts of the DAG legalizeOperands.
It now should only be doing immediate folding until it can be replaced
later. The real legalization work is now done by the other
SIInstrInfo::legalizeOperands

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218531 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:59 +00:00
Matt Arsenault
29202835d8 R600/SI: Implement findCommutedOpIndices
The base implementation of commuteInstruction is used
in some cases, but it turns out this has been broken for a
long time since modifiers were inserted between the real operands.

The base implementation of commuteInstruction also fails on immediates,
which also needs to be fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:54 +00:00
Matt Arsenault
8a70e28114 R600/SI: Don't move operands that are required to be SGPRs
e.g. v_cndmask_b32 requires the condition operand be an SGPR.
If one of the source operands were an SGPR, that would be considered
the one SGPR use and the condition operand would be illegally moved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218529 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:52 +00:00
Matt Arsenault
5b199b585c R600/SI: Don't assert on exotic operand types
This needs a test, but I'm not sure if it is currently possible and
I originally hit it due to a bug. Right now the only global address
operands have no reason to be VALU instructions, although it
theoretically could be a problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:46 +00:00
Matt Arsenault
26b2a7834e R600/SI: Fix using wrong operand indices when commuting
No test since the current SIISelLowering::legalizeOperands
effectively hides this, and the general uses seem to only fire
on SALU instructions which don't have modifiers between
the operands.

When trying to use legalizeOperands immediately after
instruction selection, it now sees a lot more patterns
it did not see before which break on this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:43 +00:00
Matt Arsenault
ea849e9adc R600/SI: Remove apparently dead code in legalizeOperands
No tests hit this, and I don't see any way a GlobalAddress
node would survive beyond lowering on SI. It it would, the
move should probably be inserted by selection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218526 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:38 +00:00
David Peixotto
ea468dddfe Ignore annotation function calls in cost computation
The annotation instructions are dropped during codegen and have no
impact on size.  In some cases, the annotations were preventing the
unroller from unrolling a loop because the annotation calls were
pushing the cost over the unrolling threshold.

Differential Revision: http://reviews.llvm.org/D5335



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:48:40 +00:00
Chandler Carruth
a7579ed23f [x86] The mnemonic is SHUFPS not SHUPFS. =[ I'm very bad at spelling
sadly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218524 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:27:40 +00:00
Chandler Carruth
7929a210d5 [x86] In the new vector shuffle lowering, when trying to do another
layer of tie-breaking sorting, it really helps to check that you're in
a tie first. =] Otherwise the whole thing cycles infinitely. Test case
added, another one found through fuzz testing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:24:26 +00:00
Chandler Carruth
7164a4ae0a [x86] Fix a large collection of bugs that crept in as I fleshed out the
AVX support.

New test cases included. Note that none of the existing test cases
covered these buggy code paths. =/ Also, it is clear from this that
SHUFPS and SHUFPD are the most bug prone shuffle instructions in x86. =[

These were all detected by fuzz-testing. (I <3 fuzz testing.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218522 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:11:02 +00:00
Renato Golin
6215f78195 Elide repeated register operand in Thumb1 instructions
This patch makes the ARM backend transform 3 operand instructions such as
'adds/subs' to the 2 operand version of the same instruction if the first
two register operands are the same.

Example: 'adds r0, r0, #1' will is transformed to 'adds r0, #1'.

Currently for some instructions such as 'adds' if you try to assemble
'adds r0, r0, #8' for thumb v6m the assembler would throw an error message
because the immediate cannot be encoded using 3 bits.

The backend should be smart enough to transform the instruction to
'adds r0, #8', which allows for larger immediate constants.

Patch by Ranjeet Singh.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218521 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 16:14:29 +00:00
Andrea Di Biagio
a5ab9baf83 [X86][SchedModel] SSE reciprocal square root instruction latencies.
The SSE rsqrt instruction (a fast reciprocal square root estimate) was
grouped in the same scheduling IIC_SSE_SQRT* class as the accurate (but very
slow) SSE sqrt instruction. For code which uses rsqrt (possibly with
newton-raphson iterations) this poor scheduling was affecting performances.

This patch splits off the rsqrt instruction from the sqrt instruction scheduling
classes and creates new IIC_SSE_RSQER* classes with latency values based on
Agner's table.

Differential Revision: http://reviews.llvm.org/D5370

Patch by Simon Pilgrim.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218517 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 12:56:44 +00:00
Frederic Riss
a0d5d7aed8 Revert "Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single DWARFUnitSection."
This reverts commit r218513.

Buildbots using libstdc++ issue an error when trying to copy
SmallVector<std::unique_ptr<>>. Revert the commit until we have a fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 12:34:06 +00:00
Frederic Riss
5fb5bdbf6a Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single DWARFUnitSection.
Summary:
There will be multiple TypeUnits in an unlinked object that will be extracted
from different sections. Now that we have DWARFUnitSection that is supposed
to represent an input section, we need a DWARFUnitSection<TypeUnit> per
input .debug_types section.

Once this is done, the interface is homogenous and we can move the Section
parsing code into DWARFUnitSection.

Reviewers: samsonov, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218513 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 12:15:40 +00:00
Daniel Sanders
12aa552637 Fix unused variable warning added in r218509
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 10:45:26 +00:00
Daniel Sanders
7ecd98679e [mips] Generalize the handling of f128 return values to support f128 arguments.
Summary:
This will allow us to handle f128 arguments without duplicating code from
CCState::AnalyzeFormalArguments() or CCState::AnalyzeCallOperands().

No functional change.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5292

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218509 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 10:06:12 +00:00
Robert Khasanov
26ba182fdf [AVX512] Added load/store from BW/VL subsets to Register2Memory opcode tables.
Added lowering tests for these instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218508 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 09:48:50 +00:00
David Majnemer
ed2b7578b8 Fix build breakage on MSVC 2013
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218499 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 04:47:54 +00:00
David Majnemer
af100b0350 Target: Fix build breakage.
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218497 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 02:57:05 +00:00
David Majnemer
346056ffc0 Support: Remove undefined behavior from &raw_ostream::operator<<
Don't negate signed integer types in &raw_ostream::operator<<(const
FormattedNumber &FN).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218496 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 02:48:14 +00:00
David Xu
2109982c88 Revert patch ofr218493
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218494 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 02:28:03 +00:00
David Xu
c41ae2a5c4 Redundant store instructions should be removed as dead code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218493 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 02:02:09 +00:00
Eric Christopher
55a90ab4ef Add the first backend support for on demand subtarget creation
based on the Function. This is currently used to implement
mips16 support in the mips backend via the existing module
pass resetting the subtarget.

Things to note:

a) This involved running resetTargetOptions before creating a
new subtarget so that code generation options like soft-float
could be recognized when creating the new subtarget. This is
to deal with initialization code in isel lowering that only
paid attention to the initial value.

b) Many of the existing testcases weren't using the soft-float
feature correctly. I've corrected these based on the check
values assuming that was the desired behavior.

c) The mips port now pays attention to the target-cpu and
target-features strings when generating code for a particular
function. I've removed these from one function where the
requested cpu and features didn't match the check lines in
the testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 01:44:08 +00:00
Eric Christopher
a6e0a6e729 Move resetTargetOptions from taking a MachineFunction to a Function
since we are accessing the TargetMachine that we're a member
function of.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218489 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 01:28:10 +00:00
Matt Arsenault
584886c0bb R600/SI: Fix emitting trailing whitespace after s_waitcnt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218486 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 01:09:46 +00:00
Adam Nemet
479f2f7a14 [AVX512] Simplify use of !con()
No change in X86.td.expanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218485 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 00:53:12 +00:00
Adam Nemet
08f261afbf [AVX512] Pull pattern for subvector extract into the instruction definition
No functional change.

I initially thought that pulling the Pat<> into the instruction pattern was
not possible because it was doing a transform on the index in order to convert
it from a per-element (extract_subvector) index into a per-chunk (vextract*x4)
index.

Turns out this also works inside the pattern because the vextract_extract
PatFrag has an OperandTransform EXTRACT_get_vextract{128,256}_imm, so the
index in $idx goes through the same conversion.

The existing test CodeGen/X86/avx512-insert-extract.ll extended in the
previous commit provides coverage for this change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218480 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 23:48:49 +00:00
Adam Nemet
4007b30ede [AVX512] Refactor subvector extracts
No functional change.

These are now implemented as two levels of multiclasses heavily relying on the
new X86VectorVTInfo class.  The multiclass at the first level that is called
with float or int provides the 128 or 256 bit subvector extracts.  The second
level provides the register and memory variants and some more Pat<>s.

I've compared the td.expanded files before and after.  One change is that
ExeDomain for 64x4 is SSEPackedDouble now.  I think this is correct, i.e. a
bugfix.

(BTW, this is the change that was blocked on the recent tablegen fix.  The
class-instance values X86VectorVTInfo inside vextract_for_type weren't
properly evaluated.)

Part of <rdar://problem/17688758>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218478 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 23:48:45 +00:00
Adam Nemet
1973ffefcf [AVX512] Fix typo
F->I in VEXTRACTF32x4rr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218477 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 23:48:42 +00:00
Bruno Cardoso Lopes
f4230250a1 [MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo
Machine Sink uses loop depth information to select between successors BBs to
sink machine instructions into, where BBs within smaller loop depths are
preferable.  This patch adds support for choosing between successors by using
profile information from BlockFrequencyInfo instead, whenever the information
is available.

Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5%
execution speedup in average on x86-64 darwin.

<rdar://problem/18021659>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218472 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 23:14:26 +00:00
Nick Kledzik
e93da60ac4 [Support] Add type-safe alternative to llvm::format()
llvm::format() is somewhat unsafe. The compiler does not check that integer
parameter size matches the %x or %d size and it does not complain when a 
StringRef is passed for a %s.  And correctly using a StringRef with format() is  
ugly because you have to convert it to a std::string then call c_str().
 
The cases where llvm::format() is useful is controlling how numbers and
strings are printed, especially when you want fixed width output.  This
patch adds some new formatting functions to raw_streams to format numbers
and StringRefs in a type safe manner. Some examples:

   OS << format_hex(255, 6)        => "0x00ff"
   OS << format_hex(255, 4)        => "0xff"
   OS << format_decimal(0, 5)      => "    0"
   OS << format_decimal(255, 5)    => "  255"
   OS << right_justify(Str, 5)     => "  foo"
   OS << left_justify(Str, 5)      => "foo  "



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218463 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 20:30:58 +00:00
Anton Yartsev
f85d5cfbf6 Refactoring: raw pointer -> unique_ptr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218462 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 19:55:58 +00:00
Tom Stellard
8361c84894 ARM: Remove unneeded check for MI->hasPostISelHook()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218459 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 18:59:23 +00:00
Tom Stellard
bdaf056545 SelectionDAG: Remove #if NDEBUG from check for a post-isel hook
The InstrEmitter will skip the check of MI.hasPostISelHook()
before calling AdjustInstrPostInstrSelection() when NDEBUG
is not defined.

This was added in r140228, and I'm not sure if it is intentional or not,
but it is a likely source for bugs, because it means with
Release+Asserts builds you can forget to set the hasPostISelHook
flag on TableGen definitions and AdjustInstrPostInstrSelection() will
still be called.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218458 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 18:59:22 +00:00
Tom Stellard
29d48e6a49 R600/SI: Add support for global atomic add
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218457 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 18:30:26 +00:00
Robin Morisset
79826e015e Lower idempotent RMWs to fence+load
Summary:
I originally tried doing this specifically for X86 in the backend in D5091,
but it was rather brittle and generally running too late to be general.
Furthermore, other targets may want to implement similar optimizations.
So I reimplemented it at the IR-level, fitting it into AtomicExpandPass
as it interacts with that pass (which could not be cleanly done before
at the backend level).

This optimization relies on a new target hook, which is only used by X86
for now, as the correctness of the optimization on other targets remains
an open question. If it is found correct on other targets, it should be
trivial to enable for them.

Details of the optimization are discussed in D5091.

Test Plan: make check-all + a new test

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218455 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 17:27:43 +00:00
Sid Manning
733681d3bd Add missing attributes !cmp.[eq,gt,gtu] instructions.
These instructions do not indicate they are extendable or the
number of bits in the extendable operand.  Rename to match
architected names.  Add a testcase for the intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218453 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 13:09:54 +00:00
Daniel Sanders
1d545d9acb Add llvm_unreachables() for [ASZ]ExtUpper to X86FastISel.cpp to appease the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218452 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 13:08:51 +00:00
Daniel Sanders
03fe69e90d [mips] Add CCValAssign::[ASZ]ExtUpper and CCPromoteToUpperBitsInType and handle struct's correctly on big-endian N32/N64 return values.
Summary:
The N32/N64 ABI's require that structs passed in registers are laid out
such that spilling the register with 'sd' places the struct at the lowest
address. For little endian this is trivial but for big-endian it requires
that structs are shifted into the upper bits of the register.

We also require that structs passed in registers have the 'inreg'
attribute for big-endian N32/N64 to work correctly. This is because the
tablegen-erated calling convention implementation only has access to the
lowered form of struct arguments (one or more integers of up to 64-bits
each) and is unable to determine the original type.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5286

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218451 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 12:15:05 +00:00
Renato Golin
6765c34b0c Add aliases for VAND imm to VBIC ~imm
On ARM NEON, VAND with immediate (16/32 bits) is an alias to VBIC ~imm with
the same type size. Adding that logic to the parser, and generating VBIC
instructions from VAND asm files.

This patch also fixes the validation routines for NEON splat immediates which
were wrong.

Fixes PR20702.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218450 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 11:31:24 +00:00
Chandler Carruth
4b667ee436 [x86] Teach the new vector shuffle lowering to use AVX2 instructions for
v4f64 and v8f32 shuffles when they are lane-crossing. We have fully
general lane-crossing permutation functions in AVX2 that make this easy.

Part of this also changes exactly when and how these vectors are split
up when we don't have AVX2. This isn't always a win but it usually is
a win, so on the balance I think its better. The primary regressions are
all things that just need to be fixed anyways such as modeling when
a blend can be completely accomplished via VINSERTF128, etc.

Also, this highlights one of the few remaining big features: we do
a really poor job of inserting elements into AVX registers efficiently.

This completes almost all of the big tricks I have in mind for AVX2. The
only things left that I plan to add:

1) element insertion smarts
2) palignr and other fairly specialized lowerings when they happen to
   apply

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218449 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 11:03:55 +00:00
Chandler Carruth
05901d80ba [x86] Teach the new vector shuffle lowering a fancier way to lower
256-bit vectors with lane-crossing.

Rather than immediately decomposing to 128-bit vectors, try flipping the
256-bit vector lanes, shuffling them and blending them together. This
reduces our worst case shuffle by a pretty significant margin across the
board.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218446 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 10:21:15 +00:00
Oliver Stannard
f220c5387b [Thumb2] BXJ should be undefined for v7M, v8A
The Thumb2 BXJ instruction (Branch and Exchange Jazelle) is not
defined for v7M or v8A. It is defined for all other Thumb2-supporting
architectures (v6T2, v7A and v7R).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218445 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 10:02:05 +00:00
Chandler Carruth
2e8d2c727c [x86] Fix an oversight in the v8i32 path of the new vector shuffle
lowering where it only used the mask of the low 128-bit lane rather than
the entire mask.

This allows the new lowering to correctly match the unpack patterns for
v8i32 vectors.

For reference, the reason that we check for the the entire mask rather
than checking the repeated mask is because the repeated masks don't
abide by all of the invariants of normal masks. As a consequence, it is
safer to use the full mask with functions like the generic equivalence
test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218442 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 04:10:27 +00:00
Chandler Carruth
e5fb4ad142 [x86] Rearrange the code for v16i16 lowering a bit for clarity and to
reduce the amount of checking we do here.

The first realization is that only non-crossing cases between 128-bit
lanes are handled by almost the entire function. It makes more sense to
handle the crossing cases first.

THe second is that until we actually are going to generate fancy shared
lowering strategies that use the repeated semantics of the v8i16
lowering, we should waste time checking for repeated masks. It is
simplest to directly test for the entire unpck masks anyways, so we
gained nothing from this.

This also matches the structure of v32i8 more closely.

No functionality changed here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218441 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 04:03:22 +00:00
Chandler Carruth
e3bb4bb2d5 [x86] Implement AVX2 support for v32i8 in the new vector shuffle
lowering.

This completes the basic AVX2 feature support, but there are still some
improvements I'd like to do to really get the last mile of performance
here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218440 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 02:52:12 +00:00
Reid Kleckner
dd8ce126d7 MC: Use @IMGREL instead of @IMGREL32, which we can't parse
Nico Rieck added support for this 32-bit COFF relocation some time ago
for Win64 stuff. It appears that as an oversight, the assembly output
used "foo"@IMGREL32 instead of "foo"@IMGREL, which is what we can parse.

Sadly, there were actually tests that took in IMGREL and put out
IMGREL32, and we didn't notice the inconsistency. Oh well. Now LLVM can
assemble it's own output with slightly more fidelity.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218437 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 02:09:18 +00:00
Chandler Carruth
6a289bb491 [x86] Remove the defunct X86ISD::BLENDV entry -- we use vector selects
for this now.

Should prevent folks from running afoul of this and not knowing why
their code won't instruction select the way I just did...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218436 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 01:16:01 +00:00
Chandler Carruth
ef673b3c73 [x86] Fix the v16i16 blend logic I added in the prior commit and add the
missing test cases for it.

Unsurprisingly, without test cases, there were bugs here. Surprisingly,
this bug wasn't caught at compile time. Yep, there is an X86ISD::BLENDV.
It isn't wired to anything. Oops. I'll fix than next.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218434 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 01:13:38 +00:00
Justin Bogner
aacc919bfd llvm-cov: Combine segments that cover the same location
If we have multiple coverage counts for the same segment, we need to
add them up rather than arbitrarily choosing one. This fixes that and
adds a test with template instantiations to exercise it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218432 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 00:34:18 +00:00
Akira Hatanaka
0253523c92 [X86,AVX] Add an isel pattern for X86VBroadcast.
This fixes PR21050 and rdar://problem/18434607.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 00:26:15 +00:00
Chandler Carruth
bdecfeb723 [x86] Implement v16i16 support with AVX2 in the new vector shuffle
lowering.

This also implements the fancy blend lowering for v16i16 using AVX2 and
teaches the X86 backend to print shuffle masks for 256-bit PSHUFB
and PBLENDW instructions. It also makes the mask decoding correct for
PBLENDW instructions. The yaks, they are legion.

Tests are updated accordingly. There are some missing tests for the
VBLENDVB lowering, but I'll add those in a follow-up as this commit has
accumulated enough cruft already.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218430 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-25 00:24:19 +00:00
Kostya Serebryany
0e9d114865 [asan] don't instrument module CTORs that may be run before asan.module_ctor. This fixes asan running together -coverage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218421 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 22:41:55 +00:00
Renato Golin
bb994f55a4 Revert 218406 - Refactor the RelocVisitor::visit method
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218416 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 21:30:43 +00:00
Akira Hatanaka
d3d620b33d Revert r218380. This was breaking Apple internal build bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218409 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 20:37:14 +00:00
Renato Golin
c0104e4001 Refactor the RelocVisitor::visit method
This change replaces the brittle if/else chain of string comparisons
with a switch statement on the detected target triple, removing the
need for testing arbitrary architecture names returned from
getFileFormatName, whose primary purpose seems to be for display
(user-interface) purposes. The visitor now takes a reference to the
object file, rather than its arbitrary file format name to figure out
whether the file is a 32 or 64-bit object file and what the detected
target triple is.

A set of tests have been added to help show that the refactoring processes
relocations for the same targets as the original code.

Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218406 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 20:07:22 +00:00
Chris Bieneman
4bb780af42 Adding #ifdef around TermColorMutex based on feedback from Craig Topper.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218401 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 18:35:58 +00:00
Chandler Carruth
c88ae9687b [x86] Factor out the logic to generically decombose a vector shuffle
into unblended shuffles and a blend.

This is the consistent fallback for the lowering paths that have fast
blend operations available, and its getting quite repetitive.

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 18:20:09 +00:00
Kaelyn Takata
a0d6422afe Revert "Refactor the RelocVisitor::visit method"
This reverts commit faac033f73.

The test depends on all targets to be enabled in llc in order to pass,
and needs to be rewritten/refactored to not have that dependency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218393 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 17:49:07 +00:00
Renato Golin
faac033f73 Refactor the RelocVisitor::visit method
This change replaces the brittle if/else chain of string comparisons
with a switch statement on the detected target triple, removing the
need for testing arbitrary architecture names returned from
getFileFormatName, whose primary purpose seems to be for display
(user-interface) purposes. The visitor now takes a reference to the
object file, rather than its arbitrary file format name to figure out
whether the file is a 32 or 64-bit object file and what the detected
target triple is.

A set of tests have been added to help show that the refactoring processes
relocations for the same targets as the original code.

Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218388 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 17:00:42 +00:00
David Peixotto
cfc42962c8 Fix assertion in LICM doFinalization()
The doFinalization method checks that the LoopToAliasSetMap is
empty. LICM populates that map as it runs through the loop nest,
deleting the entries for child loops as it goes. However, if a child
loop is deleted by another pass (e.g. unrolling) then the loop will
never be deleted from the map because LICM walks the loop nest to
find entries it can delete.

The fix is to delete the loop from the map and free the alias set
when the loop is deleted from the loop nest.

Differential Revision: http://reviews.llvm.org/D5305


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218387 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 16:48:31 +00:00
Moritz Roth
8c4e64af8a [Thumb] Make load/store optimizer less conservative.
If it's safe to clobber the condition flags, we can do a few extra things:
it's then possible to reset the base register writeback using a SUBS, so
we can try to merge even if the base register isn't dead after the merged
instruction.

This is effectively a (heavily bug-fixed) rewrite of r208992.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218386 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 16:35:50 +00:00
Oliver Stannard
43c6b6be8f [Thumb] 32-bit encodings of 'cps' are not valid for v7M
v7M only allows the 16-bit encoding of the 'cps' (Change Processor
State) instruction, and does not have the 32-bit encoding which is
valid from v6T2 onwards.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218382 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 14:20:01 +00:00
Aaron Ballman
6a07014c57 Silencing an "enumeral and non-enumeral type in conditional expression" warning. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 13:54:56 +00:00
Benjamin Kramer
7a27231780 Replace a hand-written suffix compare with std::lexicographical_compare.
No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218380 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 13:19:28 +00:00
Chandler Carruth
10cd8098a7 [x86] Teach the instruction lowering to add comments describing constant
pool data being loaded into a vector register.

The comments take the form of:

  # ymm0 = [a,b,c,d,...]
  # xmm1 = <x,y,z...>

The []s are used for generic sequential data and the <>s are used for
specifically ConstantVector loads. Undef elements are printed as the
letter 'u', integers in decimal, and floating point values as floating
point values. Suggestions on improving the formatting or other aspects
of the display are very welcome.

My primary use case for this is to be able to FileCheck test masks
passed to vector shuffle instructions in-register. It isn't fantastic
for that (no decoding special zeroing semantics or other tricks), but it
at least puts the mask onto an instruction line that could reasonably be
checked. I've updated many of the new vector shuffle lowering tests to
leverage this in their test cases so that we're actually checking the
shuffle masks remain as expected.

Before implementing this, I tried a *bunch* of different approaches.
I looked into teaching the MCInstLower code to scan up the basic block
and find a definition of a register used in a shuffle instruction and
then decode that, but this seems incredibly brittle and complex.
I talked to Hal a lot about the "right" way to do this: attach the raw
shuffle mask to the instruction itself in some form of unencoded
operands, and then use that to emit the comments. I still think that's
the optimal solution here, but it proved to be beyond what I'm up for
here. In particular, it seems likely best done by completing the
plumbing of metadata through these layers and attaching the shuffle mask
in metadata which could have fully automatic dropping when encoding an
actual instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218377 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 09:39:41 +00:00
Michael Liao
35fdc092e0 Allow BB duplication threshold to be adjusted through JumpThreading's ctor
- BB duplication may not be desired on targets where there is no or small
  branch penalty and code duplication needs restrict control.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218375 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 04:59:06 +00:00
NAKAMURA Takumi
bfb2b180bf Windows/Host.inc: Reformat the header to fit 80-col.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218374 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 04:45:14 +00:00
NAKAMURA Takumi
d968b05f4d Unix/Host.inc: Remove <cstdlib>. It has been unused for a long time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218373 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 04:45:02 +00:00
NAKAMURA Takumi
4921f68311 Unix/Host.inc: Wrap a comment line in 80-col.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 04:44:50 +00:00
NAKAMURA Takumi
c44f94d681 Unix/Host.inc: Remove leading whitespace. It had been here since r56942!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218370 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 04:44:37 +00:00
Jiangning Liu
1fe409e347 Clear PreferredExtendType for in each function-specific state FunctionLoweringInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 03:22:56 +00:00
Chandler Carruth
f00b50b6ef [x86] More refactoring of the shuffle comment emission. The previous
attempt didn't work out so well. It looks like it will be much better
for introducing extra logic to find a shuffle mask if the finding logic
is totally separate. This also makes it easy to sink the opcode logic
completely out of the routine so we don't re-dispatch across it.

Still no functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218363 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 03:06:37 +00:00
Chandler Carruth
db5e1dafa4 [x86] Bypass the shuffle mask comment generation when not using verbose
asm. This can be somewhat expensive and there is no reason to do it
outside of tests or debugging sessions. I'm also likely to make it
significantly more expensive to support more styles of shuffles.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218362 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 03:06:34 +00:00
Chandler Carruth
7e0f903c6c [x86] Hoist the logic for extracting the relevant bits of information
from the MachineInstr into the caller which is already doing a switch
over the instruction.

This will make it more clear how to compute different operands to feed
the comment selection for example.

Also, in a drive-by-fix, don't append an empty comment string (which is
a no-op ultimately).

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218361 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 02:24:41 +00:00
Matt Arsenault
59da3f04ca R600/SI: Add new helper isSGPRClassID
Move these into header since they are trivial

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218360 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 02:17:12 +00:00
Matt Arsenault
2e67962e9b R600/SI: Fix hardcoded and wrong operand numbers.
Also fix leftover debug printing

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218359 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 02:17:09 +00:00
Matt Arsenault
9b50273e54 R600/SI: Enable named operand table for SALU instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218358 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 02:17:06 +00:00
Chandler Carruth
5f671bae7c [x86] Start refactoring the comment printing logic in the MC lowering of
vector shuffles.

This is just the beginning by hoisting it into its own function and
making use of early exit to dramatically simplify the flow of the
function. I'm going to be incrementally refactoring this until it is
a bit less magical how this applies to other instructions, and I can
teach it how to dig a shuffle mask out of a register. Then I plan to
hook it up to VPERMD so we get our mask comments for it.

No functionality changed yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218357 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 02:16:12 +00:00
Tom Stellard
81c6c9690a R600/SI: Enable selecting SALU inside branches
We can do this now that the FixSGPRLiveRanges pass is working.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218353 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 01:33:28 +00:00
Tom Stellard
abe9b2274d R600/SI: Move PHIs that define SGPRs to the VALU in most cases
This fixes a bug that is uncovered by a future commit and will
be tested by the test/CodeGen/R600/sgpr-control-flow.ll test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218352 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 01:33:26 +00:00
Tom Stellard
36ba7962a4 R600/SI: Fix the FixSGPRLiveRanges pass
The previous implementation was extending the live range of SGPRs
by modifying the live intervals directly.  This was causing a lot
of machine verification errors when the machine scheduler was enabled.

The new implementation adds pseudo instructions with implicit uses to
extend the live ranges of SGPRs, which works much better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218351 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 01:33:24 +00:00
Tom Stellard
90d1726693 R600/SI: Mark EXEC_LO and EXEC_HI as reserved
These registers can be allocated and used like other 32-bit registers,
but it seems like a likely source for bugs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218350 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 01:33:23 +00:00
Tom Stellard
b8112412cf R600/SI: Fix SIRegisterInfo::getPhysRegSubReg()
Correctly handle special registers: EXEC, EXEC_LO, EXEC_HI, VCC_LO,
VCC_HI, and M0.  The previous implementation would assertion fail
when passed these registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-24 01:33:22 +00:00