Commit Graph

25803 Commits

Author SHA1 Message Date
Toma Tabacu
0b2081a05a [mips] Improve robustness of some tests.
Summary:
This is done by removing some hardcoded registers like $at or expecting a single digit register to be selected.

Contains work done by Matheus Almeida.

Reviewers: matheusalmeida, dsanders

Reviewed By: dsanders

Subscribers: tomatabacu

Differential Revision: http://reviews.llvm.org/D4227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 13:10:48 +00:00
Chandler Carruth
cad1711154 [x86] Begin stubbing out the AVX support in the new vector shuffle
lowering scheme.

Currently, this just directly bails to the fallback path of splitting
the 256-bit vector into two 128-bit vectors, operating there, and then
joining the results back together. While the results are far from
perfect, they are *shockingly* good for what we're doing here. I'll be
layering the rest of the functionality on top of this piece by piece and
updating tests as I go.

Note that 256-bit vectors in this mode are still somewhat WIP. While
I think the code paths that I'm adding here are clean and good-to-go,
there are still a lot of 128-bit assumptions that I'll need to stomp out
as I march through the functional spread here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 12:13:59 +00:00
Toma Tabacu
cb43f81fc5 [mips] Add assembler support for the "la $reg,symbol" pseudo-instruction.
Summary:
This pseudo-instruction allows the programmer to load an address from a symbolic expression into a register.

Patch by David Chisnall.
His work was sponsored by: DARPA, AFRL

I've made some minor changes to the original, such as improving the formatting and adding some comments, and I've also added a test case.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215630 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 10:29:17 +00:00
Chandler Carruth
369e0ef67d [SDAG] Fix a bug in the DAG combiner where we would fail to return the
input node after manually adding it to the worklist and using CombineTo.

Once we use CombineTo the input node may have been deleted. Despite this
being *completely confusing* and somewhat broken, the only way to
"correctly" return from a DAG combine after potentially deleting the
input node is to return *that exact node*....

But really, this code should just never have used CombineTo. It won't do
what it wants (returning the node as mentioned above just causes the
combine to infloop). The correct way to combine away a casted load to
a load of the correct type is to RAUW the chain directly and then return
the loaded value to replace the actual value node.

I managed to find this with the vector shuffle fuzzer even though it
clearly has nothing at all to do with vector shuffles and rather those
happen to trigger a load of a constant pool that hits this combine *just
right*. I've included the test as it is small and a nice stress test
that the infrastructure isn't asserting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215622 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 08:18:34 +00:00
David Majnemer
eb323b2b3c InstCombine: ((A | ~B) ^ (~A | B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A | ~B),(~A |B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Patch by Mayur Pandey!

Differential Revision: http://reviews.llvm.org/D4883

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215621 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 06:46:25 +00:00
David Majnemer
923556f8a8 Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C)
Transform ((B | C) & A) | B --> B | (A & C)

Z3 Link: http://rise4fun.com/Z3/hP6p

Patch by Sonam Kumari!

Differential Revision: http://reviews.llvm.org/D4865

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215619 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 06:41:38 +00:00
Saleem Abdulrasool
0086358325 MC: AsmLexer: handle multi-character CommentStrings correctly
As X86MCAsmInfoDarwin uses '##' as CommentString although a single '#' starts a
comment a workaround for this special case is added.

Fixes divisions in constant expressions for the AArch64 assembler and other
targets which use '//' as CommentString.

Patch by Janne Grunau!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215615 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 02:51:43 +00:00
Chandler Carruth
14ee003f1a [SDAG] Fix a case where we would iteratively legalize a node during
combining by replacing it with something else but not re-process the
node afterward to remove it.

In a truly remarkable stroke of bad luck, this would (in the test case
attached) end up getting some other node combined into it without ever
getting re-processed. By adding it back on to the worklist, in addition
to deleting the dead nodes more quickly we also ensure that if it
*stops* being dead for any reason it makes it back through the
legalizer. Without this, the test case will end up failing during
instruction selection due to an and node with a type we don't have an
instruction pattern for.

It took many million runs of the shuffle fuzz tester to find this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 01:07:37 +00:00
Akira Hatanaka
d0ddfb0896 [AArch64, fast-isel] Fall back to SelectionDAG to select tail calls.
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.

<rdar://problem/17991614>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215600 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 23:23:58 +00:00
Juergen Ributzka
8c9a0319bb [FastISel][AArch64] Add support for more addressing modes.
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215597 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:53:29 +00:00
Juergen Ributzka
b677a877c8 [FastISel][X86] Add large code model support for materializing floating-point constants.
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.

Fixes <rdar://problem/17674628>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215595 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:25:35 +00:00
Juergen Ributzka
0701e5d43b [FastISel][X86] Use XOR to materialize the "0" value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215594 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:22:17 +00:00
Juergen Ributzka
f245d9aa77 [FastISel][X86] Emit more efficient instructions for integer constant materialization.
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.

This fixes <rdar://problem/17420988>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215593 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:18:11 +00:00
Juergen Ributzka
dc408e8069 [FastISel][AArch64] Make use of the zero register when possible.
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215591 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:13:14 +00:00
Juergen Ributzka
eb1c51f8b3 [FastISel] Let the target decide first if it wants to materialize a constant.
This changes the order in which FastISel tries to materialize a constant.
Originally it would try to use a simple target-independent approach, which
can lead to the generation of inefficient code.

On X86 this would result in the use of movabsq to materialize any 64bit
integer constant - even for simple and small values such as 0 and 1. Also
some very funny floating-point materialization could be observed too.

On AArch64 it would materialize the constant 0 in a register even the
architecture has an actual "zero" register.

On ARM it would generate unnecessary mov instructions or not use mvn.

This change simply changes the order and always asks the target first if it
likes to materialize the constant. This doesn't fix all the issues
mentioned above, but it enables the targets to implement such
optimizations.

Related to <rdar://problem/17420988>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:08:02 +00:00
Juergen Ributzka
047423787c [FastISel][ARM] Use MOVT/MOVW if the subtarget requests it.
This change is also in preparation for a future change to make sure that
the constant materialization uses MOVT/MOVW when available and not a load
from the constant pool.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215584 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 21:42:19 +00:00
Benjamin Kramer
4551342b53 Fix (re-)creation of unittest lit.site.cfg for clang-tools-extra.
This has been hiding really well. Hopefully brings the builders suffering from
outdated lit.site.cfg files back to life.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215575 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 20:41:26 +00:00
Jan Vesely
d3fa093dc9 utils: Fix segfault in flattencfg
v2: continue iterating through the rest of the bb
    use for loop

v3: initialize FlattenCFG pass in ScalarOps
    add test

v4: split off initializing flattencfg to a separate patch
    add comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215574 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 20:31:53 +00:00
Matt Arsenault
bd949eea85 R600: Correctly set the src value offset for scalarized kernel args
This for some reason fixes v1i64 kernel arguments on pre-SI. This
currently breaks some other cases in the kernel-args.ll test for R600,
but I'm not particularly confident in the new output. VTX_READ_* are not
used for some of the scalarized cases, and the code reading from the
constant buffer doesn't make much sense to me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 18:14:11 +00:00
Andrea Di Biagio
05a76eb9f2 [DAGCombiner] Improved target independent vector shuffle combine rule.
This patch improves the existing algorithm in DAGCombiner that
attempts to fold shuffles according to rule:
  shuffle(shuffle(x, y, M1), undef, M2) -> shuffle(y, undef, M3)

Before this change, there were cases where the DAGCombiner conservatively
avoided folding shuffles even if the resulting mask would have been legal.
That is because the algorithm wrongly assumed that commuting
an illegal shuffle mask would always produce an illegal mask.

With this change, we now correctly compute the commuted shuffle mask before
calling method 'isShuffleMaskLegal' on it.
On X86, this improves for example the codegen for the following function:

define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
  %1 = shufflevector <4 x i32> %B, <4 x i32> %A, <4 x i32> <i32 1, i32 2, i32 6, i32 7>
  %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 2, i32 3>
  ret <4 x i32> %2
}

Before this change the X86 backend (-mcpu=corei7) generated
the following assembly code for function @test:
  shufps $-23, %xmm0, %xmm1  # xmm1 = xmm1[1,2],xmm0[2,3]
  movhlps %xmm1, %xmm1       # xmm1 = xmm1[1,1]
  movaps %xmm1, %xmm0

Now we produce:
  movhlps %xmm0, %xmm0       # xmm0 = xmm0[1,1]

Added extra test cases in combine-vec-shuffle-2.ll to verify that we correctly
fold according to the above-mentioned rule.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 16:09:40 +00:00
Chandler Carruth
701073e58e [optnone] Make the optnone attribute effective at suppressing function
attribute and function argument attribute synthesizing and propagating.

As with the other uses of this attribute, the goal remains a best-effort
(no guarantees) attempt to not optimize the function or assume things
about the function when optimizing. This is particularly useful for
compiler testing, bisecting miscompiles, triaging things, etc. I was
hitting specific issues using optnone to isolate test code from a test
driver for my fuzz testing, and this is one step of fixing that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215538 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 10:49:33 +00:00
Robert Khasanov
232202439a [SKX] Extended non-temporal load/store instructions for AVX512VL subsets.
Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction.
Added encoding and lowering tests.

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 10:46:00 +00:00
Daniel Sanders
5d16d6c3f0 Re-commit: [mips] Implement .ent, .end, .frame, .mask and .fmask.
Patch by Matheus Almeida and Toma Tabacu

The lld test failure on the previous attempt to commit was caused by the
addition of the .pdr section causing the offsets it was checking to change.
This has been fixed by removing the .ent/.end directives from that test since
they weren't really needed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215535 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 10:07:34 +00:00
Chandler Carruth
5e5aa9438d Revert r215415 which causse MSan to crash on a great deal of C++ code.
I've followed up on the original commit as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 09:19:39 +00:00
Elena Demikhovsky
4c97c1420b AVX-512: Fixed a bug in shufflevector lowering.
PALIGNR instruction does not exist in AVX-512F set.
Added a test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215526 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 07:58:43 +00:00
Karthik Bhat
7ef167ae1f InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (add %a, %b)
Correctness proof of the transform using CVC3-

$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B;

$ cvc3 t.cvc
Valid.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215524 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 05:13:14 +00:00
Chandler Carruth
6bb093bbe7 [x86] Rewrite a core part of the new vector shuffle lowering to handle
one pesky test case correctly.

This test case caused the old code to infloop occilating between solving
the low-half and the high-half. The 'side balancing' part of
single-input v8 shuffle lowering didn't handle the one pattern which can
cause it to occilate. Fortunately the fuzz testing found this case.
Unfortuately it was *terrible* to handle. I'm really sorry for the
amount and density of the code here, I'd love suggestions on how to
simplify it. I feel like there *must* be a simpler form here, but after
a lot of days I've not found it. This is the only one I've found that
even works. I've added the one pesky test case along with some nice
comments explaining the core problem that we have to solve here.

So far this has survived approximately 32k test cases. More strenuous
fuzzing commencing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215519 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:25:45 +00:00
Hal Finkel
e693d3c558 [PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic
This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store
intrinsics. As with the construction of the MachineMemOperands for the
intrinsic calls used for unaligned load/store lowering, the only slight
complication is that we need to represent a larger memory range than the
loaded/stored value-type size (because the address is rounded down to an
aligned address, and we need to conservatively represent the entire possible
range of the actual access). This required adding an extra size field to
TargetLowering::IntrinsicInfo, and this was done in a way that required no
modifications to other targets (the size defaults to the store size of the
provided memory data type).

This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215512 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:15:40 +00:00
Hal Finkel
695e914c03 Fix classof for ISD::INTRINSIC_W_CHAIN and INTRINSIC_VOID
Unfortunately, our use of the SDNode class hierarchy for INTRINSIC_W_CHAIN and
INTRINSIC_VOID nodes is somewhat broken right now. These nodes sometimes are
used for memory intrinsics (those with MachineMemOperands), and sometimes not.
When not, the nodes are not created as instances of MemIntrinsicSDNode, but
rather created as some other subclass of SDNode using DAG::getNode. When they
are memory intrinsics, they are created using DAG::getMemIntrinsicNode as
instances of MemIntrinsicSDNode. MemIntrinsicSDNode is a subclass of
MemSDNode, but prior to r214452, we had a non-self-consistent setup whereby
MemIntrinsicSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID would
return true but MemSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID
would return false. In r214452, MemSDNode::classof was changed to return true
for INTRINSIC_W_CHAIN and INTRINSIC_VOID, which is now self-consistent. The
problem is that neither the pre-r214452 logic and the post-r214452 logic are
really right. The truth is that not all INTRINSIC_W_CHAIN and INTRINSIC_VOID
nodes are instances of MemIntrinsicSDNode (or MemSDNode for that matter), and
the return value from classof needs to reflect that. This was broken before
r214452 (because MemIntrinsicSDNode::classof always returned true), and was
broken afterward (because MemSDNode::classof also always returned true), and
will now be correct.

The minimal solution is to grab one of the SubclassData bits (there is one left
for MemIntrinsicSDNode nodes) and use it to store whether or not a particular
INTRINSIC_W_CHAIN or INTRINSIC_VOID is really an instance of
MemIntrinsicSDNode or not. Doing this allows both MemIntrinsicSDNode::classof
and MemSDNode::classof to return the correct answer for the underlying object
for both the memory-intrinsic and non-memory-intrinsic cases.

This fixes the problem that r214452 created in the SelectionDAGDumper (thanks
to Matt Arsenault for pointing it out).

Because PowerPC does not implement getTgtMemIntrinsic, this change breaks
test/CodeGen/PowerPC/unal-altivec-wint.ll. I've XFAILed it for now, and will
fix it in a follow-up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:15:37 +00:00
Adam Nemet
4c9467ea5d [AVX512] Verify the code generated for the intrinsic _mm512_broadcastsd_pd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215487 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 00:30:05 +00:00
Adam Nemet
6ea7f36872 [AVX512] Handle valign masking intrinsic via C++ lowering
I think that this will scale better in most cases than adding a Pat<> for each
mapping from the intrinsic DAG to the intruction (i.e. rri, rrik, rrikz).  We
can just lower to the SDNode and have the resulting DAG be matches by the DAG
patterns.

Alternatively (long term), we could keep the Pat<>s but generate them via the
new AVX512_masking multiclass.  The difficulty is that in order to formulate
that we would have to concatenate DAGs.  Currently this is only supported if
the operators of the input DAGs are identical.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215473 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 21:13:12 +00:00
Matt Arsenault
00139e51c9 Allwo bitcast + struct GEP transform to work with addrspacecast
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215467 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 19:46:13 +00:00
Jan Vesely
3c57820bbb R600: Use optimized 24bit path in udivrem
v2: drop enum keyword
    use correct extension mode
    don't bother computing the sign in unsinged case

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215462 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 17:31:20 +00:00
Jan Vesely
b40562c0ec R600: Use i24 optimized path for SREM
v2: add tests
    rename LowerSDIV24 to LowerSDIVREM24
    handle the rem part in this function

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215460 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 17:31:17 +00:00
Duncan P. N. Exon Smith
49135bb76c Don't upgrade global constructors when reading bitcode
An optional third field was added to `llvm.global_ctors` (and
`llvm.global_dtors`) in r209015.  Most of the code has been changed to
deal with both versions of the variables.  Users of the C API might
create either version, the helper functions in LLVM create the two-field
version, and clang now creates the three-field version.

However, the BitcodeReader was changed to always upgrade to the
three-field version.  This created an unnecessary inconsistency in the
IR before/after serializing to bitcode.

This commit resolves the inconsistency by making the third field truly
optional (and not upgrading in the bitcode reader).  Since `llvm-link`
was relying on this upgrade code, rather than deleting it I've moved it
into `ModuleLinker`, where it upgrades these arrays as necessary to
resolve inconsistencies between modules.

The ideal resolution would be to remove the 2-field version and make the
third field required.  I filed PR20506 to track that.

I changed `test/Bitcode/upgrade-global-ctors.ll` to a negative test and
duplicated the `llvm-link` check in `test/Linker/global_ctors.ll` to
check both upgrade directions.

Since I came across this as part of PR5680 (serializing use-list order),
I've also added the missing `verify-uselistorder` RUN line to
`test/Bitcode/metadata-2.ll`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215457 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 16:46:37 +00:00
Rafael Espindola
8b14a499f9 Make the test a bit more strict.
Before it would pass even if @b or @c ended up pointing to a variable named
@a123.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215450 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 15:55:27 +00:00
Rafael Espindola
f531e92671 Add a plugin testcase for merging weak variables.
I initially thought I could implement COMDATs with aliases by just
internalizing GVs instead of dropping them. This is a counter
example: Internalizing one of the @a would make @b and @c point
to different variables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215447 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 15:39:14 +00:00
NAKAMURA Takumi
87d7e906ce llvm/test/TableGen/*Foreach*.td: Remove XFAIL:vg_leak. They have not been failing since r215176.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215445 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 14:06:21 +00:00
Tim Northover
48bfab80aa llvm-objdump: print contents of MachO __unwind_info sections
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215437 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 11:52:59 +00:00
Gerolf Hoflehner
392f7d970c [MachineCombiner] Fix for ICE bug 20598
The combiner ignored DBG nodes when checking
the uses of a virtual register.

It combined a sequence like
   %vreg1 = madd %vreg2, %vreg3,...
   DBG_VALUE (%vreg1 ...)
   %vreg4 = add %vreg1,...
to
  %vreg4 = madd %vreg2, %vreg3

leaving behind a dangling DBG_VALUE with
a definition. This triggered an assertion
in the MachineTraceMetrics.cpp module.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 07:54:12 +00:00
Adrian Prantl
a6bd9bd30e DebugLocEntry: Restore the comparison predicate from before the
refactoring in 215384. This way it can unique multiple entries describing
the same piece even if they don't have the exact same location.
(The same piece may get merged in and be added from OpenRanges).
There ought to be a more elegant solution for this, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215418 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 01:07:53 +00:00
Reid Kleckner
23761603fe msan: Handle musttail calls
First, avoid calling setTailCall(false) on musttail calls.  The funciton
prototypes should be "congruent", so the shadow layout should be exactly
the same.

Second, avoid inserting instrumentation after a musttail call to
propagate the return value shadow.  We don't need to propagate the
result of a tail call, it should already be in the right place.

Reviewed By: eugenis

Differential Revision: http://reviews.llvm.org/D4331

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215415 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 00:12:43 +00:00
Michael J. Spencer
4935833df5 [x86] Fold extract_vector_elt of a load into the Load's address computation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215409 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 23:49:33 +00:00
David Majnemer
e8be18e8a3 InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b)
What follows bellow is a correctness proof of the transform using CVC3.

$ < t.cvc
A, B : BITVECTOR(32);

QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B);

$ cvc3 < t.cvc
Valid.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215400 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:32:02 +00:00
Tom Stellard
13f4476c55 R600/SI: Add a ComplexPattern for selecting MUBUF _OFFSET variant
This saves us from having to copy a 64-bit 0 value into VGPRs for
BUFFER_* instruction which only have a 12-bit immediate offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:17 +00:00
Tom Stellard
68e9ebbe44 R600/SI: Add check for low 32 bits of encoding to mubuf tests
There are no variable values like registers encoded in the low 32 bits of MUBUF
instructions, so it is relatively easy to check these bits, and it will
help prevent us from introducing encoding bugs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215397 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:11 +00:00
Tom Stellard
728d0e4218 R600/SI: Clear lds bit on MUBUF instructions used for private stores
This bit was left uninitialized, which was causing some random failures
of piglit tests.

NOTE: This is a candidate for the 3.5 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215396 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:09 +00:00
Tom Stellard
0df264a0fd R600/SI: Fix broken test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215395 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:05 +00:00
Quentin Colombet
7f4f923aa5 [AArch64] Fix registerAllocator assigns same register for base and wback in
pre/post-index load and store.

Patch by Steven Wu <stevenwu@apple.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 21:39:53 +00:00
Saleem Abdulrasool
6c2be4ff95 ARM: try harder to detect non-IT eligible instructions
For many Thumb-1 register register instructions, setting the CPSR is not
permitted inside an IT block.  We would not correctly flag those instructions.
The previous change to identify this scenario was insufficient as it did not
actually catch all the instances.  The current list is formed by manual
inspection of the ARMv6M ARM.

The change to the Thumb2 IT block test is due to the fact that the new more
stringent checking of the MIs results in the If Conversion pass being prevented
from executing (since not all the instructions in the BB are predicable).  This
results in code gen changes.

Thanks to Tim Northover for pointing out that the previous patch was
insufficient and hinting that the use of the v6M ARM would be much easier to use
than the v7 or v8!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215382 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 20:13:25 +00:00
Rafael Espindola
b2d2f28351 Fix using -plugin-opt=apiflie when also using -plugin-opt=emit-llvm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 19:06:54 +00:00
Sanjay Patel
7c0fa0cfab Correct a missing RUN line in the ARM codegen test for fneg ops. We should also explicitly specify +/-neonfp.
The bug was introduced at r99570 when use of "-arm-use-neon-fp" was removed.

Differential Revision: http://reviews.llvm.org/D4846



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215377 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 19:04:28 +00:00
Reid Kleckner
d7f37d823b Add missing test for r215031
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215374 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 18:34:54 +00:00
Reid Kleckner
b81e6cf6c5 MC: Diagnose an unexpected token in COFF .section instead of asserting
This can easily arise when trying to assemble and ELF style .section
directive for a COFF object file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215373 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 18:34:43 +00:00
Rafael Espindola
69ca26cc97 Fix use of uninitialized variable.
Fixes linking bitcode files that use the new style comdats for constructors
with ones that don't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 17:07:34 +00:00
Daniel Sanders
e20f611baf Revert r215359 - [mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives
It seems to cause an lld test (elf/Mips/hilo16-3.test) to fail. Reverted while we investigate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215361 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 16:10:19 +00:00
Daniel Sanders
36cf28e70e [mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives
Patch by Matheus Almeida and Toma Tabacu

Differential Revision: http://reviews.llvm.org/D4179


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215359 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 15:28:56 +00:00
Tim Northover
af713d3ba7 AArch64: add support for dynamic-loader relocations
LLD needs them, and it's good to be able to print them properly when
our object dumpers encounter them.

Patch by Daniel Stewart.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215352 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 10:10:27 +00:00
Tim Northover
f34a60c920 llvm-readobj: zero out timestamp in COFF auto-generated test files.
The timestamp meant these files changed with each invocation of
relocs.py, confusing matters when we add relocations and need to
update the tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215350 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 09:53:07 +00:00
Oliver Stannard
17ef00ea94 ARM: __gnu_h2f_ieee and __gnu_f2h_ieee always use the soft-float calling convention
By default, LLVM uses the "C" calling convention for all runtime
library functions. The half-precision FP conversion functions use the
soft-float calling convention, and are needed for some targets which
use the hard-float convention by default, so must have their calling
convention explicitly set.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215348 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 09:12:32 +00:00
Jiangning Liu
0679d2d0a4 In Machine CSE pass, the source register of a COPY machine instruction can
be propagated to all its users, and this propagation could increase the 
probability of finding common subexpressions. If the COPY has only one user,
the COPY itself can be removed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215344 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 05:17:19 +00:00
Jiangning Liu
1505fa4376 In LVI(Lazy Value Info), originally value on a BB can only be caculated once,
and the lattice will be updated to be a state other than "undefined". This
limiation could miss some opportunities of lowering "overdefined" to be an
even accurate value. So this patch ask the algorithm to try to lower the
lattice value again even if the value has been lowered to be "overdefined".



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215343 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 05:02:04 +00:00
Petar Jovanovic
97b0c63f6b Add support for scalarizing cttz_zero_undef
Follow up to r214266. Add missing case in ScalarizeVectorResult() for
cttz_zero_undef.

Differential Revision: http://reviews.llvm.org/D4813


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215330 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-10 22:49:54 +00:00
Saleem Abdulrasool
3e5734dc38 ARM: correct isPredicable for MULS in ThHUMB mode
The ARM ARM states that CPSR may not be updated by a MUL in thumb mode.  Due to
an ordering of Thumb 2 Size Reduction and If Conversion, we would end up
generating a THUMB MULS inside an IT block.

The If Conversion pass uses the TTI isPredicable method to ensure that it can
transform a Basic Block.  However, because we only check for IT handling on
Thumb2 functions, we may miss some cases.  Even then, it only validates that the
CPSR is not *live* rather than it is not accessed.  This corrects the handling
for that particular case since the same restriction does not hold on the vast
majority of the instructions.

This does prevent the IfConversion optimization from kicking in in certain
cases, but generating correct code is more valuable.  Addresses PR20555.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215328 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-10 22:20:37 +00:00
Joerg Sonnenberger
4417c25b03 @l and friends adjust their value depending the context used in.
For ori, they are unsigned, for addi, signed. Create a new target
expression type to handle this and evaluate Fixups accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215315 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-10 12:41:50 +00:00
Joerg Sonnenberger
06e8dbef25 Allow the third argument for the subi family to be an expression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215286 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 17:10:26 +00:00
Joerg Sonnenberger
0c5a408cf6 Update disassembler test to check the full dccci/iccci form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215283 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 14:01:10 +00:00
Joerg Sonnenberger
26adfd1ca4 Use the full form of dccci and iccci from the early PPC 405 documents,
since the operands are actually used on those cores. Provide aliases for
the only documented case in the newer Power ISA speec.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215282 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 13:58:31 +00:00
Tom Stellard
4e8a136db8 R600/SI: Custom lower CONCAT_VECTORS
This will lower them using register copies rather than loads and stores
to the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215270 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 01:06:56 +00:00
Tom Stellard
f1ba587963 R600/SI: Update concat_vectors.ll to check for scratch usage
These tests were using SI-NOT: MOVREL to make sure concat vectors
weren't being lowered to stack loads and stores, but we are using
scratch buffers for the stack now instead of registers, so we need
to add an additional SI-NOT check for scratch buffers.

With this change I was able to uncover one broken test which will
be fixed in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215269 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 01:06:53 +00:00
Joerg Sonnenberger
996a304351 Allow large immediates for branch instructions in 32bit mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215240 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 20:57:58 +00:00
Joerg Sonnenberger
f0b70e2fbc Provide an implementation of getNoopForMachoTarget for PPC, otherwise
empty functions will assert in the MC object writer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215238 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 19:13:23 +00:00
Juergen Ributzka
0980ef248e [FastISel][X86] Fix INC/DEC optimization (r215230)
I accidentally also used INC/DEC for unsigned arithmetic which doesn't work,
because INC/DEC don't set the required flag which is used for the overflow
check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215237 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 18:47:04 +00:00
Juergen Ributzka
cbda4b32c6 [FastISel][X86] Use INC/DEC when possible for {sadd|ssub}.with.overflow intrinsics.
This is a small peephole optimization to emit INC/DEC when possible.

Fixes <rdar://problem/17952308>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215230 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 17:21:37 +00:00
Joerg Sonnenberger
c9def6b938 Add support for SPE load/store from memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215220 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 16:43:49 +00:00
Rafael Espindola
d272237f0b pr20589: Fix duplicated arch flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215216 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 16:18:29 +00:00
Daniel Sanders
a19fc6deb8 [mips] Invert the abicalls feature bit to be noabicalls so that it's possible for -mno-abicalls to take effect.
Also added the testcase that should have been in r215194.

This behaviour has surprised me a few times now. The problem is that the
generated MipsSubtarget::ParseSubtargetFeatures() contains code like this:

   if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true;

so '-abicalls' means 'leave it at the default' and '+abicalls' means 'set it to
true'. In this case, (and the similar -modd-spreg case) I'd like the code to be

  IsABICalls = (Bits & Mips::FeatureABICalls) != 0;

or possibly:

   if ((Bits & Mips::FeatureABICalls) != 0)
     IsABICalls = true;
   else
     IsABICalls = false;

and preferably arrange for 'Bits & Mips::FeatureABICalls' to be true by default
(on some triples).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215211 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 15:47:17 +00:00
Josh Klontz
b79931d94e Add missing Interpreter intrinsic lowering for sin, cos and ceil
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215209 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 15:00:12 +00:00
Jiangning Liu
3b85f30319 [AArch64] Fix a type conversion bug for anlyzing compare.
The bug can cause spec2006/483.xalancbmk failure.

Patched by David Xu.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215206 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 14:19:29 +00:00
Daniel Sanders
1952807c9c [mips] Remove reason for XFAIL from a test that isn't actually XFAILed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215201 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:58:17 +00:00
James Molloy
414df79b80 [LoopVectorizer] Enable support for floating-point subtraction reductions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215200 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:41:08 +00:00
James Molloy
3a106a2813 [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:33:21 +00:00
Tim Northover
bbdf1e0432 AArch64: stop trying to take control of all UnknownArch triples.
This short-circuited our error reporting for incorrectly specified
target triples (you'd get AArch64 code instead).

Should fix PR20567.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215191 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 08:27:44 +00:00
Patrik Hagglund
cf403861a3 [pr19635] Revert most of r170537, and add new testcase.
Patch provided by Andrey Kuharev.

Sorry, r170537 was obviously wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215190 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 08:21:19 +00:00
David Majnemer
8e5c298a17 GlobalOpt: Optimize in the face of insertvalue/extractvalue
GlobalOpt didn't know how to simulate InsertValueInst or
ExtractValueInst.  Optimizing these is pretty straightforward.

N.B. This came up when looking at clang's IRGen for MS ABI member
pointers; they are represented as aggregates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215184 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 05:50:43 +00:00
NAKAMURA Takumi
2477ef9014 Fix llvm/test/DebugInfo/X86/recursive_inlining.ll to use %llc_dwarf.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215181 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 02:24:05 +00:00
Adam Nemet
a8e1cda622 [AVX512] Add zero-masking variant to AVX512_masking multiclass
This completes one item from the todo-list of r215125 "Generate masking
instruction variants with tablegen".

The AddedComplexity is needed just like for the k variant.

Added a codegen test based on valignq.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215173 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:53:38 +00:00
Adam Nemet
690499ed49 [AVX512] Add codegen test for the masking variant of valign
The AddedComplexity is needed just like in avx512_perm_3src.  There may be a
bug in the complexity computation...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:18:18 +00:00
Akira Hatanaka
43f6ce9289 [stack protector] Look through bitcasts to get global variable
__stack_chk_guard.

Handle the case where the pointer operand of the load instruction that loads the
stack guard is not a global variable but instead a bitcast.

%StackGuard = load i8** bitcast (i64** @__stack_chk_guard to i8**)
call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot)

Original test case provided by Ana Pazos.

This fixes PR20558.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215167 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:08:24 +00:00
Adrian Prantl
7f48f056f7 Make these regexes stricter by disallowing any additional characters in the output.
Thanks to dblaikie for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215166 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:04:07 +00:00
Arnold Schwaighofer
2158dec965 SLPVectorizer: Use the type of the value loaded/stored to get the ABI alignment
We were using the pointer type which is incorrect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215162 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:47:27 +00:00
Adrian Prantl
b93c57c0a9 Add a separate testcase for a DWARF expression describing a value in a
subregister.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:44:34 +00:00
Adrian Prantl
2ead89ae61 Reflow this comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215160 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:44:24 +00:00
David Blaikie
263998aa1d DebugInfo: Fix overwriting/loss of inlined arguments to recursively inlined functions.
Due to an unnecessary special case, inlined arguments that happened to
be from the same function as they were inlined into were misclassified
as non-inline arguments and would overwrite the non-inlined arguments.

Assert that we never overwrite a function's arguments, and stop
misclassifying inlined arguments as non-inline arguments to fix this
issue.

Excuse the rather crappy test case - handcrafted IR might do better, or
someone who understands better how to tickle the inliner to create a
recursive inlining situation like this (though it may also be necessary
to tickle the variable in a particular way to cause it to be recorded in
the MMI side table and go down this particular path for location
information).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215157 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:22:49 +00:00
Reed Kotler
cf76da912c fix materialization of one bit constants and global values which are accessed through
a base GOT entry.

Summary:
get tip of tree mips fast-isel to pass test-suite

Two bugs were fixed:

1) one bit booleans were treated as 1 bit signed integers and so the literal '1' could become sign extended.
2) mips uses got for pic but in certain cases, as with string constants for example, many items can be referenced from the same got entry and this case was not handled properly.

Test Plan: test-suite

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D4801

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215155 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:09:01 +00:00
Eric Christopher
aa5b9c0f6f Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215154 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:02:54 +00:00
Gerolf Hoflehner
e4fa341dde MachineCombiner Pass for selecting faster instruction sequence on AArch64
Re-commit of r214832,r21469 with a work-around that
avoids the previous problem with gcc build compilers

The work-around is to use SmallVector instead of ArrayRef
of basic blocks in preservesResourceLen()/MachineCombiner.cpp



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215151 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 21:40:58 +00:00
Owen Anderson
d4748bbd49 Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In
this case, the code path dealing with vector promotion was missing the explicit
checks for lifetime intrinsics that were present on the corresponding integer
promotion path.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215148 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 21:07:35 +00:00
Rafael Espindola
1f4ef379d2 Fix test failure on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215140 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 20:33:06 +00:00
Rafael Espindola
a7e87a12f9 Remove a few XFAILs.
These tests now pass with MCJIT.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215136 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 19:35:22 +00:00
Akira Hatanaka
70b56056a1 [Branch probability] Recompute branch weights of tail-merged basic blocks.
BranchFolderPass was not correctly setting the basic block branch weights when
tail-merging created or merged blocks. This patch recomutes the weights of
tail-merged blocks using the following formula:

branch_weight(merged block to successor j) =
sum(block_frequency(bb) * branch_probability(bb -> j))

bb is a block that is in the set of merged blocks.

<rdar://problem/16256423>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 19:30:13 +00:00
Joerg Sonnenberger
71c5eed711 Add the majority of the remaining SPE instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215131 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 18:52:39 +00:00
Justin Bogner
b7c9534f75 FileCheck: Add a flag to allow checking empty input
Currently FileCheck errors out on empty input. This is usually the
right thing to do, but makes testing things like "this command does
not emit some error message" hard to test. This usually leads to
people using "command 2>&1 | count 0" instead, and then the bots that
use guard malloc fail a few hours later.

By adding a flag to FileCheck that allows empty inputs, we can make
tests that consist entirely of "CHECK-NOT" lines feasible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215127 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 18:40:37 +00:00
Joerg Sonnenberger
39637ed35e Indent
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215126 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 18:05:32 +00:00
NAKAMURA Takumi
94287f1561 llvm/test/tools/llvm-objdump: Reorganize target-dependent some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215122 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 17:17:19 +00:00
Rafael Espindola
875710a2fd Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215111 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 14:21:18 +00:00
Joerg Sonnenberger
7ad7c75048 Add mfasr and mtasr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215110 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 13:35:34 +00:00
Joerg Sonnenberger
d94b6e8895 Add mfrtcu and mfrtcl instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215109 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 13:16:58 +00:00
Joerg Sonnenberger
5b1fba4a83 Support mttbl and mttbu mnemonic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215108 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 13:06:23 +00:00
Joerg Sonnenberger
80de56ebde Add RFID instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215105 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 12:39:59 +00:00
Joerg Sonnenberger
445a9f9729 Add first bunch of SPE instructions. As they overlap with Altivec, mark
them as parser-only until the disassembler is extended to handle
predicates properly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215102 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 12:18:21 +00:00
Daniel Sanders
6cc2e1a132 [mips] Add assembler support for .set msa/nomsa directive.
Summary:
These directives are used to toggle whether the assembler accepts MSA-specific instructions or not.

Patch by Matheus Almeida and Toma Tabacu.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4783


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215099 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 12:03:36 +00:00
Chandler Carruth
0e89fbb120 [x86] Fix another miscompile found through fuzz testing the new vector
shuffle lowering.

This is closely related to the previous one. Here we failed to use the
source offset when swapping in the other case -- where we end up
swapping the *final* shuffle. The cause of this bug is a bit different:
I simply wasn't thinking about the fact that this mask is actually
a slice of a wide mask and thus has numbers that need SourceOffset
applied. Simple fix. Would be even more simple with an algorithm-y thing
to use here, but correctness first. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215095 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 10:37:35 +00:00
Chandler Carruth
0651861b7b [x86] Fix another miscompile in the new vector shuffle lowering found
via the fuzz tester.

Here I missed an offset when round-tripping a value through a shuffle
mask. I got it right 2 lines below. See a problem? I do. ;] I'll
probably be adding a little "swap" algorithm which accepts a range and
two values and swaps those values where they occur in the range. Don't
really have a name for it, let me know if you do.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215094 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 10:14:27 +00:00
Chandler Carruth
b3364512fc [x86] Fix another miscompile in the new vector shuffle lowering found
through the new fuzzer.

This one is great: bad operator precedence led the modulus to happen at
the wrong point. All the asserts didn't fire because there were usually
the right values past the end of the 4 element region we were looking
at. Probably could have gotten a crash here with ASan + fuzzing, but the
correctness tests pinpointed this really nicely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215092 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 09:45:02 +00:00
Pavel Chupin
5d8c984e54 [x32] Use ebp/esp as frame and stack pointer
Summary:
Since pointers are 32-bit on x32 we can use ebp and esp as frame and stack
pointer. Some operations like PUSH/POP and CFI_INSTRUCTION still
require 64-bit register, so using 64-bit MachineFramePtr where required.

X86_64 NaCl uses 64-bit frame/stack pointers, however it's been found that
both isTarget64BitLP64 and isTarget64BitILP32 are true for NaCl. Addressing
this issue here as well by making isTarget64BitLP64 false.

Also mark hasReservedSpillSlot unreachable on X86. See inlined comments.

Test Plan: Add one new simple test and upgrade 2 existing with x32 target case.

Reviewers: nadav, dschuff

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D4617

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215091 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 09:41:19 +00:00
Chandler Carruth
15d82b7d33 [x86] Fix a miscompile in the new shuffle lowering found through the new
fuzz testing.

The function which tested for adjacency did what it said on the tin, but
when I called it, I wanted it to do something more thorough: I wanted to
know if the *pairs* of shuffle elements were adjacent and started at
0 mod 2. In one place I had the decency to try to test for this, but in
the other it was completely skipped, miscompiling this test case. Fix
this by making the helper actually do what I wanted it to do everywhere
I called it (and removing the now redundant code in one place).

I *really* dislike the name "canWidenShuffleElements" for this
predicate. If anyone can come up with a better name, please let me know.
The other name I thought about was "canWidenShuffleMask" but is it
really widening the mask to reduce the number of lanes shuffled? I don't
know. Naming things is hard.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215089 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 08:11:31 +00:00
Pete Cooper
ecdbbbefea Update BitRecTy::convertValue to allow if expressions with bit values on both sides of the if
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215087 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 05:47:10 +00:00
Pete Cooper
64a334da36 Change the { } expression in tablegen to accept sized binary literals which are not just 0 and 1.
It also allows nested { } expressions, as now that they are sized, we can merge pull bits from the nested value.

In the current behaviour, everything in { } must have been convertible to a single bit.
However, now that binary literals are sized, its useful to be able to initialize a range of bits.

So, for example, its now possible to do

bits<8> x = { 0, 1, { 0b1001 }, 0, 0b0 }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215086 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 05:47:07 +00:00
Pete Cooper
2093e2cb43 Change BitsInit to inherit from TypedInit.
This is useful in a later patch where binary literals such as 0b000 will become BitsInit values instead of IntInit values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215085 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 05:47:04 +00:00
Pete Cooper
42c1227fd9 Change TableGen so that binary literals such as 0b001 are now sized.
Instead of these becoming an integer literal internally, they now become bits<n> values.

Prior to this change, 0b001 was 1 bit long.  This is confusing as clearly the user gave 3 bits.
This new type holds both the literal value and the size, and so can ensure sizes match on initializers.

For example, this used to be legal

bits<1> x = 0b00;

but now it must be written as

bits<2> x = 0b00;

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215084 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 05:47:00 +00:00
Pete Cooper
f5b7351124 TableGen: Change { } to only accept bits<n> entries when n == 1.
Prior to this change, it was legal to do something like

  bits<2> opc = { 0, 1 };
  bits<2> opc2 = { 1, 0 };
  bits<2> a = { opc, opc2 };

This involved silently dropping bits from opc and opc2 which is very hard to debug.

Now the above test would be an error.  Having tested with an assert, none of LLVM/clang was relying on this behaviour.

Thanks to Adam Nemet for the above test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215083 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 05:46:57 +00:00
Justin Bogner
971bf283f8 DebugInfo: Make a test more portable
mach-o doesn't like sections without segments, and elf is perfectly
happy with commas in section names, so use a Darwin-like section name.

Suggestion by Eric Christopher.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215052 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 03:47:28 +00:00
Kevin Enderby
75d423feed Add the -mcpu= option to llvm-objdump for use with the disassemblers.
Also make the disassembler created with the Mach-O parser (the -m option)
pick up the Target specific attributes specified with -mattr option.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215032 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 23:24:41 +00:00
Reid Kleckner
41d6599bb1 MC X86: Accept ".att_syntax prefix" and diagnose noprefix
Fixes PR18916.  I don't think we need to implement support for either
hybrid syntax.  Nobody should write Intel assembly with '%' prefixes on
their registers or AT&T assembly without them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215031 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 23:21:13 +00:00
Sanjay Patel
b9736caa6a Fix a test that has no checks.
X86 doesn't have fneg, so check for xor.

Differential Revision: http://reviews.llvm.org/D4812


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214992 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 20:45:30 +00:00
Matt Arsenault
60178b180f R600: Cleanup fadd and fsub tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214991 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 20:27:55 +00:00
Rui Ueyama
2764f3ded3 Revert "r214897 - Remove dead zero store to calloc initialized memory"
It broke msan.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214989 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 19:30:38 +00:00
David Blaikie
e7a280ac86 DebugInfo: Fix ranges+gmlt test case to actually exercise the gmlt situation.
Originally this test case tested the specified behavior (that -gmlt
would not produce DW_AT_ranges and that when no CU DW_AT_ranges were
produced, no debug_ranges section (not even an empty list) would be
produced) but then the ranges emission code was improved not to create
ranges of a single element (instead favoring high_pc/low_pc) and so this
test case no longer exercised the -gmlt portion of the behavior.

This caused me some confusion when reading the comments and trying to
update this test case for future changes to -gmlt. I've made this test
resilient to those changes (by using the {{DW_TAG|NULL}} pattern to
block the end of the attribute search at the end of the CU's attribute
list without mandating that it must (or must not) be followed by another
tag (the future changes to -gmlt should produce no subprograms in this
CU))

Fix the test case to have two functions in distinct sections to force
the use of DW_AT_ranges.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 18:24:19 +00:00
Reid Kleckner
7911f2db78 Add a triple to this test to get the right IR mangling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214982 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 18:09:15 +00:00
Reid Kleckner
5d04e520c0 Don't count inreg params when mangling fastcall functions
This is consistent with MSVC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214981 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 18:09:04 +00:00
Reid Kleckner
9688239469 Round up the size of byval arguments to MinAlign
Otherwise we can end up with an argument frame size that is not a
multiple of stack slot size, which is very awkward.

This fixes PR20547, which was a bug in x86_64 Sys V vararg handling.
However, it's much easier to test this with x86 callee-cleanup
functions, which previously ended in "retl $6" instead of "retl $8".

This does affect behavior of all backends, but it presumably fixes the
same bug in all of them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214980 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 17:57:23 +00:00
Chad Rosier
4e78deb350 Add test case omitted in r214974.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214975 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 16:06:41 +00:00
Robert Khasanov
ec4188bad7 [AVX512] Added load/store instructions to Register2Memory opcode tables.
Added lowering tests for load/store.

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214972 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 15:40:34 +00:00
James Molloy
e0243fb42d [AArch64] Add a testcase for r214957.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214965 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 13:31:32 +00:00
Tim Northover
2c0d42ac9a ARM: do not generate BLX instructions on Cortex-M CPUs.
Particularly on MachO, we were generating "blx _dest" instructions on M-class
CPUs, which don't actually exist. They happen to get fixed up by the linker
into valid "bl _dest" instructions (which is why such a massive issue has
remained largely undetected), but we shouldn't rely on that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214959 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 11:13:14 +00:00
Tim Northover
08828a979a ARM-MachO: materialize callee address correctly on v4t.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214958 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 11:13:06 +00:00
Chandler Carruth
a341a8070a [x86] Fix two independent miscompiles in the process of getting the same
test case to actually generate correct code.

The primary miscompile fixed here is that we weren't correctly handling
in-place elements in one half of a single-input v8i16 shuffle when
moving a dword of elements from that half to the other half. Some times,
we would clobber the in-place elements in forming the dword to move
across halves.

The fix to this involves forcibly marking the in-place inputs even when
there is no need to gather them into a dword, and to much more carefully
re-arrange the elements when grouping them into a dword to move across
halves. With these two changes we would generate correct shuffles for
the test case, but found another miscompile. There are also some random
perturbations of the generated shuffle pattern in SSE2. It looks like
a wash; more instructions in some cases fewer in others.

The second miscompile would corrupt the results into nonsense. This is
a buggy pattern in one of the added DAG combines. Mapping elements
through a PSHUFD when pairing redundant half-shuffles is *much* harder
than this code makes it out to be -- it requires reasoning about *all*
of where the input is used in the PSHUFD, not just one part of where it
is used. Plus, we can't combine a half shuffle *into* a PSHUFD but the
code didn't guard against it. I think this was just a bad idea and I've
just removed that aspect of the combine. No tests regress as
a consequence so seems OK.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214954 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 10:16:36 +00:00
Adam Nemet
2b9b50379b [X86] Fixes commit r214890 to match the posted patch
This was another fallout from my local rebase where something went wrong :(

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 07:13:12 +00:00
Peter Collingbourne
95d1d442c9 [dfsan] Try not to create too many additional basic blocks in functions which
already have a large number of blocks. Works around a performance issue with
the greedy register allocator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 00:33:40 +00:00
Matt Arsenault
85dc7da6f3 R600: Increase nearby load scheduling threshold.
This partially fixes weird looking load scheduling
in memcpy test. The load clustering doesn't seem
particularly smart, but this method seems to be partially
deprecated so it might not be worth trying to fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214943 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 00:29:49 +00:00
Matt Arsenault
c9c70b1651 R600/SI: Implement areLoadsFromSameBasePtr
This currently has a noticable effect on the kernel argument loads.
LDS and global loads are more problematic, I think because of how copies
are currently inserted to ensure that the address is a VGPR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214942 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 00:29:43 +00:00
David Blaikie
8480beefd0 DebugInfo: Assert that any CU for which debug_loc lists are emitted, has at least one range.
This was coming in weird debug info that had variables (and hence
debug_locs) but was in GMLT mode (because it was missing the 13th field
of the compile_unit metadata) so no ranges were constructed. We should
always have at least one range for any CU with a debug_loc in it -
because the range should cover the debug_loc.

The assertion just ensures that the "!= 1" range case inside the
subsequent loop doesn't get entered for the case where there are no
ranges at all, which should never reach here in the first place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214939 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 00:21:25 +00:00
David Blaikie
406658c31b DebugInfo: Fix a bunch of tests that, owing to their compile_unit metadata not including a 13th field, had some subtle behavior.
Without the 13th field, the "emission kind" field defaults to 0 (which
is not equal to either of the values of the emission kind enum (1 ==
full debug info, 2 == line tables only)).

In this particular instance, the comparison with "FullDebugInfo" was
done when adding elements to the ranges list - so for these test cases
no values were added to the ranges list.

This got weirder when emitting debug_loc entries as the addresses should
be relative to the range of the CU if the CU has only one range (the
reasonable assumption is that if we're emitting debug_loc lists for a CU
that CU has at least one range - but due to the above situation, it has
zero) so the ranges were emitted relative to the start of the section
rather than relative to the start of the CU's singular range.

Fix these tests by accounting for the difference in the description of
debug_loc entries (in some cases making the test ignorant to these
differences, in others adding the extra label difference expression,
etc) or the presence/absence of high/low_pc on the CU, and add the 13th
field to their CUs to enable proper "full debug info" emission here.

In a future commit I'll fix up a bunch of other test cases that are not
so rigorously depending on this behavior, but still doing similarly
weird things due to the missing 13th field.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214937 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 23:57:31 +00:00
Jonathan Roelofs
b23c2d9b2c Re-apply r214881: Fix return sequence on armv4 thumb
This reverts r214893, re-applying r214881 with the test case relaxed a bit to
satiate the build bots.

POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214928 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 21:32:21 +00:00
Bill Schmidt
bb639a1f96 [PowerPC] Swap arguments and adjust shift count for vsldoi on little endian
Commits r213915 and r214718 fix recognition of shuffle masks for vmrg*
and vpku*um instructions for a little-endian target, by swapping the
input arguments.  The vsldoi instruction requires similar treatment,
and also needs its shift count adjusted for little endian.

Reviewed by Ulrich Weigand.

This is a bug fix candidate for release 3.5 (and hopefully the last of
those for PowerPC).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214923 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 20:47:25 +00:00
Sanjay Patel
998edc6187 Improved test cases that were added with r214892.
1. Added ':' to CHECK-LABELs
2. Added more CHECKs
3. Added CHECK-NEXTs
4. Added verbose hex immediate comments to CHECKs



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214921 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 20:16:35 +00:00
Rafael Espindola
0ca286752e Don't internalize all but main by default.
This is mostly a cleanup, but it changes a fairly old behavior.

Every "real" LTO user was already disabling the silly internalize pass
and creating the internalize pass itself. The difference with this
patch is for "opt -std-link-opts" and the C api.

Now to get a usable behavior out of opt one doesn't need the funny
looking command line:

opt -internalize -disable-internalize -internalize-public-api-list=foo,bar -std-link-opts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214919 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 20:10:38 +00:00
Rafael Espindola
936a1b573b Add a test showing the interaction of linker scripts and plugin.
In particular, the linker script is processed early enough for function g
to be internalized.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214916 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 19:56:53 +00:00
Chandler Carruth
fadc91beec [x86] Fix a crasher due to shuffles which cancel each other out and add
a test case.

We also miscompile this test case which is showing a serious flaw in the
single-input v8i16 shuffle code. I've left the specific instruction
checks FIXME-ed out until I can address the bug in the single-input
code, but I wanted to separate out a significant functionality change to
produce correct code from a very simple and targeted crasher fix.

The miscompile problem stems from keeping track of inputs by value
rather than by index. As a consequence of doing this, we can't reliably
update those inputs because they might swap and we can't detect this
without copying the mask.

The blend code now uses indices for the input lists and this seems
strictly better. It also should make it easier to sort things and do
other cleanups. I think the time has come to simplify The Great Lambda
here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214914 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 18:45:49 +00:00
Philip Reames
b835f3446f Remove dead zero store to calloc initialized memory
Optimize the following IR:

%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4

Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store.  If the store is to an out of bounds address, it is undefined and thus also removable.

Reviewed By: nicholas

Differential Revision: http://reviews.llvm.org/D3942




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:48:20 +00:00
Jonathan Roelofs
c2feb6bb3a Revert r214881 because it broke lots of build-bots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214893 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:36:05 +00:00
Sanjay Patel
6902c687b0 Optimize vector fabs of bitcasted constant integer values.
Allow vector fabs operations on bitcasted constant integer values to be optimized
in the same way that we already optimize scalar fabs.

So for code like this:
%bitcast = bitcast i64 18446744069414584320 to <2 x float> ; 0xFFFF_FFFF_0000_0000
%fabs = call <2 x float> @llvm.fabs.v2f32(<2 x float> %bitcast)
%ret = bitcast <2 x float> %fabs to i64

Instead of generating something like this:

movabsq (constant pool loadi of mask for sign bits)
vmovq   (move from integer register to vector/fp register)
vandps  (mask off sign bits)
vmovq   (move vector/fp register back to integer return register)

We should generate:

mov     (put constant value in return register)

I have also removed a redundant clause in the first 'if' statement:
N0.getOperand(0).getValueType().isInteger()

is the same thing as:
IntVT.isInteger()

Testcases for x86 and ARM added to existing files that deal with vector fabs.
One existing testcase for x86 removed because it is no longer ideal.

For more background, please see:
http://reviews.llvm.org/D4770

And:
http://llvm.org/bugs/show_bug.cgi?id=20354

Differential Revision: http://reviews.llvm.org/D4785


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214892 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:35:22 +00:00
Adam Nemet
545f89213d [AVX512] Add masking variant and intrinsics for valignd/q
This is similar to what I did with the two-source permutation recently.  (It's
almost too similar so that we should consider generating the masking variants
with some tablegen help.)

Both encoding and intrinsic tests are added as well.  For the latter, this is
what the IR that the intrinsic test on the clang side generates.

Part of <rdar://problem/17688758>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:23:04 +00:00
Jonathan Roelofs
77327b8520 Fix return sequence on armv4 thumb
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214881 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:13:17 +00:00
David Blaikie
effcb72a40 Improve test for merged global debug info by using llvm-dwarfdump.
It's a bit of a tradeoff, since llvm-dwarfdump doesn't print the name of
the global symbol being used as an address in the addressing mode, but
this avoids the dependence on hardcoded set labels that keep changing
(5+ commits over the last few years that each update the set label as it
changes due to other, unrelated differences in output). This could've,
instead, been changed to match the set name then match the name in the
string pool but that would present other issues (needing to skip over
the sets that weren't of interest, etc) and checking that the addresses
(granted, without relocations applied - so it's not the whole story)
match in the two variable location descriptions seems sufficient and
fairly stable here.

There are a few similar other tests with similar label dependence that
I'll update soonish.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214878 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 16:20:25 +00:00
Joerg Sonnenberger
2888b08b44 Add accessors for the PPC 403 bank registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214875 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 15:45:15 +00:00
Renato Golin
84bd563975 Add tests for cp10/cp11 on ARMv5/6
Tests for ARMv7/8 are already on diagnostics.s

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214872 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 15:29:41 +00:00
Keith Walker
f2115a0b0d Specify that the thumb setend and blx <immed> instructions are not valid on an m-class target
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 15:11:59 +00:00
Keith Walker
966fc9344b Define stc2/stc2l/ldc2/ldc2l as thumb2 instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:58:05 +00:00
Joerg Sonnenberger
bb97134ffe Accessors for SSR2 and SSR3 on PPC 403.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:53:05 +00:00
Tom Stellard
94dfb8818d R600/SI: Update MUBUF assembly string to match AMD proprietary compiler
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214866 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:48:12 +00:00
Tom Stellard
9a7e35aecc R600/SI: Avoid generating REGISTER_LOAD instructions.
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214865 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:40:52 +00:00
Joerg Sonnenberger
deaa09e169 Add dci/ici instructions for PPC 476 and friends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214864 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:40:32 +00:00
Joerg Sonnenberger
2bdd960ae3 Add mftblo and mftbhi for PPC 4xx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214863 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:18:16 +00:00
Joerg Sonnenberger
0f365741f3 Add lswi / stswi for assembler use with a warning to not add patterns
for them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214862 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 13:34:01 +00:00
Yi Kong
68b3d680f2 AArch64: Add support for instruction prefetch intrinsic
Instruction prefetch is not implemented for AArch64, it is incorrectly
translated into data prefetch instruction.

Differential Revision: http://reviews.llvm.org/D4777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:46:47 +00:00
James Molloy
72035e9a8e Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:30:34 +00:00
Joerg Sonnenberger
c754b579ae Allow binary and for tblgen math.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214851 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 09:43:25 +00:00
Chandler Carruth
e6329cf303 [x86] Fix a crash and wrong-code bug in the new vector lowering all
found by a single test reduced out of a failure on llvm-stress.

The start of the problem (and the crash) came when we tried to use
a find of a non-used slot in the move-to half of the move-mask as the
target for two bad-half inputs. While if lucky this will be the first of
a pair of slots which we can place the bad-half inputs into, it isn't
actually guaranteed. This really isn't surprising, not sure what I was
thinking. The correct way to find the two unused slots is to look for
one of the *used* slots. We know it isn't that pair, and we can use some
modular arithmetic to find the other pair by masking off the odd bit and
adding 2 modulo 4. With this, we reliably found a viable pair of slots
for the bad-half inputs.

Sadly, that wasn't enough. We also had a wrong code bug that surfaced
when I reduced the test case for this where we would use the same slot
twice for the two bad inputs. This is because both of the bad inputs
could be in odd slots originally and thus the mod-2 mapping would
actually be the same. The whole point of the weird indexing into the
pair of empty slots was to try to leverage when the end result needed
the two bad-half inputs to be paired in a dword and pre-pair them in the
correct orrientation. This is less important with the powerful combining
we're now doing, and also easier and more reliable to achieve be noting
that we add the bad-half inputs in order. Thus, if they are in a dword
pair, the low part of that will be the first input in the sequence.
Always putting that in the low element will just do the right thing in
addition to computing the correct result.

Test case added. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214849 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 08:19:21 +00:00
Juergen Ributzka
eee659a076 [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:48 +00:00
Kevin Qin
6739735812 Revert "r214832 - MachineCombiner Pass for selecting faster instruction"
It broke compiling of most Benchmark and internal test, as clang got
clashed by segmentation fault or assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:47 +00:00
Juergen Ributzka
7e9c0bc511 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:44 +00:00
Gerolf Hoflehner
c2328d552c MachineCombiner Pass for selecting faster instruction
sequence on AArch64

Re-commit of r214669 without changes to test cases
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and
LLVM:: CodeGen/AArch64/dp-3source.ll
This resolves the reported compfails of the original commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 01:16:13 +00:00
Joerg Sonnenberger
e2f9c8d663 Add TCR register access
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:53:42 +00:00
Joerg Sonnenberger
7c5b978254 Add PPC 603's tlbld and tlbli instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214825 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:49:45 +00:00
Bill Schmidt
84fef1f55d [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for Clang correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214800 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:21:01 +00:00
Kevin Enderby
2a78a0cd47 Enable Darwin vararg parameters support in assembler macros.
Duplicate the vararg tests for linux and add a tests which mixed
vararg arguments with darwin positional parameters.

Patch by: Janne Grunau <j@jannau.net>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:14:37 +00:00
Joerg Sonnenberger
db3ce56a58 Add simplified aliases for access to DCCR, ICCR, DEAR and ESR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 22:56:42 +00:00
Juergen Ributzka
2c68cde701 [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:49:51 +00:00
Chandler Carruth
8a74e2fc07 [SDAG] Fix a really, really terrible bug in the DAG combiner.
This code is completely wrong. It is also dead, as if it were to *ever*
run, it would crash. Fortunately, after my work to the combiner, it is
at least *possible* to reach the code, and llvm-stress has found a test
case. Thanks to Patrick for reporting.

It would be really good if anyone who remembers how this code works and
what it was intended to do could add some more obvious test coverage
instead of my completely contrived and reduced test case. My test case
was so brittle I left a bread crumb comment in it to help the next
person to stumble on it and not know what it was actually testing for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214785 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:29:59 +00:00
Joerg Sonnenberger
25c8b4774b tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:28:22 +00:00
Chad Rosier
82c93451f3 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:20:25 +00:00
Joerg Sonnenberger
a770bd6cae MC uses .lcomm now, so adjust.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:06:00 +00:00
Reid Kleckner
0b04cf6e71 Fix failure to invoke exception handler on Win64
When the last instruction prior to a function epilogue is a call, we
need to emit a nop so that the return address is not in the epilogue IP
range.  This is consistent with MSVC's behavior, and may be a workaround
for a bug in the Win64 unwinder.

Differential Revision: http://reviews.llvm.org/D4751

Patch by Vadim Chugunov!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214775 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:05:27 +00:00
Joerg Sonnenberger
355845b437 Recognize mftbl as alias for mftb, for symmetry with mttb.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214769 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 20:28:34 +00:00
Joerg Sonnenberger
e8f23a8a5d Allow .lcomm with alignment on ELF targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214754 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 18:45:10 +00:00
Joerg Sonnenberger
df64464ad2 Add support for m[ft][di]bat[ul] instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214731 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 17:07:41 +00:00
Joerg Sonnenberger
977b978f93 Add features for PPC 4xx and e500/e500mc instructions.
Move the test cases for them into separate files.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214724 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 15:47:38 +00:00
Ulrich Weigand
29ec7479a1 [PowerPC] Add target triple to vec_urem_const.ll test case
This should hopefully fix build bots on other architectures.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214721 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 14:55:26 +00:00
Robert Khasanov
7017934668 [SKX] Enabling load/store instructions: encoding
Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32,VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32,VMOVDQU64, VMOVUPD, VMOVUPS,

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214719 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 14:35:15 +00:00
Ulrich Weigand
c568629589 [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian
In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl*
by swapping the input arguments.  As it turns out, the exact same fix
is also required for the vpkuhum/vpkuwum patterns.

This fixes another regression in llvmpipe when vector support is
enabled.

Reviewed by Bill Schmidt.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214718 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 13:53:40 +00:00
Ulrich Weigand
d5e9497c88 [PowerPC] MULHU/MULHS are not legal for vector types
I ran into some test failures where common code changed vector division
by constant into a multiply-high operation (MULHU).  But these are not
implemented by the back-end, so we failed to recognize the insn.

Fixed by marking MULHU/MULHS as Expand for vector types.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 13:27:12 +00:00
Ulrich Weigand
3b7a193521 [PowerPC] Fix and improve vector comparisons
This patch refactors code generation of vector comparisons.

This fixes a wrong code-gen bug for ISD::SETGE for floating-point types,
and improves generated code for vector comparisons in general.

Specifically, the patch moves all logic deciding how to implement vector
comparisons into getVCmpInst, which gets two extra boolean outputs
indicating to its caller whether its needs to swap the input operands
and/or negate the result of the comparison.  Apart from implementing
these two modifications as directed by getVCmpInst, there is no need
to ever implement vector comparisons in any other manner; in particular,
there is never a need to perform two separate comparisons (e.g. one for
equal and one for greater-than, as code used to do before this patch).

Reviewed by Bill Schmidt.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214714 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 13:13:57 +00:00
Daniel Sanders
98b419bff7 [mips] Add assembler support for '.set mipsX'.
Summary:
This patch also fixes an issue with the way the Mips assembler enables/disables architecture
features. Before this patch, the assembler never disabled feature bits. For example,
.set mips64
.set mips32r2

would result in the 'OR' of mips64 with mips32r2 feature bits which isn't right.
Unfortunately this isn't trivial to fix because there's not an easy way to clear
feature bits as the algorithm in MCSubtargetInfo (ToggleFeature) only clears the bits
that imply the feature being cleared and not the implied bits by the feature (there's a
better explanation to the code I added).

Patch by Matheus Almeida and updated by Toma Tabacu

Reviewers: vmedic, matheusalmeida, dsanders

Reviewed By: dsanders

Subscribers: tomatabacu, llvm-commits

Differential Revision: http://reviews.llvm.org/D4123


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214709 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 12:20:00 +00:00
Chandler Carruth
48593d7934 [x86] Just unilaterally prefer SSSE3-style PSHUFB lowerings over clever
use of PACKUS. It's cleaner that way.

I looked at implementing clever combine-based folding of PACKUS chains
into PSHUFB but it is quite hard and doesn't seem likely to be worth it.
The most annoying part would be detecting that the correct masking had
been done to use PACKUS-style instructions as a blend operation rather
than there being any saturating as is indicated by its name. We generate
really nice code for what few test cases I've come up with that aren't
completely contrived for this by just directly prefering PSHUFB and so
let's go with that strategy for now. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214707 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 10:17:35 +00:00
Chandler Carruth
93f5d9f093 [x86] Implement more aggressive use of PACKUS chains for lowering common
patterns of v16i8 shuffles.

This implements one of the more important FIXMEs for the SSE2 support in
the new shuffle lowering. We now generate the optimal shuffle sequence
for truncate-derived shuffles which show up essentially everywhere.

Unfortunately, this exposes a weakness in other parts of the shuffle
logic -- we can no longer form PSHUFB here. I'll add the necessary
support for that and other things in a subsequent commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214702 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 09:40:02 +00:00
Kevin Qin
534100b31e Revert "r214669 - MachineCombiner Pass for selecting faster instruction"
This commit broke "make check" for several hours, so get it reverted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214697 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 05:10:33 +00:00
Chandler Carruth
73100d8f33 [x86] Handle single input shuffles in the SSSE3 case more intelligently.
I spent some time looking into a better or more principled way to handle
this. For example, by detecting arbitrary "unneeded" ORs... But really,
there wasn't any point. We just shouldn't build blatantly wrong code so
late in the pipeline rather than adding more stages and logic later on
to fix it. Avoiding this is just too simple.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 01:14:24 +00:00
Chandler Carruth
f3ef961a94 [x86] Fix the test case added in r214670 and tweaked in r214674 further.
Fundamentally, there isn't a really portable way to test the constant
pool contents. Instead, pin this test to the bare-metal triple. This
also makes it a 64-bit triple which allows us to only match a single
constant pool rather than two. It can also just hard code the '.' prefix
as the format should be stable now that it has a fixed triple. Finally,
I've switched it to use CHECK-NEXT to be more precise in the instruction
sequence expected and to use variables rather than hard coding decisions
by the register allocator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214679 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 00:54:28 +00:00