Commit Graph

13202 Commits

Author SHA1 Message Date
Daniel Jasper
bf2e6a6be2 Change test to accept an additional critical edge split.
The two hot blocks are right next to each other and I verified that
there is no performance regression by compressing/uncompressing some
files with a minigzip built with the different options.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 12:45:45 +00:00
John Brawn
0328ca6cd7 [ARM] Align stack objects passed to memory intrinsics
Memcpy, and other memory intrinsics, typically tries to use LDM/STM if
the source and target addresses are 4-byte aligned. In CodeGenPrepare
look for calls to memory intrinsics and, if the object is on the
stack, 4-byte align it if it's large enough that we expect that memcpy
would want to use LDM/STM to copy it.

Differential Revision: http://reviews.llvm.org/D7908


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 12:01:59 +00:00
John Brawn
bf60cd0751 Add missing newline to end of test file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232626 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 10:45:12 +00:00
Josh Magee
cbaefea0c0 Add testcases for BEXTR.
These BEXTR cases are a check for the 64-bit load form and two negative cases where the bitrange is non-contiguous.  From a private patch equivalent to r189742/PR17028.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232580 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 01:34:06 +00:00
Krzysztof Parzyszek
dbe964d3a6 Missed testcase for r232577
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232578 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 00:44:46 +00:00
David Majnemer
8f01b96d93 DAGCombiner: fold (xor (shl 1, x), -1) -> (rotl ~1, x)
Targets which provide a rotate make it possible to replace a sequence of
(XOR (SHL 1, x), -1) with (ROTL ~1, x).  This saves an instruction on
architectures like X86 and POWER(64).

Differential Revision: http://reviews.llvm.org/D8350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232572 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 00:03:36 +00:00
David Majnemer
7605cdd6e4 COFF: Let globals with private linkage reside in their own section
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.

Differential Revision: http://reviews.llvm.org/D8394

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232570 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 23:54:51 +00:00
Pirama Arumuga Nainar
5e15d64948 Fix bug while building FP16 constant vectors for AArch64
Summary: Building FP16 constant vectors caused the FP16 data to be bitcast to i64.  This patch creates a BITCAST node with the correct value, and adds a test to verify correct handling.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: mcrosier, jmolloy, ab, srhines, llvm-commits, rengolin, aemerson

Differential Revision: http://reviews.llvm.org/D8369

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232562 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 23:10:29 +00:00
David Majnemer
76d3a99d10 Revert "COFF: Let globals with private linkage reside in their own section"
This reverts commit r232539.  This was committed accidently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 20:41:11 +00:00
David Majnemer
6526150f82 COFF: Let globals with private linkage reside in their own section
Summary:
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8374

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 20:39:25 +00:00
Richard Barton
b59aee170f [ARM] Fix offset calculation in ARMBaseRegisterInfo::needsFrameBaseReg
The input offset to needsFrameBaseReg is a negative value below the top of the
stack frame, but when converting to a positive offset from the bottom of the
stack frame this value was negated, causing the final offset to be too large
by twice the input offset's magnitude. Fix that by not negating the offset.

Patch by John Brawn

Differential Revision: http://reviews.llvm.org/D8316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 18:20:47 +00:00
Samuel Antao
7684e7d987 Fix R0 use in PowerPC VSX store for FastIsel.
The VSX stores are sometimes generated with a undefined index register, causing %noreg to be used and R0 to be emitted later on. The semantics of the VSX store (e.g. stdsdx) requires R0 to be used as base if we want zero to be used in the computation of the effective address instead of the content of R0. This patch checks if no index register was generated and forces R0 to be used as base address.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 15:00:57 +00:00
Rafael Espindola
cebed4aaf1 Use createTempSymbol to avoid collisions instead of an ad hoc method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 14:50:32 +00:00
Rafael Espindola
99739705ac Call EmitFunctionHeader just before EmitFunctionBody.
This avoids switching to .AMDGPU.config and back and hardcoding the
section it switches back to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232479 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 14:34:42 +00:00
Rafael Espindola
a480f88b3c Move the EH symbol to the asm printer and use it for the SJLJ case too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232475 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 13:57:48 +00:00
Rafael Espindola
4d3df54336 Replace a use of GetTempSymbol with createTempSymbol.
This is cleaner and avoids a crash in a corner case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232471 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 12:54:04 +00:00
Renato Golin
ce1f16421f [ARM] Add support for ARMV6K subtarget (LLVM)
ARMv6K is another layer between ARMV6 and ARMV6T2. This is the LLVM
side of the changes.

ARMV6 family LLVM implementation.

+-------------------------------------+
| ARMV6                               |
+----------------+--------------------+
| ARMV6M (thumb) | ARMV6K (arm,thumb) | <- From ARMV6K and ARMV6M processors
+----------------+--------------------+    have support for hint instructions
| ARMV6T2 (arm,thumb,thumb2)          |    (SEV/WFE/WFI/NOP/YIELD). They can
+-------------------------------------+    be either real or default to NOP.
| ARMV7 (arm,thumb,thumb2)            |    The two processors also use
+-------------------------------------+    different encoding for them.

Patch by Vinicius Tinti.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 11:55:28 +00:00
Ahmed Bougacha
df08543f48 [AArch64] Use intermediate step for concat_vectors of illegal truncs.
Optimize concat_vectors of truncated vectors, where the intermediate
type is illegal, to avoid said illegality,  e.g.,
  (v4i16 (concat_vectors (v2i16 (truncate (v2i64))),
                         (v2i16 (truncate (v2i64)))))
->
  (v4i16 (truncate (v4i32 (concat_vectors (v2i32 (truncate (v2i64))),
                                          (v2i32 (truncate (v2i64)))))))

This isn't really target-specific, and, as such, would best go in the
DAGCombiner.  However, ISD::TRUNCATE legality isn't keyed on both input
and result type, so we might generate worse code when we don't know
better.  On AArch64 we know it's fine for v2i64->v4i16 and v4i32->v8i8.
rdar://20022387


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232459 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 03:23:09 +00:00
David Majnemer
759acf348d CodeGen: @llvm.eh.typeid.for replaced @llvm.eh.typeid.for.i32
We removed @llvm.eh.typeid.for.i32 and replaced it with
@llvm.eh.typeid.for quite some time ago.  Fix up some test cases which
never got updated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 21:36:38 +00:00
Duncan P. N. Exon Smith
763e18696f DebugInfo: Fix testcases that fail -verify-debug-info=true
As part of PR22777, fix testcases that fail the debug info verifier.
The changes fall into the following categories:

  - Empty `filename:` fields in `MDFile`s.  Compile units and some types
    require non-empty filenames.  A number of testcases have empty
    filenames, probably due to hand-reduction of testcases.
  - Not-quite empty arrays: `!{i32 0}`.  This used to be equivalent in
    the debug info schema to `!{}`.  They cause problems for
    `!MDSubroutineType`'s `types:` array, since it requires all operands
    to be valid types.  (Note that `!{null}` is the correct type array
    for functions that take no arguments and return `void`.)
  - Significantly bitrotted testcases.  Nodes got left behind a few
    upgrades ago because of missing or invalid tags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 21:10:12 +00:00
Sanjay Patel
8233e8c233 fixed to test feature, not CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232398 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 18:24:28 +00:00
Sanjay Patel
89095a7882 add CHECK-LABELs for more reliable testing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232391 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 17:59:07 +00:00
Sanjay Patel
8f74fd0883 fixed to test feature, not CPU; removed unnecessary declaration
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 17:01:34 +00:00
Tom Stellard
6ebc34281f R600/SI: don't try min3/max3/med3 with f64
There are no opcodes for this. This also adds a test case.

v2: make test more robust

Patch by: Grigori Goronzy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 15:53:55 +00:00
Petar Jovanovic
b3b90bd679 [MIPS] Fix justify error for small structures
Fix justify error for small structures bigger than 32 bits in fixed
arguments for MIPS64 big endian. There was a problem when small structures
are passed as fixed arguments. The structures that are bigger than 32 bits
but smaller than 64 bits were not left justified properly on MIPS64 big
endian. This is fixed by shifting the value to make it left justified when
appropriate.

Patch by Aleksandar Beserminji.

Differential Revision: http://reviews.llvm.org/D8174


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232382 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 15:01:09 +00:00
Rafael Espindola
8d8c155a61 Use the i8 immediate cmp instructions when possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 14:25:08 +00:00
Simon Pilgrim
4f3864d05f [SSE} Added tests for float4-float3 conversions (PR11580)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-15 16:19:15 +00:00
Simon Pilgrim
54db4092c1 Simplified some stack folding tests.
Replaced explicit pmovzx* intrinsic tests with general shuffles

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232286 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 23:16:43 +00:00
Daniel Jasper
439cc2c5de [MachineLICM] First steps of sinking GEPs near calls.
Specifically, if there are copy-like instructions in the loop header
they are moved into the loop close to their uses. This reduces the live
intervals of the values and can avoid register spills.

This is working towards a fix for http://llvm.org/PR22230.
Review: http://reviews.llvm.org/D7259

Next steps:
- Find a better cost model (which non-copy instructions should be sunk?)
- Make this dependent on register pressure

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232262 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 10:58:38 +00:00
Ahmed Bougacha
4a2d95826e Add a bunch of CHECK missing colons in tests. NFC.
Some wouldn't pass;  fixed most, the rest will be fixed separately.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232239 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 01:43:57 +00:00
Rafael Espindola
89c84b0c83 Use add32ri8 and friends on fast isel.
This fixes pr22854.

The core issue on the bug is that there are multiple instructions that
print the same in assembly. In fact, there doesn't seem to be any
syntax for specifying that a constant that fits in 8 bits should use a 32 bit
immediate.

The attached patch changes fast isel to consider i16immSExt8,
i32immSExt8, and i64immSExt8. They were disabled because fastisel didn’t know
to call the predicate back in the day.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 22:18:18 +00:00
David Blaikie
5a70dd1d82 [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232184 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 18:20:45 +00:00
Andrea Di Biagio
d288259ccd [X86][AVX] Fix wrong lowering of v4x64 shuffles into concat_vector plus extract_subvector nodes.
This patch fixes a bug in the shuffle lowering logic implemented by function
'lowerV2X128VectorShuffle'.

The are few cases where function 'lowerV2X128VectorShuffle' wrongly expands a
shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR
nodes. The problematic expansion only occurs when the shuffle mask M has an
'undef' element at position 2, and M is equivalent to mask <0,1,4,5>.
In that case, the algorithm propagates the wrong vector to one of the two
new EXTRACT_SUBVECTOR nodes.

Example:
;;
define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
entry:
  %0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32><i32 undef, i32 1, i32 undef, i32 5>
  ret <4 x double> %0
}
;;

Before this patch, llc (-mattr=+avx) generated:
  vinsertf128 $1, %xmm0, %ymm0, %ymm0

With this patch, llc correctly generates:
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

Added test lower-vec-shuffle-bug.ll

Differential Revision: http://reviews.llvm.org/D8259


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232179 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 17:29:49 +00:00
Matt Arsenault
462f98dd60 R600/SI: Add test for min / max with immediate
Make sure this isn't getting confused by canonicalizations
of comparisons with a constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232177 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 16:43:48 +00:00
Hao Liu
fcc897cc45 [MachineCopyPropagation] Fix a bug causing incorrect removal for the instruction sequences as follows
%Q5_Q6<def> = COPY %Q2_Q3
   %D5<def> =
   %D3<def> =
   %D3<def> = COPY %D6     // Incorrectly removed in MachineCopyPropagation
   Using of %D3 results in incorrect result ...

   Reviewed in http://reviews.llvm.org/D8242 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232142 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 05:15:23 +00:00
Sanjay Patel
cae9695fbb [X86, AVX2] Replace inserti128 and extracti128 intrinsics with generic shuffles
This should complete the job started in r231794 and continued in r232045:
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.

AVX2 introduced proper integer variants of the hacked integer insert/extract
C intrinsics that were created for this same functionality with AVX1.

This should complete the removal of insert/extract128 intrinsics.

The Clang precursor patch for this change was checked in at r232109.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 23:16:18 +00:00
Simon Pilgrim
7385cafb7a Removed useless palignr test - we don't actually provide a llvm.x86.ssse3.palign.r.128 intrinsic
Differential Revision: http://reviews.llvm.org/D8302

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232108 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 21:42:03 +00:00
Tom Stellard
3d712a6373 R600/SI: Remove _e32 and _e64 suffixes from mnemonics
Instead print them as part of the $dst operand.  The AsmMatcher
requires the 32-bit and 64-bit encodings have the same mnemonic in
order to parse them correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232105 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 21:34:22 +00:00
Andrew Kaylor
d434c0d548 Adding WinEHPrepare tests (currently XFAILs)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232104 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 21:32:59 +00:00
Krzysztof Parzyszek
49fa37992d Unxfail passing test on Hexagon
test/CodeGen/Generic/2008-02-20-MatchingMem.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232098 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 20:38:10 +00:00
Quentin Colombet
be45e0e669 [X86] Fix a regression introduced by r223641.
The permps and permd instructions have their operands swapped compared to the
intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP
category.

I did not create a new category for those two, as they are the only one AFAICT
in that case.

<rdar://problem/20108262>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232085 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 19:34:12 +00:00
Krzysztof Parzyszek
7b110fe366 Remove unused complex patterns for addressing modes on Hexagon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232057 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 16:44:50 +00:00
Andrea Di Biagio
be9322ae7c [X86] Fix wrong target specific combine on SETCC nodes.
Part of the folding logic implemented by function 'PerformISDSETCCCombine'
only worked under the assumption that the condition code in input could have
been either SETNE or SETEQ.
Unfortunately that assumption was incorrect, and in some cases the algorithm
ended up incorrectly folding SETCC nodes.

The incorrect folding only affected SETCC dag nodes where:
 - one of the operands was a build_vector of all zeroes;
 - the other operand was a SIGN_EXTEND from a vector of MVT:i1 elements;
 - the condition code was neither SETNE nor SETEQ.

Example:
  (setcc (v4i32 (sign_extend v4i1:%A)), (v4i32 VectorOfAllZeroes), setge)

Before this patch, the entire dag node sequence from the example was
incorrectly folded to node %A.

With this patch, the dag node sequence is folded to a
  (xor %A, (v4i1 VectorOfAllOnes)).

Added test setcc-combine.ll.

Thanks to Greg Bedwell for spotting this issue.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232046 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 15:16:58 +00:00
Sanjay Patel
b4c1547749 [X86, AVX] replace vextractf128 intrinsics with generic shuffles
Now that we've replaced the vinsertf128 intrinsics, 
do the same for their extract twins.

This is very much like D8086 (checked in at r231794):
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.

This is also the LLVM sibling to the cfe D8275 patch.

Differential Revision: http://reviews.llvm.org/D8276



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232045 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 15:15:19 +00:00
Simon Pilgrim
df0acf35f3 [X86][AVX2] Added missing palignr stack folding test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232033 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 13:12:33 +00:00
Jingyue Wu
3ea0adcdd5 [NVPTXAsmPrinter] do not print .align on function headers
Summary:
PTX does not allow .align directives on function headers.

Fixes PR21551.

Test Plan: test/Codegen/NVPTX/function-align.ll

Reviewers: eliben, jholewinski

Reviewed By: eliben, jholewinski

Subscribers: llvm-commits, eliben, jpienaar, jholewinski

Differential Revision: http://reviews.llvm.org/D8274

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 01:50:30 +00:00
Reid Kleckner
b53bb04b2f Remove some CHECK-NOT lines in favor of CHECK-NEXT
NFC, this is just shorter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232000 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 01:38:48 +00:00
Reid Kleckner
7dedaabcae Stop calling DwarfEHPrepare from WinEHPrepare
Instead, run both EH preparation passes, and have them both ignore
functions with unrecognized EH personalities. Pass delegation involved
some hacky code for creating an AnalysisResolver that we don't need now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 00:36:20 +00:00
Reid Kleckner
70992ac969 Handle big index in getelementptr instruction
CodeGen incorrectly ignores (assert from APInt) constant index bigger
than 2^64 in getelementptr instruction. This is a test and fix for that.

Patch by Paweł Bylica!

Reviewed By: rnk

Subscribers: majnemer, rnk, mcrosier, resistor, llvm-commits

Differential Revision: http://reviews.llvm.org/D8219

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231984 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 23:36:10 +00:00
Andrew Kaylor
1134ac4a0f Extended support for native Windows C++ EH outlining
Differential Review: http://reviews.llvm.org/D7886



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231981 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 23:22:06 +00:00
Jozef Kolek
a2b4e9a30e [mips][microMIPS] Make usage of NOT16 by code generator
Differential Revision: http://reviews.llvm.org/D7748


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231963 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 20:28:31 +00:00
Sanjay Patel
02402b3cc1 add CHECK-LABELs for better reliability
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231962 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 20:12:07 +00:00
Rafael Espindola
0c78583bf6 Put jump tables in unique sections on COFF.
If a function is going in an unique section (because of -ffunction-sections
for example), putting a jump table in .rodata will keep .rodata alive and
that will keep alive any other function that also has a jump table.

Instead, put the jump table in a unique section that is associated with the
function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231961 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 19:58:37 +00:00
Tim Northover
52f83a9ab3 ARM: simplify and extend byval handling
The main issue being fixed here is that APCS targets handling a "byval align N"
parameter with N > 4 were miscounting what objects were where on the stack,
leading to FrameLowering setting the frame pointer incorrectly and clobbering
the stack.

But byval handling had grown over many years, and had multiple layers of cruft
trying to compensate for each other and calculate padding correctly. This only
really needs to be done once, in the HandleByVal function. Elsewhere should
just do what it's told by that call.

I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits
byvals with the correct C ABI alignment), which simplified HandleByVal.

rdar://20095672

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231959 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 18:54:22 +00:00
Derek Schuff
87e6561f34 Make NaCl's use of .init_array for static constructors match Linux
Summary:
The generic ELF TargetObjectFile defaults to .ctors, but Linux's
defaults to .init_array by calling InitializeELF with the value of
UseInitArray from TargetMachine. Make NaCl's behavior match.

Reviewers: jvoung
Differential Revision: http://reviews.llvm.org/D8240

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231934 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 16:16:09 +00:00
Quentin Colombet
7775242da3 [CodeGenPrepare] Refine the cost model provided by the promotion helper.
- Use TargetLowering to check for the actual cost of each extension.
- Provide a factorized method to check for the cost of an extension:
  TargetLowering::isExtFree.
- Provide a virtual method TargetLowering::isExtFreeImpl for targets to be able
  to tune the cost of non-free extensions.

This refactoring offers a better granularity to model what really happens on
different targets.

No performance changes and very few code differences.

Part of <rdar://problem/19267165> 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 21:48:15 +00:00
Nemanja Ivanovic
dc12298109 Add support for part-word atomics for PPC
http://reviews.llvm.org/D8090#inline-67337


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231843 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:51:07 +00:00
Ahmed Bougacha
4a3cd42601 [AArch64] Avoid going through GPRs for across-vector instructions.
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
  (v4i32 (uaddv ...))
is the same as
  (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
  (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
           (i32 (int_aarch64_neon_uaddv ...)), ssub)

In a combine, we transform all such across-vector-lanes intrinsics to:

  (i32 (extract_vector_elt (uaddv ...), 0))

This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs.  Consider:

    uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
        return vmulq_n_u32(a, vaddvq_u32(b));
    }

We now generate:
    addv.4s  s1, v1
    mul.4s   v0, v0, v1[0]
instead of the previous:
    addv.4s  s1, v1
    fmov     w8, s1
    dup.4s   v1, w8
    mul.4s   v0, v1, v0

rdar://20044838


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231840 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:45:38 +00:00
Kit Barton
1f9ea3a230 Change the generation of the vmuluwm instruction to be based on the MUL opcode.
Phabricator review: http://reviews.llvm.org/D8185


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231827 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 19:49:38 +00:00
Igor Laevsky
68beb2a9ec Teach lowering to correctly handle invoke statepoint and gc results tied to them. Note that we still can not lower gc.relocates for invoke statepoints.
Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result.
(Resubmitting this change after not being able to reproduce buildbot failure)

Differential Revision: http://reviews.llvm.org/D7760



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231800 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 16:26:48 +00:00
Sanjay Patel
137e1f3f28 [X86, AVX] replace vinsertf128 intrinsics with generic shuffles
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.

This is the sibling patch for the Clang half of this change:
http://reviews.llvm.org/D8088

Differential Revision: http://reviews.llvm.org/D8086



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231794 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 16:08:36 +00:00
Colin LeMahieu
376b961126 [Hexagon] Removing unused patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231723 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 23:08:46 +00:00
Ahmed Bougacha
55a060641f [CodeGen] Replace the reused stores' chain for extractelt expansion.
This fixes a subtle issue that was introduced in r205153.

When reusing a store for the extractelement expansion (to load directly
from it, inserting of going through the stack), later stores to the
same location might have overwritten the data we were expecting to
extract from.

To fix that, we need to explicitly replace the chain going out of the
reused store, so that later stores also have an explicit dependency on
the generated element-extracting loads, and can't clobber them.

rdar://20066785
Differential Revision: http://reviews.llvm.org/D8180


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231721 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 22:51:05 +00:00
Ahmed Bougacha
fad749559c [X86] Add nounwind to vector-idiv.ll testcases. NFC.
In preparation for a patch where cfi directives get in the way.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231720 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 22:46:02 +00:00
Reid Kleckner
4c27f8d49e Reland r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
Fix the double-deletion of AnalysisResolver when delegating through to
Dwarf EH preparation by creating one from scratch. Hopefully the new
pass manager simplifies this.

This reverts commit r229952.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231719 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 22:45:16 +00:00
Colin LeMahieu
ffc2de43d9 [Hexagon] Reapply r231699. Remove assumption that second operand is an immediate when checking if A2_tfrsi is combinable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231710 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 21:48:13 +00:00
Colin LeMahieu
c2d30aebf3 [Hexagon] Reverting r231699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231703 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 21:19:02 +00:00
Colin LeMahieu
8c2919a34e [Hexagon] Updating constant set to simpler versions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231699 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 20:33:12 +00:00
Colin LeMahieu
a0ce232a65 [Hexagon] Eliminating immediate condition set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231693 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 19:57:18 +00:00
Rafael Espindola
be886690bd Print jump tables before exception tables.
In the case where just tables are part of the function section, this produces
more readable assembly by avoiding switching to the eh section and back
to .text.

This would also break with non unique section names, as trying to switch to
a unique section actually creates a new one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 18:29:12 +00:00
Reed Kotler
18afdb3210 Add logical ops to Mips fast-isel
Summary:
Code is mostly copied from AArch64 port and modified where needed for Mips.

This handles the "non" legal cases of logical ops. Legal cases are handled by tablegen patterns.

Test Plan:
Make check test logopm.ll

All of test-suite passes at O0/O2 and mips32 r1/r2 with this new change.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: echristo, llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D6599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231665 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 16:28:10 +00:00
Marek Olsak
c4ca7b59db R600/SI: Limit SGPRs to 80 on Tonga and Iceland
This is a candidate for stable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231659 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 15:48:09 +00:00
Andrea Di Biagio
d1232fa1c0 Fix line ending in test CodeGen/X86/pr22774.ll. NFC.
Also, replaced line with 'target triple' with flag -mtriple on the RUN line.
Removed the data layout string as it is not needed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231654 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-09 15:02:01 +00:00
Andrea Di Biagio
692f7382b5 [X86][AVX] Fix wrong lowering of VPERM2X128 nodes
There were cases where the backend computed a wrong permute mask for a VPERM2X128 node.

Example:
\code
define <8 x float> @foo(<8 x float> %a, <8 x float> %b) {
  %shuffle = shufflevector <8 x float> %a, <8 x float> %b, <8 x i32> <i32 undef, i32 undef, i32 6, i32 7, i32 undef, i32 undef, i32 6, i32 7>
  ret <8 x float> %shuffle
}
\code end

Before this patch, llc (with -mattr=+avx) emitted the following vperm2f128:
  vperm2f128 $0, %ymm0, %ymm0, %ymm0  # ymm0 = ymm0[0,1,0,1]

With this patch, llc emits a vperm2f128 with a correct permute mask:
  vperm2f128 $17, %ymm0, %ymm0, %ymm0  # ymm0 = ymm0[2,3,2,3]

Differential Revision: http://reviews.llvm.org/D8119


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231601 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-08 16:28:47 +00:00
Andrea Di Biagio
15d2c3fb00 [DAGCombiner] Fix wrong folding of AND dag nodes.
This patch fixes the logic in the DAGCombiner that folds an AND node according
to rule: (and (X (load V)), C) -> (X (load V))

An AND between a vector load 'X' and a constant build_vector 'C' can be folded
into the load itself only if we can prove that the AND operation is redundant.
The algorithm implemented by 'visitAND' firstly computes the splat value 'S'
from C, and then checks if S has the lower 'B' bits set (where B is the size in
bits of the vector element type). The algorithm takes into account also the
'undef' bits in the splat mask.

Unfortunately, the algorithm only worked under the assumption that the size of S
is a multiple of the vector element type. With this patch, we conservatively
avoid folding the AND if the splat bits are not compatible with the vector
element type.

Added X86 test and-load-fold.ll

Differential Revision: http://reviews.llvm.org/D8085


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231563 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-07 12:24:55 +00:00
Simon Pilgrim
62ba058dea [DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLE
This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE.

This prevents many cases of spilling scalar data between the gpr + simd registers. 

At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match).

Differential Revision: http://reviews.llvm.org/D8132

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231554 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-07 05:52:42 +00:00
Eric Christopher
80d55e6bcb Remove use of misched-bench from this test and replace it with
non-temporary enabling options. This is part of removing misched-bench
as an option.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231546 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-07 01:39:06 +00:00
Eric Christopher
5dc2251b4e Recommit r231324 with a fix to the ARM execution domain code
to disable lane switching if we don't actually have the instruction
set we want to switch to. Models the earlier check above the
conditional for the pass.

The testcase is one that triggered with the assert that's added
as part of the fix, use it to avoid adding a new testcase as it
highlights the same problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-07 00:12:22 +00:00
Quentin Colombet
05a3f9120a [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW.
Teach the load store optimizer how to sign extend a result of a load pair when
it helps creating more pairs.
The rational is that loads are more expensive than sign extensions, so if we
gather some in one instruction this is better!

<rdar://problem/20072968>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231527 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 22:42:10 +00:00
Sanjay Patel
5bac8f9b95 fixed to test features, not CPUs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231524 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:50:42 +00:00
Sanjay Patel
f528a40240 fixed to test features, not CPUs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231523 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:50:27 +00:00
Sanjay Patel
5ff9d3e6ed loosen checking for buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:30:18 +00:00
Sanjay Patel
aedb16fc6f fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231521 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:24:56 +00:00
Sanjay Patel
b00a131bda fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231520 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:19:32 +00:00
Sanjay Patel
50652e746b fixed test to use FileCheck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231519 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:16:15 +00:00
Sanjay Patel
6deb05e63a fixed to use CHECK-LABELs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231517 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 21:05:02 +00:00
Sanjay Patel
d532d10275 fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231516 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 20:58:15 +00:00
Sanjay Patel
f2b34a568b fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231515 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 20:57:40 +00:00
Sanjay Patel
9e03e85cac fixed to test feature, not CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 20:51:25 +00:00
Sanjay Patel
17392758a1 fixed to test features, not CPUs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231512 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 20:46:16 +00:00
Sanjay Patel
91b79125ec fixed test to use SSE2 attribute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231510 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 20:38:55 +00:00
Sanjay Patel
e7abe0cdbc fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 20:34:20 +00:00
Matthias Braun
47941aa098 DAGCombiner: Canonicalize select(and/or,x,y) depending on target.
This is based on the following equivalences:
select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))

Many target cannot perform and/or on the CPU flags and therefore the
right side should be choosen to avoid materializign the i1 flags in an
integer register. If the target can perform this operation efficiently
we normalize to the left form.

Differential Revision: http://reviews.llvm.org/D7622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 19:49:10 +00:00
Michael Zolotukhin
6023ad2d37 LegalizeTypes: Handle shift by 0 in ExpandShiftByConstant.
Though such shifts are usually optimized away by combiner, we still can
encounter them after a vector shift is legalized.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231443 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 01:13:01 +00:00
Sanjay Patel
5f79fd2f02 [AVX] Lower / fast-isel scalar FP selects into VBLENDV instructions (PR22483)
This patch reduces code size for all AVX targets and increases speed for some chips.

SSE 4.1 introduced the useless (see code comments) 2-register form of BLENDV and
only in the packed float/double flavors.

AVX subsequently made the instruction useful by adding a 4-register operand form.

So we just need to paper over the lack of scalar forms of this instruction, complicate
the code to choose float or double forms, and use blendv on scalars since all FP is in
xmm registers anyway.

This gives us an approximately 50% speed up for a blendv microbenchmark sequence
on SandyBridge and Haswell:
blendv : 29.73 cycles/iter
logic : 43.15 cycles/iter

No new test cases with this patch because:

1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel
2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX
3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants.

http://llvm.org/bugs/show_bug.cgi?id=22483

Differential Revision: http://reviews.llvm.org/D8063



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231408 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 21:46:54 +00:00
Ahmed Bougacha
77f46f4f9f [AArch64] Teach AsmPrinter about GlobalAddress operands.
Fixes PR22761, rdar://20024866.
Differential Revision: http://reviews.llvm.org/D8042


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231400 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 20:04:21 +00:00
Rafael Espindola
2f76abe7d7 Use the correct func begin symbol in all places in ppc.
I missed an occurrence of the old symbol in my previous patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231398 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 19:47:50 +00:00
Ahmed Bougacha
67297cd956 [ARM] Enable vector extload combine for legal types.
This commit enables forming vector extloads for ARM.
It only does so for legal types, and when we can't fold the extension
in a wide/long form of the user instruction.

Enabling it for larger types isn't as good an idea on ARM as it is on
X86, because: 
- we pretend that extloads are legal, but end up generating vld+vmov
- we have instructions like vld {dN, dM}, which can't be generated
  when we "manually expand" extloads to vld+vmov.

For legal types, the combine doesn't fire that often: in the
integration tests only in a big endian testcase, where it removes a
pointless AND.

Related to rdar://19723053
Differential Revision: http://reviews.llvm.org/D7423


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231396 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 19:37:53 +00:00
Rafael Espindola
2e2dbc35da Use the generic Lfunc_begin label on ppc.
This removes yet another custom label to mark the start of a function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231390 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 18:55:50 +00:00
David Majnemer
42fcf79f36 X86: Optimize address mode matching for FRAME_ALLOC_RECOVER nodes
We know that the absolute symbol will be less than 2GB and thus will
always fit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231389 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 18:50:12 +00:00
Reid Kleckner
9f7c861416 Replace llvm.frameallocate with llvm.frameescape
Turns out it's pretty straightforward and simplifies the implementation.

Reviewers: andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D8051

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 18:26:34 +00:00
Simon Pilgrim
a744a15e97 [DagCombiner] Allow shuffles to merge through bitcasts
Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening).

This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead.

Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow.

Differential Revision: http://reviews.llvm.org/D7939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 17:14:04 +00:00
Kit Barton
b98636a0f8 While reviewing the changes to Clang to add builtin support for the vsld, vsrd, and vsrad instructions, it was pointed out that the builtins are generating the LLVM opcodes (shl, lshr, and ashr) not calls to the intrinsics. This patch changes the implementation of the vsld, vsrd, and vsrad instructions from from intrinsics to VXForm_1 instructions and makes them legal with P8 Altivec. It also removes the definition of the int_ppc_altivec_vsld, int_ppc_altivec_vsrd, and int_ppc_altivec_vsrad intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 16:24:38 +00:00
Igor Laevsky
684d323b9b Revert change r231366 as it broke clang-native-arm-cortex-a9 Analysis/properties.m test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231374 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 15:41:14 +00:00
Elena Demikhovsky
e670dc7848 AVX-512, SKX: Enabled masked_load/store operations for this target.
Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors,
it is needed to pass all masked_memop.ll tests for SKX.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231371 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 15:11:35 +00:00
Igor Laevsky
f8b3003ab8 Teach lowering to correctly handle invoke statepoint and gc results tied to them. Note that we still can not lower gc.relocates for invoke statepoints.
Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 14:11:21 +00:00
Craig Topper
62eaac6087 [X86] Use vmovss to handle inserting an element into index 0 of a v8f32 vector of zeros.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231354 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 06:38:42 +00:00
Chandler Carruth
4197c13062 [MBP] Revert r231238 which attempted to fix a nasty bug where MBP is
just arbitrarily interleaving unrelated control flows once they get
moved "out-of-line" (both outside of natural CFG ordering and with
diamonds that cannot be fully laid out by chaining fallthrough edges).

This easy solution doesn't work in practice, and it isn't just a small
bug. It looks like a very different strategy will be required. I'm
working on that now, and it'll again go behind some flag so that
everyone can experiment and make sure it is working well for them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231332 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 01:07:03 +00:00
Matthias Braun
29aeaf5408 Improve test robustness
Improve test robustness in preparation of coming commits:
- Avoid undefs which may get propagated too much.
- Remove several pointless add 0, instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231307 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 22:31:18 +00:00
Nemanja Ivanovic
b69d556c37 Add LLVM support for PPC cryptography builtins
Review: http://reviews.llvm.org/D7955


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231285 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 20:44:33 +00:00
Mehdi Amini
c94da20917 Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 18:43:29 +00:00
Adrian Prantl
2e74ddea3a Update the out-of-date dwarf expressions in these testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 17:39:59 +00:00
Marek Olsak
506d4b2cb4 R600/SI: Add an intrinsic for S_FLBIT_I32 / V_FFBH_I32
Required by OpenGL (ARB_gpu_shader5).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231259 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 17:33:45 +00:00
Jozef Kolek
2e37a6f306 [mips][microMIPS] Make usage of ADDU16 and SUBU16 by code generator
Differential Revision: http://reviews.llvm.org/D7609


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 15:47:42 +00:00
Andrea Di Biagio
da5e5688e9 [X86][FastISel] Simplify the logic in method X86SelectSIToFP.
The target-independent selection algorithm in FastISel already knows how
to select a SINT_TO_FP if the target is SSE but not AVX.

On targets that have SSE but not AVX, the tablegen'd 'fastEmit' functions
for ISD::SINT_TO_FP know how to select instruction X86::CVTSI2SSrr
(for an i32 to f32 conversion) and X86::CVTSI2SDrr (for an i32 to f64
conversion).

This patch simplifies the logic in method X86SelectSIToFP knowing that
the code would not be reachable if the subtarget doesn't have AVX.
No functional change intended.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231243 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 14:23:25 +00:00
Chandler Carruth
67fade9110 [MBP] Fix a really horrible bug in MachineBlockPlacement, but behind
a flag for now.

First off, thanks to Daniel Jasper for really pointing out the issue
here. It's been here forever (at least, I think it was there when
I first wrote this code) without getting really noticed or fixed.

The key problem is what happens when two reasonably common patterns
happen at the same time: we outline multiple cold regions of code, and
those regions in turn have diamonds or other CFGs for which we can't
just topologically lay them out. Consider some C code that looks like:

  if (a1()) { if (b1()) c1(); else d1(); f1(); }
  if (a2()) { if (b2()) c2(); else d2(); f2(); }
  done();

Now consider the case where a1() and a2() are unlikely to be true. In
that case, we might lay out the first part of the function like:

  a1, a2, done;

And then we will be out of successors in which to build the chain. We go
to find the best block to continue the chain with, which is perfectly
reasonable here, and find "b1" let's say. Laying out successors gets us
to:

  a1, a2, done; b1, c1;

At this point, we will refuse to lay out the successor to c1 (f1)
because there are still un-placed predecessors of f1 and we want to try
to preserve the CFG structure. So we go get the next best block, d1.

... wait for it ...

Except that the next best block *isn't* d1. It is b2! d1 is waaay down
inside these conditionals. It is much less important than b2. Except
that this is exactly what we didn't want. If we keep going we get the
entire set of the rest of the CFG *interleaved*!!!

  a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2;

So we clearly need a better strategy here. =] My current favorite
strategy is to actually try to place the block whose predecessor is
closest. This very simply ensures that we unwind these kinds of CFGs the
way that is natural and fitting, and should minimize the number of cache
lines instructions are spread across.

It also happens to be *dead simple*. It's like the datastructure was
specifically set up for this use case or something. We only push blocks
onto the work list when the last predecessor for them is placed into the
chain. So the back of the worklist *is* the nearest next block.

Unfortunately, a change like this is going to cause *soooo* many
benchmarks to swing wildly. So for now I'm adding this under a flag so
that we and others can validate that this is fixing the problems
described, that it seems possible to enable, and hopefully that it fixes
more of our problems long term.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231238 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 12:18:08 +00:00
Daniel Jasper
f68f28a41d Add a flag to experiment with outlining optional branches.
In a CFG with the edges A->B->C and A->C, B is an optional branch.

LLVM's default behavior is to lay the blocks out naturally, i.e. A, B,
C, in order to improve code locality and fallthroughs. However, if a
function contains many of those optional branches only a few of which
are taken, this leads to a lot of unnecessary icache misses. Moving B
out of line can work around this.

Review: http://reviews.llvm.org/D7719

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231230 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 11:05:34 +00:00
Kristof Beyls
78c4ef5120 Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers cannot handle yet.
As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers
ld.bfd and ld.gold currently only support a subset of the whole range of AArch64
ELF TLS relocations. Furthermore, they assume that some of the code sequences to
access thread-local variables are produced in a very specific sequence.
When the sequence is not as the linker expects, it can silently mis-relaxe/mis-optimize
the instructions.
Even if that wouldn't be the case, it's good to produce the exact sequence,
as that ensures that linkers can perform optimizing relaxations.

This patch:

* implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang
  would grow an -mtls-size option to allow support for both, but that's not part of this patch.
* by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold
  linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation)
  is added to enable generation of local dynamic code sequence, but is off by default.
* makes sure that the exact expected code sequence for local dynamic and general dynamic
  accesses is produced, by making use of a new pseudo instruction. The patch also removes
  two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo
  SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231227 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 09:12:08 +00:00
Michael Kuperstein
bbfda9c125 [DAGCombine] Fix a bug in a BUILD_VECTOR combine
When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. 
We can not do this when we also need the zero vector to create a blend.
This fixes PR22774.

Differential Revision: http://reviews.llvm.org/D8040

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231219 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 07:27:39 +00:00
Filipe Cabecinhas
7eefc249b8 Fix the test for r231201. We don't crash anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231207 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 02:09:40 +00:00
Rafael Espindola
bd490c174e Use the vanilla func_end symbol for .size.
No need to create yet another temp symbol.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231198 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 01:35:23 +00:00
Eric Christopher
1df6d33c5e Weaken the check for a specific movl on the twoaddr-coalesce-3
test - we only care that there are two moves in the loop and not
which part is relative to which register anyhow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231191 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 01:19:17 +00:00
Filipe Cabecinhas
08efe825e4 Fix the x86-upgrade-avx2-vbroadcast.ll test by commenting the CHECK lines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231187 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 00:49:12 +00:00
Rafael Espindola
c82398a2ac Drop the "eh_" from eh_func_begin and eh_func_end.
They will be used for more than eh tables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231185 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 00:27:43 +00:00
Juergen Ributzka
e49da9aff1 Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic.
The intrinsic is no longer generated by the front-end. Remove the intrinsic and
auto-upgrade it to a vector shuffle.

Reviewed by Nadav

This is related to rdar://problem/18742778.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231182 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 00:13:25 +00:00
Eric Christopher
f2bf51c593 Update twoaddr-coalesce-3.ll to run on darwin and linux machines:
a) Default relocation model differences,
b) Different numbers of # in comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231178 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 23:56:20 +00:00
Reid Kleckner
ec0a396ffa WinEH: Remove vestigial EH object
Ultimately, we'll need to leave something behind to indicate which
alloca will hold the exception, but we can figure that out when it comes
time to emit the __CxxFrameHandler3 catch handler table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 23:20:30 +00:00
Andrew Kaylor
92dabb5710 Moving WinEH outlining tests to an architecture neutral location
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231155 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 22:33:39 +00:00
Eric Christopher
63295d884c Fix a problem where the TwoAddressInstructionPass which generate redundant register moves in a loop.
From:
int M, total;
void foo() {
int i;
for (i = 0; i < M; i++) {
  total = total + i / 2;
}
}

This is the kernel loop:

.LBB0_2: # %for.body

=>This Inner Loop Header: Depth=1
movl %edx, %esi
movl %ecx, %edx
shrl $31, %edx
addl %ecx, %edx
sarl %edx
addl %esi, %edx
incl %ecx
cmpl %eax, %ecx
jl .LBB0_2
--------------------------
The first mov insn "movl %edx, %esi" could be removed if we change "addl %esi, %edx" to "addl %edx, %esi".

The IR before TwoAddressInstructionPass is:
BB#2: derived from LLVM BB %for.body

Predecessors according to CFG: BB#1 BB#2
    %vreg3<def> = COPY %vreg12<kill>; GR32:%vreg3,%vreg12
    %vreg2<def> = COPY %vreg11<kill>; GR32:%vreg2,%vreg11
    %vreg7<def,tied1> = SHR32ri %vreg3<tied0>, 31, %EFLAGS<imp-def,dead>; GR32:%vreg7,%vreg3
    %vreg8<def,tied1> = ADD32rr %vreg3<tied0>, %vreg7<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg3,%vreg7
    %vreg9<def,tied1> = SAR32r1 %vreg8<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg9,%vreg8
    %vreg4<def,tied1> = ADD32rr %vreg9<kill,tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg4,%vreg9,%vreg2
    %vreg5<def,tied1> = INC64_32r %vreg3<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg5,%vreg3
    CMP32rr %vreg5, %vreg0, %EFLAGS<imp-def>; GR32:%vreg5,%vreg0
    %vreg11<def> = COPY %vreg4; GR32:%vreg11,%vreg4
    %vreg12<def> = COPY %vreg5<kill>; GR32:%vreg12,%vreg5
    JL_4 <BB#2>, %EFLAGS<imp-use,kill>
Now TwoAddressInstructionPass will choose vreg9 to be tied with vreg4. However, it doesn't see that there is copy from vreg4 to vreg11 and another copy from vreg11 to vreg2 inside the loop body. To remove those copies, it is necessary to choose vreg2 to be tied with vreg4 instead of vreg9. This code pattern commonly appears when there is reduction operation in a loop.

So check for a reversed copy chain and if we encounter one then we can commute the add instruction so we can avoid a copy.

Patch by Wei Mi.
http://reviews.llvm.org/D7806

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231148 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 22:03:03 +00:00
Andrew Kaylor
83eaade2b4 Outline cleanup handlers for native Windows C++ exception handling
Differential Revision: http://reviews.llvm.org/D7865



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 20:00:16 +00:00
Kit Barton
40057e8ee8 Add the following 64-bit vector integer arithmetic instructions added in POWER8:
vaddudm
vsubudm
vmulesw
vmulosw
vmuleuw
vmulouw
vmuluwm
vmaxsd
vmaxud
vminsd
vminud
vcmpequd
vcmpequd.
vcmpgtsd
vcmpgtsd.
vcmpgtud
vcmpgtud.
vrld
vsld
vsrd
vsrad

Phabricator review: http://reviews.llvm.org/D7959


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231115 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 19:55:45 +00:00
Reid Kleckner
9c28314a68 Make llvm.eh.begincatch use an outparam
Ultimately, __CxxFrameHandler3 needs us to put a stack offset in a
table, and it will take responsibility for copying the exception object
into that slot. Modelling the exception object as an SSA value returned
by begincatch isn't going to work in general, so make it use an output
parameter.

Reviewers: andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D7920

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231086 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 17:41:09 +00:00
Chad Rosier
f1de1adc82 [AArch64] When combining constant mul of -3, prefer (sub x, (shl x, N)).
This change only effects codegen when the constant is -3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231085 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 17:31:01 +00:00
Duncan P. N. Exon Smith
b056aa798d DebugInfo: Move new hierarchy into place
Move the specialized metadata nodes for the new debug info hierarchy
into place, finishing off PR22464.  I've done bootstraps (and all that)
and I'm confident this commit is NFC as far as DWARF output is
concerned.  Let me know if I'm wrong :).

The code changes are fairly mechanical:

  - Bumped the "Debug Info Version".
  - `DIBuilder` now creates the appropriate subclass of `MDNode`.
  - Subclasses of DIDescriptor now expect to hold their "MD"
    counterparts (e.g., `DIBasicType` expects `MDBasicType`).
  - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp`
    for printing comments.
  - Big update to LangRef to describe the nodes in the new hierarchy.
    Feel free to make it better.

Testcase changes are enormous.  There's an accompanying clang commit on
its way.

If you have out-of-tree debug info testcases, I just broke your build.

  - `upgrade-specialized-nodes.sh` is attached to PR22564.  I used it to
    update all the IR testcases.
  - Unfortunately I failed to find way to script the updates to CHECK
    lines, so I updated all of these by hand.  This was fairly painful,
    since the old CHECKs are difficult to reason about.  That's one of
    the benefits of the new hierarchy.

This work isn't quite finished, BTW.  The `DIDescriptor` subclasses are
almost empty wrappers, but not quite: they still have loose casting
checks (see the `RETURN_FROM_RAW()` macro).  Once they're completely
gutted, I'll rename the "MD" classes to "DI" and kill the wrappers.  I
also expect to make a few schema changes now that it's easier to reason
about everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231082 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 17:24:31 +00:00
Daniel Jasper
bbaf4fd14c During PHI elimination, split critical edges that move copies out of loops.
This prevents the behavior observed in llvm.org/PR22369. I am not sure
whether I am reading the code correctly, but the early exit based on
isLiveOutPastPHIs() seems to make the wrong assumption that
RegisterCoalescer won't be able to coalesce those copies later.

This change hides the new behavior behind -no-phi-elim-live-out-early-exit
as it currently breaks four tests:
 * Assertion in:
     CodeGen/Hexagon/hwloop-cleanup.ll
 * Worse code in:
     CodeGen/X86/coalescer-commute4.ll
     CodeGen/X86/phys_subreg_coalesce-2.ll
     CodeGen/X86/zlib-longest-match.ll
   The root cause here seems to be that the heuristic that determines
   the visitation order in RegisterCoalescer gets less lucky.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231064 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 10:23:11 +00:00
Ahmed Bougacha
14593eb417 [X86] Special-case 2x CMOV when custom-inserting.
This lets us avoid a few copies that are otherwise hard to get rid of.
The way this is done is, the custom-inserter looks at the following
instruction for another CMOV, and replaces both at the same time.
A previous version used a new CMOV2 opcode, but the custom inserter
is expected to be able to return a different basic block anyway, which
means it's OK - though far from ideal - to alter that block's contents.
Explicitly document that, in case it ever makes a difference.
Alternatives welcome!

Follow-up to r231045.

rdar://19767934
Closes http://reviews.llvm.org/D8019


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231046 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 01:21:16 +00:00
Ahmed Bougacha
8b5527deef [X86] Combine (cmov (and/or (setcc) (setcc))) into (cmov (cmov)).
Fold and/or of setcc's to double CMOV:

(CMOV F, T, ((cc1 | cc2) != 0)) -> (CMOV (CMOV F, T, cc1), T, cc2)
(CMOV F, T, ((cc1 & cc2) != 0)) -> (CMOV (CMOV T, F, !cc1), F, !cc2)

When we can't use the CMOV instruction, it might increase branch
mispredicts.  When we can, or when there is no mispredict, this
improves throughput and reduces register pressure.

These can't be catched by generic combines, because the pattern can
appear when legalizing some instructions (such as fcmp une).

rdar://19767934
http://reviews.llvm.org/D7634


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231045 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 01:09:14 +00:00
Reid Kleckner
a8182029a1 Fix cppeh breakage due to racing commits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231044 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 01:04:39 +00:00
Andrew Kaylor
f89c6af16b Remap arguments and non-alloca values used by outlined C++ exception handlers.
Differential Revision: http://reviews.llvm.org/D7844



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 00:41:03 +00:00
Reid Kleckner
8594a2aa79 WinEH: Run opt -instnamer over some cppeh tests and update CHECKs
In the future, we should run the output of clang through instnamer to
make it easier to manually edit test cases.

No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231037 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 00:05:35 +00:00
Adrian Prantl
994176ad7c Refactor DebugLocDWARFExpression so it doesn't require access to the
TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes
of the pre-calculated DWARF expression.

Ought to be NFC, but it does slightly alter the output format of the
textual assembly.

This reapplies 230930 without the assertion in DebugLocEntry::finalize()
because not all Machine registers can be lowered into DWARF register
numbers and floating point constants cannot be expressed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231023 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 22:02:33 +00:00
Adrian Prantl
a2e69c9c58 Revert "Refactor DebugLocDWARFExpression so it doesn't require access to the"
This reverts commit 230975 to investigate buildbot breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 20:01:54 +00:00
David Blaikie
7f620e56cd Change SystemZ large tests to use the existing long_tests property
(this is already used in Clang for a couple of tests)

Reviewers: uweigand

Differential Revision: http://reviews.llvm.org/D7965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230998 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 19:34:11 +00:00
Adrian Prantl
9680c9c1a8 Refactor DebugLocDWARFExpression so it doesn't require access to the
TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes
of the pre-calculated DWARF expression.

Ought to be NFC, but it does slightly alter the output format of the
textual assembly.

This reapplies 230930 with a relaxed assertion in DebugLocEntry::finalize()
that allows for empty DWARF expressions for constant FP values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230975 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 17:21:06 +00:00
Vasileios Kalintiris
5a393cab69 [mips] Optimize conditional moves where RHS is zero.
Summary:
When the RHS of a conditional move node is zero, we can utilize the $zero
register by inverting the conditional move instruction and by swapping the
order of its True/False operands.

Reviewers: dsanders

Differential Revision: http://reviews.llvm.org/D7945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230956 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 12:47:32 +00:00
Nico Weber
5e871d0b9c Revert r230930, it caused PR22747.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230932 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 04:37:11 +00:00
Adrian Prantl
d21acaf6a1 Refactor DebugLocDWARFExpression so it doesn't require access to the
TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes
of the pre-calculated DWARF expression.

Ought to be NFC, but it does slightly alter the output format of the
textual assembly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230930 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 02:38:18 +00:00
Elena Demikhovsky
975e9b99aa AVX-512: Added mask and rounding mode for scalar arithmetics
Added more tests for scalar instructions to destinguish between AVX and AVX-512 forms.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230891 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-01 07:44:04 +00:00
Sanjay Patel
821bba7fda avoid infinite looping when folding vector multiplies of constants (PR22698)
We were missing a check for the following fold in DAGCombiner:

// fold (fmul (fmul x, c1), c2) -> (fmul x, (fmul c1, c2))

If 'x' is also a constant, then we shouldn't do anything. Otherwise, we could end up swapping the operands back and forth forever.

This should fix:
http://llvm.org/bugs/show_bug.cgi?id=22698

Differential Revision: http://reviews.llvm.org/D7917



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230884 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-01 00:09:35 +00:00
Sanjay Patel
7497834516 fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230883 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-01 00:02:03 +00:00
Sanjay Patel
4f00dcbf22 make the tested feature (SSE2) explicit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230881 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-28 23:55:24 +00:00
Duncan P. N. Exon Smith
aee20c05e6 DebugInfo: Fix invalid file reference in CodeGen/X86/unknown-location.ll
There are two types of files in the old (current) debug info schema.

    !0 = !{!"some/filename", !"/path/to/dir"}
    !1 = !{!"0x29", !0} ; [ DW_TAG_file_type ]

!1 has a wrapper class called `DIFile` which inherits from `DIScope` and
is referenced in 'scope' fields.

!0 is called a "file node", and debug info nodes with a 'file' field
point at one of these directly -- although they're built in `DIBuilder`
by sending in a `DIFile` and reaching into it.

In the new hierarchy, I unified these nodes as `MDFile` (which `DIFile`
is a lightweight wrapper for) in r230057.  Moving the new hierarchy into
place (and upgrading testcases) caused CodeGen/X86/unknown-location.ll
to start failing -- apparently "0x29" was previously showing up in the
linetable as a filename, causing:

    .loc 2 4 3

(where 2 points at filename "0x29") instead of:

    .loc 1 4 3

(where 1 points at the actual filename).

Change the testcase to use the old schema correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230880 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-28 23:52:24 +00:00
Sanjay Patel
be657e70b1 fixed to test only the feature, not the feature and a CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230878 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-28 23:47:09 +00:00
Craig Topper
8df1c6ef09 [X86] Remove the blendpd/blendps/pblendw/pblendd intrinsics. They can represented by shuffle_vector instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230860 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-28 19:33:17 +00:00
Bill Schmidt
88bbdc790e Regenerated test case from pr 230801 for change in LLVM IR syntax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230811 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 23:29:57 +00:00
David Blaikie
ee62d93f1b Update SystemZ/Large test generators to handle new gep IR syntax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230810 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 23:29:39 +00:00
David Blaikie
834cb56c1b Update SystemZ/Large test generators to handle new load IR syntax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230809 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 23:29:33 +00:00
Bill Schmidt
52a5087a4b Revert test case until it can be fixed
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230803 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 22:31:14 +00:00
Bill Schmidt
0e1e8e2f62 [PowerPC] Fix PR22711 - Misaligned .toc section
Straightforward patch to emit an alignment directive when emitting a
TOC entry.  The test case was generated from the test in PR22711 that
demonstrated a misaligned .toc section.  The object code is run
through llvm-readobj to verify that the correct alignment has been
applied to the .toc section.

Thanks to Ulrich Weigand for running down where the fix was needed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230801 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 22:14:10 +00:00
David Blaikie
7c9c6ed761 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230794 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 21:17:42 +00:00
Charles Davis
dc64962c86 Target/X86: Never use the redzone for Win64 ABI functions.
Summary:
Until now, we did this (among other things) based on whether or not the
target was Windows. This is clearly wrong, not just for Win64 ABI functions
on non-Windows, but for System V ABI functions on Windows, too. In this
change, we make this decision based on the ABI the calling convention
specifies instead.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7953

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230793 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 21:11:16 +00:00
Hal Finkel
e03aac601f [PowerPC] Use vector types for memcpy and friends (sometimes)
When using Altivec, we can use vector loads and stores for aligned memcpy and
friends. Starting with the P7 and VXS, we have reasonable unaligned vector
stores. Starting with the P8, we have fast unaligned loads too.

For QPX, we use vector loads are stores, but only for aligned memory accesses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230788 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 19:58:28 +00:00
David Blaikie
198d8baafb [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 19:29:02 +00:00
Eric Christopher
930da21265 Remove the Forward Control Flow Integrity pass and its dependencies.
This work is currently being rethought along different lines and
if this work is needed it can be resurrected out of svn. Remove it
for now as no current work in ongoing on it and it's unused. Verified
with the authors before removal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230780 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 19:03:38 +00:00
Mehdi Amini
26d628d6ce Change the fast-isel-abort option from bool to int to enable "levels"
Summary:
Currently fast-isel-abort will only abort for regular instructions,
and just warn for function calls, terminators, function arguments.
There is already fast-isel-abort-args but nothing for calls and
terminators.

This change turns the fast-isel-abort options into an integer option,
so that multiple levels of strictness can be defined.
This will help no being surprised when the "abort" option indeed does
not abort, and enables the possibility to write test that verifies
that no intrinsics are forgotten by fast-isel.

Reviewers: resistor, echristo

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D7941

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230775 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 18:32:11 +00:00
Rafael Espindola
150233c378 Centralize handling of the eh_begin and eh_end labels.
This removes a bit of duplicated code and more importantly, remembers the
labels so that they don't need to be looked up by name.

This in turn allows for any name to be used and avoids a crash if the name
we wanted was already taken.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230772 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 18:18:39 +00:00
Renato Golin
636aacf211 Equally to NetBSD, Bitrig/ARM uses the Itanium-ABI.
Patch by Patrick Wildt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230762 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 16:35:27 +00:00
Zoran Jovanovic
2846ef3680 [mips][microMIPS] Change register class for GP register
Differential Revision: http://reviews.llvm.org/D7934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230760 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 15:03:50 +00:00
Petar Jovanovic
8407da0dbc Pass correct -mtriple for krait-cpu-div-attribute.ll
Not passing mtriple for one of the tests caused a regression failure
on MIPS buildbot. The issue was introduced by r230651.

Differential Revision: http://reviews.llvm.org/D7938


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230756 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 14:46:41 +00:00
Chandler Carruth
c4179ffed3 [x86] Run most of the rest of the shuffle combining over non-128-bit
vectors. This lets us fix the rest of the v16 lowering problems when
pshufb is clearly better.

We might still be able to improve some of the lowerings by enabling the
other combine-based rewriting to fire for non-128-bit vectors, but this
at least should remove any regressions from using the fancy v16i16
lowering strategy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230753 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 12:13:14 +00:00
Chandler Carruth
2d58cc5f1b [x86] Teach a bunch of the x86-specific shuffle combining to work with
256-bit vectors as well as 128-bit vectors. Fixes some of the redundant
shuffles for v16i16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230752 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 11:45:13 +00:00
Chandler Carruth
8c71e440a2 [x86] Make the v8i16 clever single-input shuffle lowering usable for
repeated 128-bit lane shuffles of wider vector types and use it to lower
256-bit v16i16 vector shuffles where applicable.

This should let us perfectly lowering the pattern of pshuflw and pshufhw
even for AVX2 256-bit patterns.

I've not added AVX-512 support, but it should be trivial for someone
working on that to wire up.

Note that currently this generates bad, long shuffle chains because we
don't combine 256-bit target shuffles. The subsequent patches will fix
that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230751 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 11:33:46 +00:00
Chandler Carruth
f5651f8ab6 [x86] Add a bunch more tests for v16i16 shuffles. All of these are taken
by mirroring v8i16 test cases across both 128-bit lanes. This should
highlight problems where we aren't correctly using 128-bit shuffles to
implement things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230750 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 11:25:10 +00:00
Vasileios Kalintiris
912e816cc2 [mips] Account for constant-zero operands in ADDE nodes.
Summary:
We identify the cases where the operand to an ADDE node is a constant
zero. In such cases, we can avoid generating an extra ADDu instruction
disguised as an identity move alias (ie. addu $r, $r, 0 --> move $r, $r).

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230742 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 09:01:39 +00:00
Charles Davis
d51be017f0 Target/X86: Save Win64 non-volatile registers in a Win64 ABI function.
Summary:
This change causes us to actually save non-volatile registers in a Win64
ABI function that calls a System V ABI function, and vice-versa.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230714 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 00:57:01 +00:00
Rafael Espindola
fc0ad8d28d Put jump tables in distinct sections if -ffunction-sections is used.
A small regression in r230411 was that we were basing the decision on
-fdata-sections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230707 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 23:55:11 +00:00
Chandler Carruth
b54c36fb4d [x86] Fix PR22706 where we would incorrectly try lower a v32i8 dynamic
blend as legal.

We made the same mistake in two different places. Whenever we are custom
lowering a v32i8 blend we need to check whether we are custom lowering
it only for constant conditions that can be shuffled, or whether we
actually have AVX2 and full dynamic blending support on bytes. Both are
fixed, with comments added to make it clear what is going on and a new
test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230695 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 22:15:34 +00:00
Reid Kleckner
783f7f989e Don't sibcall between SysV and Win64 convention functions
The shadow stack space expectations won't match.

Fixes PR22709.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230667 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 19:43:20 +00:00
Paul Robinson
b2f521b647 When the source has a series of assignments, users reasonably want to
have the debugger step through each one individually. Turn off the
combine for adjacent stores at -O0 so we get this behavior.

Possibly, DAGCombine shouldn't run at all at -O0, but that's for
another day; see PR22346.

Differential Revision: http://reviews.llvm.org/D7181


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230659 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:47:57 +00:00
Petar Jovanovic
e53d9df042 Fix justify error for small structures in varargs for MIPS64BE
There was a problem when passing structures as variable arguments.
The structures smaller than 64 bit were not left justified on MIPS64
big endian. This is now fixed by shifting the value to make it left-
justified when appropriate.

This fixes the bug http://llvm.org/bugs/show_bug.cgi?id=21608

Patch by Aleksandar Beserminji.

Differential Revision: http://reviews.llvm.org/D7881


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:35:15 +00:00
Sumanth Gundapaneni
adaebc8b56 Use ".arch_extension" ARM directive to support hwdiv on krait
In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the
features bits are not computed. This patch lets the asm printer
emit ".cpu cortex-a9" directive for krait and the hwdiv feature is
enabled through ".arch_extension". In short, krait is treated
as "cortex-a9" with hwdiv. We can not emit ".krait" as CPU since
it is not supported bu GNU GAS yet


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230651 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:08:41 +00:00
Tom Stellard
89e4328381 R600/SI: Remove M0 from DS assembly strings
This matches the assembly syntax for the proprietary compiler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230645 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 17:08:43 +00:00
Bruno Cardoso Lopes
bfa9a71f23 [X86][MMX] Fix a typo in a couple of tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230638 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 15:16:09 +00:00
Bruno Cardoso Lopes
dde2e4f7b9 [X86][MMX] Remove widening experimental flag from MMX tests.
Turns out that after the past MMX commits, we don't need to rely on this
flag to get better codegen for MMX. Also update the tests to become
triple neutral.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230637 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 15:10:38 +00:00
Vladimir Medic
d89ac8f158 Replace obsolete -mattr=n64 command line option with -target-abi=n64. No functional changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230628 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 12:29:48 +00:00
Hal Finkel
7840990de8 [PowerPC] Make LDtocL and friends invariant loads
LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node,
represent loads from the TOC, which is invariant. As a result, these loads can
be hoisted out of loops, etc. In order to do this, we need to generate
GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory
intrinsic node type. Once this is done, the MMO transfer is automatically
handled for TableGen-driven instruction selection, and for nodes generated
directly in PPCISelDAGToDAG, we need to transfer the MMOs manually.

Also, we were not transferring MMOs associated with pre-increment loads, so do
that too.

Lastly, this fixes an exposed bug where R30 was not added as a defined operand of
UpdateGBR.

This problem was highlighted by an example (used to generate the test case)
posted to llvmdev by Francois Pichet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 21:36:59 +00:00
David Majnemer
92d1637e2f X86, Win64: Allow 'mov' to restore the stack pointer if we have a FP
The Win64 epilogue structure is very restrictive, it permits a very
small number of opcodes and none of them are 'mov'.

This means that given:
  mov %rbp, %rsp
  pop %rbp

The mov isn't the epilogue, only the pop is.  This is problematic unless
a frame pointer is present in which case we are free to do whatever we'd
like in the "body" of the function.  If a frame pointer is present,
unwinding will undo the prologue operations in reverse order regardless
of the fact that we are at an instruction which is reseting the stack
pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 21:13:37 +00:00
Sanjoy Das
a0a0b40aa3 Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap
(The change was landed in r230280 and caused the regression PR22674.
This version contains a fix and a test-case for PR22674).
    
When emitting the increment operation, SCEVExpander marks the
operation as nuw or nsw based on the flags on the preincrement SCEV.
This is incorrect because, for instance, it is possible that {-6,+,1}
is <nuw> while {-6,+,1}+1 = {-5,+,1} is not.
    
This change teaches SCEV to mark the increment as nuw/nsw only if it
can explicitly prove that the increment operation won't overflow.
    
Apart from the attached test case, another (more realistic)
manifestation of the bug can be seen in
Transforms/IndVarSimplify/pr20680.ll.

Differential Revision: http://reviews.llvm.org/D7778



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230533 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 20:02:59 +00:00
Vladimir Medic
d692ee81e8 [MIPS]Multiple and add instructions for Mips are currently available in mips32r2/mips64r2 and later but should also be available in mips4, mips5, and mips64. This patch fixes the requested features and updates the corresponding test files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 15:24:37 +00:00
Bruno Cardoso Lopes
51fc7f5afa [X86][MMX] Reapply: Add MMX instructions to foldable tables
Reapply r230248.

Teach the peephole optimizer to work with MMX instructions by adding
entries into the foldable tables. This covers folding opportunities not
handled during isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230499 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 15:14:02 +00:00
Renato Golin
b451f4e376 Improve handling of stack accesses in Thumb-1
Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR,
STR, and ADD only allow offsets that are a multiple of 4. Make some changes
to better make use of these instructions:

* Use word loads for anyext byte and halfword loads from the stack.
* Enforce 4-byte alignment on objects accessed in this way, to ensure that
  the offset is valid.
* Do the same for objects whose frame index is used, in order to avoid having
  to use more than one ADD to generate the frame index.
* Correct how many bits of offset we think AddrModeT1_s has.

Patch by John Brawn.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230496 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 14:41:06 +00:00
Vladimir Medic
551fd3d820 Replace obsolete -mattr=n64 command line option with -target-abi=n64. No functional changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230482 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 11:43:01 +00:00
Hal Finkel
d37914a662 [PowerPC] Add triples to QPX tests
Some of these tests fail on Darwin systems because of a lack of a triple;
fix that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 01:26:59 +00:00
Hal Finkel
f8d179ba76 [PowerPC] Add support for the QPX vector instruction set
This adds support for the QPX vector instruction set, which is used by the
enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bytes
wide, holding 4 double-precision floating-point values. Boolean values, modeled
here as <4 x i1> are actually also represented as floating-point values
(essentially  { -1, 1 } for { false, true }). QPX shares many features with
Altivec and VSX, but is distinct from both of them. One major difference is
that, instead of adding completely-separate vector registers, QPX vector
registers are extensions of the scalar floating-point registers (lane 0 is the
corresponding scalar floating-point value). The operations supported on QPX
vectors mirrors that supported on the scalar floating-point values (with some
additional ones for permutations and logical/comparison operations).

I've been maintaining this support out-of-tree, as part of the bgclang project,
for several years. This is not the entire bgclang patch set, but is most of the
subset that can be cleanly integrated into LLVM proper at this time. Adding
this to the LLVM backend is part of my efforts to rebase bgclang to the current
LLVM trunk, but is independently useful (especially for codes that use LLVM as
a JIT in library form).

The assembler/disassembler test coverage is complete. The CodeGen test coverage
is not, but I've included some tests, and more will be added as follow-up work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230413 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 01:06:45 +00:00
Rafael Espindola
76bdd01e0e Support SHF_MERGE sections in COMDATs.
This patch unifies the comdat and non-comdat code paths. By doing this
it add missing features to the comdat side and removes the fixed
section assumptions from the non-comdat side.

In ELF there is no one true section for "4 byte mergeable" constants.
We are better off computing the required properties of the section
and asking the context for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230411 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 00:52:15 +00:00
Eric Christopher
7c5314a076 Make this test even more OS and register allocation neutral.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230404 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 00:12:11 +00:00
Eric Christopher
8269a59b1c Make this test not dependent upon the triple. All that was needed
was some flexibility in the check line for the comment basic block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230400 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 23:43:26 +00:00
Simon Pilgrim
41cda40157 Reapplied D7816 & rL230177 & rL230278 - with an additional fix toensure that the smallest build vector input scalar type is always used. Additional (crash) test cases already committed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230388 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 22:08:56 +00:00
Simon Pilgrim
0115755020 Added test case for PR22678 (check CONCAT_VECTORS DAG combiner pass doesn't introduce illegal types)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 21:46:23 +00:00
Andrew Kaylor
8f475e9d77 Fixing eol-style
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 20:49:35 +00:00