13044 Commits

Author SHA1 Message Date
David Majnemer
76d3a99d10 Revert "COFF: Let globals with private linkage reside in their own section"
This reverts commit r232539.  This was committed accidently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 20:41:11 +00:00
David Majnemer
6526150f82 COFF: Let globals with private linkage reside in their own section
Summary:
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8374

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 20:39:25 +00:00
Richard Barton
b59aee170f [ARM] Fix offset calculation in ARMBaseRegisterInfo::needsFrameBaseReg
The input offset to needsFrameBaseReg is a negative value below the top of the
stack frame, but when converting to a positive offset from the bottom of the
stack frame this value was negated, causing the final offset to be too large
by twice the input offset's magnitude. Fix that by not negating the offset.

Patch by John Brawn

Differential Revision: http://reviews.llvm.org/D8316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 18:20:47 +00:00
Samuel Antao
7684e7d987 Fix R0 use in PowerPC VSX store for FastIsel.
The VSX stores are sometimes generated with a undefined index register, causing %noreg to be used and R0 to be emitted later on. The semantics of the VSX store (e.g. stdsdx) requires R0 to be used as base if we want zero to be used in the computation of the effective address instead of the content of R0. This patch checks if no index register was generated and forces R0 to be used as base address.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 15:00:57 +00:00
Rafael Espindola
cebed4aaf1 Use createTempSymbol to avoid collisions instead of an ad hoc method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 14:50:32 +00:00
Rafael Espindola
99739705ac Call EmitFunctionHeader just before EmitFunctionBody.
This avoids switching to .AMDGPU.config and back and hardcoding the
section it switches back to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232479 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 14:34:42 +00:00
Rafael Espindola
a480f88b3c Move the EH symbol to the asm printer and use it for the SJLJ case too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232475 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 13:57:48 +00:00
Rafael Espindola
4d3df54336 Replace a use of GetTempSymbol with createTempSymbol.
This is cleaner and avoids a crash in a corner case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232471 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 12:54:04 +00:00
Renato Golin
ce1f16421f [ARM] Add support for ARMV6K subtarget (LLVM)
ARMv6K is another layer between ARMV6 and ARMV6T2. This is the LLVM
side of the changes.

ARMV6 family LLVM implementation.

+-------------------------------------+
| ARMV6                               |
+----------------+--------------------+
| ARMV6M (thumb) | ARMV6K (arm,thumb) | <- From ARMV6K and ARMV6M processors
+----------------+--------------------+    have support for hint instructions
| ARMV6T2 (arm,thumb,thumb2)          |    (SEV/WFE/WFI/NOP/YIELD). They can
+-------------------------------------+    be either real or default to NOP.
| ARMV7 (arm,thumb,thumb2)            |    The two processors also use
+-------------------------------------+    different encoding for them.

Patch by Vinicius Tinti.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 11:55:28 +00:00
Ahmed Bougacha
df08543f48 [AArch64] Use intermediate step for concat_vectors of illegal truncs.
Optimize concat_vectors of truncated vectors, where the intermediate
type is illegal, to avoid said illegality,  e.g.,
  (v4i16 (concat_vectors (v2i16 (truncate (v2i64))),
                         (v2i16 (truncate (v2i64)))))
->
  (v4i16 (truncate (v4i32 (concat_vectors (v2i32 (truncate (v2i64))),
                                          (v2i32 (truncate (v2i64)))))))

This isn't really target-specific, and, as such, would best go in the
DAGCombiner.  However, ISD::TRUNCATE legality isn't keyed on both input
and result type, so we might generate worse code when we don't know
better.  On AArch64 we know it's fine for v2i64->v4i16 and v4i32->v8i8.
rdar://20022387


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232459 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 03:23:09 +00:00
David Majnemer
759acf348d CodeGen: @llvm.eh.typeid.for replaced @llvm.eh.typeid.for.i32
We removed @llvm.eh.typeid.for.i32 and replaced it with
@llvm.eh.typeid.for quite some time ago.  Fix up some test cases which
never got updated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 21:36:38 +00:00
Duncan P. N. Exon Smith
763e18696f DebugInfo: Fix testcases that fail -verify-debug-info=true
As part of PR22777, fix testcases that fail the debug info verifier.
The changes fall into the following categories:

  - Empty `filename:` fields in `MDFile`s.  Compile units and some types
    require non-empty filenames.  A number of testcases have empty
    filenames, probably due to hand-reduction of testcases.
  - Not-quite empty arrays: `!{i32 0}`.  This used to be equivalent in
    the debug info schema to `!{}`.  They cause problems for
    `!MDSubroutineType`'s `types:` array, since it requires all operands
    to be valid types.  (Note that `!{null}` is the correct type array
    for functions that take no arguments and return `void`.)
  - Significantly bitrotted testcases.  Nodes got left behind a few
    upgrades ago because of missing or invalid tags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 21:10:12 +00:00
Sanjay Patel
8233e8c233 fixed to test feature, not CPU
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232398 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 18:24:28 +00:00
Sanjay Patel
89095a7882 add CHECK-LABELs for more reliable testing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232391 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 17:59:07 +00:00
Sanjay Patel
8f74fd0883 fixed to test feature, not CPU; removed unnecessary declaration
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 17:01:34 +00:00
Tom Stellard
6ebc34281f R600/SI: don't try min3/max3/med3 with f64
There are no opcodes for this. This also adds a test case.

v2: make test more robust

Patch by: Grigori Goronzy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 15:53:55 +00:00
Petar Jovanovic
b3b90bd679 [MIPS] Fix justify error for small structures
Fix justify error for small structures bigger than 32 bits in fixed
arguments for MIPS64 big endian. There was a problem when small structures
are passed as fixed arguments. The structures that are bigger than 32 bits
but smaller than 64 bits were not left justified properly on MIPS64 big
endian. This is fixed by shifting the value to make it left justified when
appropriate.

Patch by Aleksandar Beserminji.

Differential Revision: http://reviews.llvm.org/D8174


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232382 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 15:01:09 +00:00
Rafael Espindola
8d8c155a61 Use the i8 immediate cmp instructions when possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-16 14:25:08 +00:00
Simon Pilgrim
4f3864d05f [SSE} Added tests for float4-float3 conversions (PR11580)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-15 16:19:15 +00:00
Simon Pilgrim
54db4092c1 Simplified some stack folding tests.
Replaced explicit pmovzx* intrinsic tests with general shuffles

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232286 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 23:16:43 +00:00
Daniel Jasper
439cc2c5de [MachineLICM] First steps of sinking GEPs near calls.
Specifically, if there are copy-like instructions in the loop header
they are moved into the loop close to their uses. This reduces the live
intervals of the values and can avoid register spills.

This is working towards a fix for http://llvm.org/PR22230.
Review: http://reviews.llvm.org/D7259

Next steps:
- Find a better cost model (which non-copy instructions should be sunk?)
- Make this dependent on register pressure

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232262 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 10:58:38 +00:00
Ahmed Bougacha
4a2d95826e Add a bunch of CHECK missing colons in tests. NFC.
Some wouldn't pass;  fixed most, the rest will be fixed separately.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232239 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-14 01:43:57 +00:00
Rafael Espindola
89c84b0c83 Use add32ri8 and friends on fast isel.
This fixes pr22854.

The core issue on the bug is that there are multiple instructions that
print the same in assembly. In fact, there doesn't seem to be any
syntax for specifying that a constant that fits in 8 bits should use a 32 bit
immediate.

The attached patch changes fast isel to consider i16immSExt8,
i32immSExt8, and i64immSExt8. They were disabled because fastisel didn’t know
to call the predicate back in the day.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 22:18:18 +00:00
David Blaikie
5a70dd1d82 [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232184 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 18:20:45 +00:00
Andrea Di Biagio
d288259ccd [X86][AVX] Fix wrong lowering of v4x64 shuffles into concat_vector plus extract_subvector nodes.
This patch fixes a bug in the shuffle lowering logic implemented by function
'lowerV2X128VectorShuffle'.

The are few cases where function 'lowerV2X128VectorShuffle' wrongly expands a
shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR
nodes. The problematic expansion only occurs when the shuffle mask M has an
'undef' element at position 2, and M is equivalent to mask <0,1,4,5>.
In that case, the algorithm propagates the wrong vector to one of the two
new EXTRACT_SUBVECTOR nodes.

Example:
;;
define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
entry:
  %0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32><i32 undef, i32 1, i32 undef, i32 5>
  ret <4 x double> %0
}
;;

Before this patch, llc (-mattr=+avx) generated:
  vinsertf128 $1, %xmm0, %ymm0, %ymm0

With this patch, llc correctly generates:
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

Added test lower-vec-shuffle-bug.ll

Differential Revision: http://reviews.llvm.org/D8259


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232179 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 17:29:49 +00:00
Matt Arsenault
462f98dd60 R600/SI: Add test for min / max with immediate
Make sure this isn't getting confused by canonicalizations
of comparisons with a constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232177 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 16:43:48 +00:00
Hao Liu
fcc897cc45 [MachineCopyPropagation] Fix a bug causing incorrect removal for the instruction sequences as follows
%Q5_Q6<def> = COPY %Q2_Q3
   %D5<def> =
   %D3<def> =
   %D3<def> = COPY %D6     // Incorrectly removed in MachineCopyPropagation
   Using of %D3 results in incorrect result ...

   Reviewed in http://reviews.llvm.org/D8242 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232142 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 05:15:23 +00:00
Sanjay Patel
cae9695fbb [X86, AVX2] Replace inserti128 and extracti128 intrinsics with generic shuffles
This should complete the job started in r231794 and continued in r232045:
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.

AVX2 introduced proper integer variants of the hacked integer insert/extract
C intrinsics that were created for this same functionality with AVX1.

This should complete the removal of insert/extract128 intrinsics.

The Clang precursor patch for this change was checked in at r232109.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 23:16:18 +00:00
Simon Pilgrim
7385cafb7a Removed useless palignr test - we don't actually provide a llvm.x86.ssse3.palign.r.128 intrinsic
Differential Revision: http://reviews.llvm.org/D8302

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232108 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 21:42:03 +00:00
Tom Stellard
3d712a6373 R600/SI: Remove _e32 and _e64 suffixes from mnemonics
Instead print them as part of the $dst operand.  The AsmMatcher
requires the 32-bit and 64-bit encodings have the same mnemonic in
order to parse them correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232105 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 21:34:22 +00:00
Andrew Kaylor
d434c0d548 Adding WinEHPrepare tests (currently XFAILs)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232104 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 21:32:59 +00:00
Krzysztof Parzyszek
49fa37992d Unxfail passing test on Hexagon
test/CodeGen/Generic/2008-02-20-MatchingMem.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232098 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 20:38:10 +00:00
Quentin Colombet
be45e0e669 [X86] Fix a regression introduced by r223641.
The permps and permd instructions have their operands swapped compared to the
intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP
category.

I did not create a new category for those two, as they are the only one AFAICT
in that case.

<rdar://problem/20108262>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232085 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 19:34:12 +00:00
Krzysztof Parzyszek
7b110fe366 Remove unused complex patterns for addressing modes on Hexagon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232057 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 16:44:50 +00:00
Andrea Di Biagio
be9322ae7c [X86] Fix wrong target specific combine on SETCC nodes.
Part of the folding logic implemented by function 'PerformISDSETCCCombine'
only worked under the assumption that the condition code in input could have
been either SETNE or SETEQ.
Unfortunately that assumption was incorrect, and in some cases the algorithm
ended up incorrectly folding SETCC nodes.

The incorrect folding only affected SETCC dag nodes where:
 - one of the operands was a build_vector of all zeroes;
 - the other operand was a SIGN_EXTEND from a vector of MVT:i1 elements;
 - the condition code was neither SETNE nor SETEQ.

Example:
  (setcc (v4i32 (sign_extend v4i1:%A)), (v4i32 VectorOfAllZeroes), setge)

Before this patch, the entire dag node sequence from the example was
incorrectly folded to node %A.

With this patch, the dag node sequence is folded to a
  (xor %A, (v4i1 VectorOfAllOnes)).

Added test setcc-combine.ll.

Thanks to Greg Bedwell for spotting this issue.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232046 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 15:16:58 +00:00
Sanjay Patel
b4c1547749 [X86, AVX] replace vextractf128 intrinsics with generic shuffles
Now that we've replaced the vinsertf128 intrinsics, 
do the same for their extract twins.

This is very much like D8086 (checked in at r231794):
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.

This is also the LLVM sibling to the cfe D8275 patch.

Differential Revision: http://reviews.llvm.org/D8276



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232045 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 15:15:19 +00:00
Simon Pilgrim
df0acf35f3 [X86][AVX2] Added missing palignr stack folding test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232033 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 13:12:33 +00:00
Jingyue Wu
3ea0adcdd5 [NVPTXAsmPrinter] do not print .align on function headers
Summary:
PTX does not allow .align directives on function headers.

Fixes PR21551.

Test Plan: test/Codegen/NVPTX/function-align.ll

Reviewers: eliben, jholewinski

Reviewed By: eliben, jholewinski

Subscribers: llvm-commits, eliben, jpienaar, jholewinski

Differential Revision: http://reviews.llvm.org/D8274

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 01:50:30 +00:00
Reid Kleckner
b53bb04b2f Remove some CHECK-NOT lines in favor of CHECK-NEXT
NFC, this is just shorter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232000 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 01:38:48 +00:00
Reid Kleckner
7dedaabcae Stop calling DwarfEHPrepare from WinEHPrepare
Instead, run both EH preparation passes, and have them both ignore
functions with unrecognized EH personalities. Pass delegation involved
some hacky code for creating an AnalysisResolver that we don't need now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-12 00:36:20 +00:00
Reid Kleckner
70992ac969 Handle big index in getelementptr instruction
CodeGen incorrectly ignores (assert from APInt) constant index bigger
than 2^64 in getelementptr instruction. This is a test and fix for that.

Patch by Paweł Bylica!

Reviewed By: rnk

Subscribers: majnemer, rnk, mcrosier, resistor, llvm-commits

Differential Revision: http://reviews.llvm.org/D8219

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231984 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 23:36:10 +00:00
Andrew Kaylor
1134ac4a0f Extended support for native Windows C++ EH outlining
Differential Review: http://reviews.llvm.org/D7886



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231981 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 23:22:06 +00:00
Jozef Kolek
a2b4e9a30e [mips][microMIPS] Make usage of NOT16 by code generator
Differential Revision: http://reviews.llvm.org/D7748


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231963 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 20:28:31 +00:00
Sanjay Patel
02402b3cc1 add CHECK-LABELs for better reliability
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231962 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 20:12:07 +00:00
Rafael Espindola
0c78583bf6 Put jump tables in unique sections on COFF.
If a function is going in an unique section (because of -ffunction-sections
for example), putting a jump table in .rodata will keep .rodata alive and
that will keep alive any other function that also has a jump table.

Instead, put the jump table in a unique section that is associated with the
function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231961 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 19:58:37 +00:00
Tim Northover
52f83a9ab3 ARM: simplify and extend byval handling
The main issue being fixed here is that APCS targets handling a "byval align N"
parameter with N > 4 were miscounting what objects were where on the stack,
leading to FrameLowering setting the frame pointer incorrectly and clobbering
the stack.

But byval handling had grown over many years, and had multiple layers of cruft
trying to compensate for each other and calculate padding correctly. This only
really needs to be done once, in the HandleByVal function. Elsewhere should
just do what it's told by that call.

I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits
byvals with the correct C ABI alignment), which simplified HandleByVal.

rdar://20095672

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231959 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 18:54:22 +00:00
Derek Schuff
87e6561f34 Make NaCl's use of .init_array for static constructors match Linux
Summary:
The generic ELF TargetObjectFile defaults to .ctors, but Linux's
defaults to .init_array by calling InitializeELF with the value of
UseInitArray from TargetMachine. Make NaCl's behavior match.

Reviewers: jvoung
Differential Revision: http://reviews.llvm.org/D8240

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231934 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 16:16:09 +00:00
Quentin Colombet
7775242da3 [CodeGenPrepare] Refine the cost model provided by the promotion helper.
- Use TargetLowering to check for the actual cost of each extension.
- Provide a factorized method to check for the cost of an extension:
  TargetLowering::isExtFree.
- Provide a virtual method TargetLowering::isExtFreeImpl for targets to be able
  to tune the cost of non-free extensions.

This refactoring offers a better granularity to model what really happens on
different targets.

No performance changes and very few code differences.

Part of <rdar://problem/19267165> 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 21:48:15 +00:00
Nemanja Ivanovic
dc12298109 Add support for part-word atomics for PPC
http://reviews.llvm.org/D8090#inline-67337


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231843 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:51:07 +00:00
Ahmed Bougacha
4a3cd42601 [AArch64] Avoid going through GPRs for across-vector instructions.
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
  (v4i32 (uaddv ...))
is the same as
  (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
  (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
           (i32 (int_aarch64_neon_uaddv ...)), ssub)

In a combine, we transform all such across-vector-lanes intrinsics to:

  (i32 (extract_vector_elt (uaddv ...), 0))

This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs.  Consider:

    uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
        return vmulq_n_u32(a, vaddvq_u32(b));
    }

We now generate:
    addv.4s  s1, v1
    mul.4s   v0, v0, v1[0]
instead of the previous:
    addv.4s  s1, v1
    fmov     w8, s1
    dup.4s   v1, w8
    mul.4s   v0, v1, v0

rdar://20044838


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231840 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:45:38 +00:00