Commit Graph

31173 Commits

Daniel Sanders
f814c6418d [mips] Removed MipsCC::fixedArgFn(). NFC
Summary:
There is one remaining trace of it in MipsCC::analyzeCallOperands() where
Mips16 might override the calling convention. This will be moved into
tablegen-erated code later.

Depends on D5965

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221053 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 17:44:51 +00:00
Daniel Sanders
5e4b155521 [tablegen] Add CustomCallingConv and use it to tablegen-erate the outermost parts of the Mips O32 implementation
Summary:
CustomCallingConv is simply a CallingConv that tablegen should not generate the
implementation for. It allows regular CallingConvs to delegate to these custom
functions. This is (currently) necessary for Mips, and we cannot use CCCustom
without having to adapt to the different API that CCCustom uses.
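
A minimal sketch of the delegation shape this enables (hand-rolled stand-in
types and hypothetical function names, not the real TableGen output or the
LLVM CCAssignFn signature):

  // Hand-written custom function: the body TableGen does not try to generate.
  struct ArgInfo { unsigned ValNo; bool Assigned; };

  static bool CC_MipsO32_Custom(ArgInfo &Arg) {
    Arg.Assigned = true;  // the real code would pick registers / stack slots
    return true;          // 'true' = argument fully handled
  }

  // tablegen-erated wrapper: an ordinary CallingConv that simply delegates.
  static bool CC_MipsO32(ArgInfo &Arg) {
    if (CC_MipsO32_Custom(Arg))
      return true;
    return false;         // otherwise fall through to the remaining rules
  }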

This brings us a bit closer to being able to remove
MipsCC::analyzeCallOperands and MipsCC::analyzeFormalArguments in favour of
the common implementation.

No functional change to the targets.

Depends on D3341

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: vmedic, llvm-commits

Differential Revision: http://reviews.llvm.org/D5965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221052 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 17:38:22 +00:00
Rafael Espindola
5793838fc8 Remove redundant calls to isMaterializable.
This removes calls to isMaterializable in the following cases:

* It was redundant with a call to isDeclaration now that isDeclaration returns
  the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.
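
A hedged before/after sketch of the second case, using a stand-in type rather
than the real llvm::Function / error-handling API:

  #include <system_error>

  struct Fn {
    bool NeedsBody = true;
    std::error_code materialize() { NeedsBody = false; return {}; }  // no-op if already materialized
    bool isMaterializable() const { return NeedsBody; }
  };

  // Before: redundant guard around the call.
  std::error_code ensureBodyBefore(Fn &F) {
    if (F.isMaterializable())
      if (std::error_code EC = F.materialize())
        return EC;
    return {};
  }

  // After: just call materialize() and check the error code.
  std::error_code ensureBodyAfter(Fn &F) {
    if (std::error_code EC = F.materialize())
      return EC;
    return {};
  }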

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221050 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 16:46:18 +00:00
Daniel Sanders
e2add4346d Revert r221048 - Test commit
It seems I can't commit unless $DBUS_SESSION_BUS_ADDRESS is set correctly and
it is not set for ssh sessions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221049 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 16:08:03 +00:00
Daniel Sanders
211ed6d001 Test commit
Added some whitespace to debug some authentication issues I'm having.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221048 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 16:00:40 +00:00
Adrian Prantl
bdec4aee7b Revert "Temporarily revert r220777 to sort out build bot breakage."
This reverts commit r221028. Later commits depend on this and
reverting just this one causes even more bots to fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221041 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 03:19:45 +00:00
NAKAMURA Takumi
5ea912ac8d Revert r220779, "[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT transforms to AND in PerformSELECTCombine."
Since r221028 (reverting r220777), this caused failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221040 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 01:36:14 +00:00
Adrian Prantl
a4e1564971 Temporarily revert r220777 to sort out build bot breakage.
"[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221028 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 00:26:59 +00:00
Duncan P. N. Exon Smith
3a84a6377c IR: MDNode => Value: Instruction::getMetadata()
Change `Instruction::getMetadata()` to return `Value` as part of
PR21433.

Update most callers to use `Instruction::getMDNode()`, which wraps the
result in a `cast_or_null<MDNode>`.
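
A rough sketch of the caller-facing shape (stand-in classes and a dynamic_cast
in place of LLVM's cast_or_null<MDNode>; not the real llvm::Instruction API):

  struct Value { virtual ~Value() = default; };
  struct MDNode : Value {};

  struct Instruction {
    Value *MD = nullptr;
    // Now returns the more general Value...
    Value *getMetadata() const { return MD; }
    // ...while this convenience wrapper recovers the MDNode view most callers want.
    MDNode *getMDNode() const {
      return MD ? dynamic_cast<MDNode *>(MD) : nullptr;  // ~ cast_or_null<MDNode>(MD)
    }
  };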

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221024 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-01 00:10:31 +00:00
Reid Kleckner
8d2e511d3b Revert "R600: Add missing file to CMakeLists.txt"
This reverts commit r220998.

It should've been reverted with the other change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221021 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 23:39:10 +00:00
Reid Kleckner
a5607fb841 Revert "R600: Make sure to inline all internal functions"
This reverts commit r220996.

It introduced layering violations causing link errors in many
configurations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221020 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 23:35:26 +00:00
Reid Kleckner
e1a4787d5d Work around bugs in MSVC "14" CTP 3's conversion logic
It appears to either ignore, or consider ambiguous, MachineInstrBuilder's
conversion operators that allow conversion to MachineInstr* and
MachineBasicBlock::bundle_iterator.

As a workaround, add an explicit way to get the MachineInstr.
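
A self-contained illustration of the workaround pattern (simplified stand-ins,
not the actual MachineInstrBuilder):

  struct MachineInstr {};
  struct BundleIterator {};

  class Builder {
    MachineInstr *MI;
  public:
    explicit Builder(MachineInstr *I) : MI(I) {}
    operator MachineInstr *() const { return MI; }   // the implicit conversions the
    operator BundleIterator() const { return {}; }   // buggy compiler mishandles
    MachineInstr *getInstr() const { return MI; }    // explicit, unambiguous accessor
  };

  int main() {
    MachineInstr I;
    Builder B(&I);
    MachineInstr *P = B.getInstr();  // works even when the conversions are rejected
    return P == &I ? 0 : 1;
  }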

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221017 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 23:19:46 +00:00
Tom Stellard
03fb370534 R600: Add IPO to the list of required libraries
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221004 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 21:52:08 +00:00
Tom Stellard
7dd15e65fb R600: Add missing file to CMakeLists.txt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220998 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 20:56:36 +00:00
Tom Stellard
b5c86504a0 R600: Don't promote allocas when one of the users is a ptrtoint instruction
We need to figure out how to track ptrtoint values all the
way until the result is converted back to a pointer in order
to correctly rewrite the pointer type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 20:52:04 +00:00
Tom Stellard
5d6cee5e65 R600: Make sure to inline all internal functions
Function calls aren't supported yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220996 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 20:52:02 +00:00
Bill Schmidt
2d32816a45 [PowerPC] Initial VSX intrinsic support, with min/max for vector double
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions.  This patch adds
basic support for VSX intrinsics in general, and tests it by
implementing intrinsics for minimum and maximum for the vector double
data type.

The LLVM portion of this is quite straightforward.  There is a
companion patch for Clang.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220988 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 19:19:07 +00:00
Chad Rosier
b9ce924135 [AArch64] Check Dest Register Liveness in CondOpt pass.
Our internal test reveals that such a case should not be transformed:

  cmp x17, #3
  b.lt .LBB10_15
  ...
  subs x12, x12, #1
  b.gt .LBB10_1

where x12 is a liveout, becomes:

  cmp x17, #2
  b.le .LBB10_15
  ...
  subs x12, x12, #2
  b.ge .LBB10_1

Unable to provide a test case as it's difficult to reproduce on the community branch.
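
A loose sketch of the guard the fix adds (hypothetical helper names, not the
actual AArch64 condition-optimization code):

  #include <set>

  struct CmpInstr { unsigned DestReg; int Imm; };

  // Only bump the immediate when the rewritten result cannot be observed
  // outside the block (e.g. x12 above is a live-out, so it must be skipped).
  bool tryAdjustImmediate(CmpInstr &I, const std::set<unsigned> &LiveOuts) {
    if (LiveOuts.count(I.DestReg))
      return false;
    ++I.Imm;  // e.g. subs ..., #1 -> subs ..., #2 paired with b.gt -> b.ge
    return true;
  }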

http://reviews.llvm.org/D6048
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220987 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 19:02:38 +00:00
Quentin Colombet
9b6ca9304c [CodeGenPrepare] Move extractelement close to store if they can be combined.
This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
To make that possible, the optimization may promote any scalar operations in
the chain to vector operations.


** Context **

Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur copy from one
register file to another. These copies are not coalescable and may be expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.


** Motivating Example **

Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
 %in1 = load <2 x i32>* %addr1, align 8
 %extract = extractelement <2 x i32> %in1, i32 1
 %out = or i32 %extract, 1
 store i32 %out, i32* %dest, align 4
 ret void
}

As it is, this IR generates the following assembly on armv7:
  vldr    d16, [r0]         @ vector load
  vmov.32 r0, d16[1]        @ cross-register-file copy: 20 cycles
  orr     r0, r0, #1        @ scalar bitwise or
  str     r0, [r1]          @ scalar store
  bx      lr

Whereas we could generate much faster code:
  vldr    d16, [r0]         @ vector load
  vorr.i32 d16, #0x1        @ vector bitwise or
  vst1.32 {d16[1]}, [r1:32] @ vector extract + store
  bx      lr

Half of the computation made in the vector domain is useless, but this allows us
to get rid of the expensive cross-register-file copy.


** Proposed Solution **

To avoid this cross-register-copy penalty, we promote the scalar operations to
vector operations. The penalty will be removed if we manage to promote the whole
chain of computation in the vector domain.
Currently, we do that only when the chain of computation ends by a store and the
target is able to combine an extract with a store.

Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and thus extracted at some point[1]. Moreover,
it is common for targets to feature stores that perform a vector extract (see
AArch64 and X86 for instance).

The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate at this level of detail, and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically, in TargetTransformInfo everything that is legal has a cost of 1,
whereas, even if a vector type is legal, a vector operation is usually slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counterbalanced by the saving of the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.
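
A simplified sketch of the trade-off being described (made-up numbers and
helper, not the real TargetTransformInfo interface):

  struct CostModel {
    int ScalarOpCost = 1;       // what TTI reports for a legal scalar op
    int VectorOpCost = 1;       // what TTI reports for a legal vector op (often optimistic)
    int CrossRegFileCopy = 20;  // e.g. NEON-to-GPR move on Cortex-A8; only 4 on AArch64
  };

  // ChainLength: scalar instructions between the extractelement and the store
  // that would have to be rewritten in the vector domain.
  bool shouldPromote(const CostModel &C, int ChainLength, bool StoreCombinesExtract) {
    if (!StoreCombinesExtract)
      return false;  // the extract would just reappear somewhere else
    int ScalarCost = ChainLength * C.ScalarOpCost + C.CrossRegFileCopy;
    int VectorCost = ChainLength * C.VectorOpCost;  // extract folded into the store
    return VectorCost < ScalarCost;  // with cost == 1 everywhere this is almost always true
  }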

For now, the optimization is only enabled for ARM prior to v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. These two factors limit the effects of the inaccuracy;
I did not want to build up a fancy cost model with block frequency and
everything else on top of that.

[1] We can imagine targets that can combine an extractelement with instructions
other than just stores. If we want to go in that direction, the current
interfaces must be augmented and, moreover, I think this becomes a global isel
problem.

Differential Revision: http://reviews.llvm.org/D5921

<rdar://problem/14170854>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220978 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 17:52:53 +00:00
Chad Rosier
66d3a86a9a [AArch64] CondOpt pass is missing FCMP instructions when searching backward for
a CMP which defines the flags used by B.CC.

http://reviews.llvm.org/D6047
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220961 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 15:17:36 +00:00
Ulrich Weigand
8a9c531e9a [PowerPC] Load BlockAddress values from the TOC in 64-bit SVR4 code
Since block address values can be larger than 2GB in 64-bit code, they
cannot be loaded simply using an @l / @ha pair, but instead must be
loaded from the TOC, just like GlobalAddress, ConstantPool, and
JumpTable values are.

The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where
temporary labels could not be used as TOC values, since code would
attempt (and fail) to use GetOrCreateSymbol to create a symbol of the
same name as the temporary label.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220959 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-31 10:33:14 +00:00
Robert Khasanov
7d18d46ef2 [AVX512] Added VBROADCAST{SS/SD} encoding for VL subset.
Refactored through AVX512_maskable
        


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220908 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-30 14:21:47 +00:00
Robert Khasanov
63c2f3292e [AVX512] Implemented AVX512VL FP binary packed instructions (VADDP*, VSUBP*, VMULP*, VDIVP*, VMAXP*, VMINP*)
Refactored through AVX512_maskable
Added encoding tests for them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220858 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-29 15:43:02 +00:00
Robert Khasanov
e1610162fb [AVX512] Fix VSQRT packed instructions internal names.
No functional change


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 18:22:41 +00:00
Robert Khasanov
9371efbcdb [AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset.
Refactored through AVX512_maskable



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220806 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 18:15:20 +00:00
Robert Khasanov
59cb03d329 [AVX-512] Expanded rsqrt/rcp instructions to VL subset.
Refactored multiclass through AVX512_maskable



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220783 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 16:37:13 +00:00
Robert Khasanov
d4345dd85f [AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT transforms to AND in PerformSELECTCombine.
No functional change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 16:17:14 +00:00
Robert Khasanov
edf556ec1f [x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros.
This transformation used to work only when the selector was produced by a SETCC;
however, SETCC is needed only if we consider swapping the operands, so I replaced
the SETCC check for this case.
Added tests for vselect of <X x i1> values.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 15:59:40 +00:00
Robert Khasanov
4a52493457 [AVX512] Bring back vector-shuffle lowering support through broadcasts
After the commit at rev219046, 512-bit broadcast lowering became non-optimal. Most of the tests on broadcasting and embedded broadcasting were changed and they no longer produce efficient code.

The example below is from the commit's changes (it's the first test from test/CodeGen/X86/avx512-vbroadcast.ll):

 define   <16 x i32> @_inreg16xi32(i32 %a) {
 ; CHECK-LABEL: _inreg16xi32:
 ; CHECK:       ## BB#0:
-; CHECK-NEXT:    vpbroadcastd %edi, %zmm0
+; CHECK-NEXT:    vmovd %edi, %xmm0
+; CHECK-NEXT:    vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT:    vinserti64x4 $1, %ymm0, %zmm0, %zmm0
 ; CHECK-NEXT:    retq
 %b = insertelement <16 x i32> undef, i32 %a, i32 0
 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
 ret <16 x i32> %c
}

Here, a 256-bit broadcast was generated instead of a 512-bit one.

In this patch:
1) I added vector-shuffle lowering through broadcasts
2) Removed asserts and branches like the following, because they are incorrect:
-  assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
3) Fixed the lowering tests


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220774 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 12:28:51 +00:00
Reid Kleckner
d5de327da0 X86: Implement the vectorcall calling convention
This is a Microsoft calling convention that supports both x86 and x86_64
subtargets. It passes vector and floating point arguments in XMM0-XMM5,
and passes them indirectly once they are consumed.

Homogenous vector aggregates of up to four elements can be passed in
sequential vector registers, but this part is not implemented in LLVM
and will be handled in Clang.

On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as
integer register parameters and is callee cleanup. On x86_64, it
delegates to the normal win64 calling convention.
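
For reference, a small user-level example of opting into the convention
(assumes MSVC or clang-cl, since __vectorcall is a Microsoft extension; this is
illustrative and not part of the patch itself):

  #include <immintrin.h>

  // Vector and floating-point arguments become candidates for XMM0-XMM5.
  float __vectorcall dot2(__m128 a, __m128 b) {
    __m128 p = _mm_mul_ps(a, b);              // lane-wise products
    __m128 hi = _mm_shuffle_ps(p, p, 1);      // bring lane 1 down to lane 0
    return _mm_cvtss_f32(_mm_add_ss(p, hi));  // p[0] + p[1]
  }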

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D5943

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220745 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 01:29:26 +00:00
Tim Northover
dd778c6c9f AArch64: enable Cortex-A57 FP balancing on Cortex-A53.
Benchmarks have shown that it's harmless to performance there, and having a
unified set of passes between the two cores where possible helps big.LITTLE
deployment.

Patch by Z. Zheng.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220744 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-28 01:24:32 +00:00
NAKAMURA Takumi
9ef2fd8775 AArch64InstrInfo.h: Fix a warning introduced in clang r220703. [-Winconsistent-missing-override]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220739 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 23:29:27 +00:00
Adam Nemet
6bc8d95153 [AVX512] Add vpermil variable version
This is implemented via a multiclass that derives from the vperm imm
multiclass.

Fixes <rdar://problem/18426089>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220737 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 23:08:40 +00:00
Adam Nemet
5c76721372 [AVX512] Clean up avx512_perm_imm to use X86VectorVTInfo
No functionality change.  No change in X86.td.expanded except that we only set
the CD8 attributes for the memory variants.  (This shouldn't be used unless we
have a memory operand.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220736 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 23:08:37 +00:00
Adam Nemet
7ba4de2ccc [AVX512] Derive vpermil* from avx512_perm_imm
This used to derive from avx512_pshuf_imm, which was confusing.

NFC.  Compared X86.td.expanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220735 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 23:08:34 +00:00
Adam Nemet
d0ee9ada16 [AVX512] Fix copy-and-paste bugs in vpermil
1) i512mem -> f512mem (this is the packed FP input being permuted)
2) element size is 64 bits in EVEX_CD8 for PD.

(A good illustration of why X86VectorVTInfo is useful)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 23:08:31 +00:00
Pete Cooper
68aeef61f4 Fix a stackmap bug introduced in r220710.
For a call to not return into the stackmap shadow, the shadow must end with the call.

To do this, we must insert any required nops *before* the call, and not after it.
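
A loose sketch of the ordering point (made-up byte sizes and emitter, not the
actual X86 shadow-tracking code):

  #include <vector>

  // Pad with nops *before* the call so the shadow requirement is already met
  // by the time the (possibly non-returning) call is emitted.
  void emitCallInShadow(std::vector<const char *> &Out,
                        unsigned BytesSinceStackMap, unsigned RequiredShadow) {
    const unsigned CallSize = 5;  // assumption: near call encoding
    while (BytesSinceStackMap + CallSize < RequiredShadow) {
      Out.push_back("nop");
      ++BytesSinceStackMap;
    }
    Out.push_back("call");        // the shadow now ends with the call itself
  }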

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220728 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 22:38:45 +00:00
Juergen Ributzka
52a6f59d41 [FastISel][AArch64] Emit immediate version of icmp (subs) for null pointer check.
This is a minor change to use the immediate version when the operand is a null
value. This should get rid of an unnecessary 'mov' instruction in debug
builds and align the code more closely with what SelectionDAG generates.
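
An illustrative source pattern (an assumption about typical -O0/FastISel input,
not taken from the patch) whose null check should now use the immediate compare:

  int deref_or_zero(int *p) {
    if (p == nullptr)   // icmp eq ptr, null -> compare against immediate 0 ('subs' form)
      return 0;
    return *p;
  }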

This fixes rdar://problem/18785125.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220713 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:58:36 +00:00
Juergen Ributzka
e2995ff88f [FastISel][AArch64] Optimize compare-and-branch for i1 to use 'tbz'.
Minor enhancement to use 'tbz' for i1 compare-and-branch to get rid of an 'and'
instruction.

This fixes rdar://problem/18784953.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220712 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:46:23 +00:00
Pete Cooper
7476f9c513 Stackmap shadows should consider call returns a branch target.
To avoid emitting too many nops, a stackmap shadow can include subsequently
emitted instructions, but these must not include branch targets.

A return from a call should count as a branch target, as patching over the
instructions after the call would lead to incorrect behaviour for threads
currently making that call when they return.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220710 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:40:35 +00:00
Juergen Ributzka
b11c5b1078 [FastISel][AArch64] Use 'cbz' also for null values (pointers).
The pattern matching for a 'ConstantInt' value was too restrictive. Checking for
a 'Constant' with a null value is sufficient for using a 'cbz/cbnz' instruction.
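
A rough sketch of the relaxed check (stand-in classes and dynamic_cast instead
of LLVM's RTTI helpers; not the real FastISel code):

  struct Value { virtual ~Value() = default; };
  struct Constant : Value { virtual bool isNullValue() const = 0; };
  struct ConstantInt : Constant {
    long Val = 0;
    bool isNullValue() const override { return Val == 0; }
  };
  struct ConstantPointerNull : Constant {
    bool isNullValue() const override { return true; }
  };

  // Before: only a ConstantInt of zero qualified.  After: any null Constant
  // does, which covers null pointers and allows a direct cbz/cbnz.
  bool canUseCBZ(const Value *V) {
    if (const auto *C = dynamic_cast<const Constant *>(V))
      return C->isNullValue();
    return false;
  }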

This fixes rdar://problem/18784732.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220709 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:38:05 +00:00
Juergen Ributzka
5745cad861 [FastISel][AArch64] Don't fold the 'and' instruction into the 'tbz/tbnz' instruction if it is in a different basic block.
This fixes a bug where the input register was not defined for the 'tbz/tbnz'
instruction. This happened because we folded the 'and' instruction from a
different basic block.

This fixes rdar://problem/18784013.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:16:48 +00:00
Juergen Ributzka
d3a04223e8 [FastISel][AArch64] Fix load/store with frame indices.
At higher optimization levels the LLVM IR may contain more complex patterns for
loads/stores from/to frame indices. The 'computeAddress' function wasn't able to
handle this and triggered an assertion.

This fix extends the possible addressing modes for frame indices.

This fixes rdar://problem/18783298.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 18:21:58 +00:00
Lang Hames
57902cc070 [PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these
sets as keys into a cache of interference matrix values in the Interference
constraint adder.

Creating interference matrices was one of the large remaining time-sinks in
PBQP. Caching them reduces the total compile time (when using PBQP) on the
nightly test suite by ~10%.
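
A simplified sketch of the caching idea (generic containers, not the actual
PBQP graph or matrix types): once allowed-register sets are uniqued, a pair of
set pointers can key the cache.

  #include <cstddef>
  #include <map>
  #include <memory>
  #include <utility>
  #include <vector>

  using AllowedSet = std::vector<unsigned>;          // assumed uniqued elsewhere
  using Matrix = std::vector<std::vector<int>>;

  class InterferenceCache {
    std::map<std::pair<const AllowedSet *, const AllowedSet *>,
             std::unique_ptr<Matrix>> Cache;
  public:
    const Matrix &get(const AllowedSet &A, const AllowedSet &B) {
      auto &Slot = Cache[{&A, &B}];
      if (!Slot) {
        // Build the (expensive) interference matrix only once per pair of sets.
        Slot = std::make_unique<Matrix>(A.size(), std::vector<int>(B.size(), 0));
        for (std::size_t I = 0; I != A.size(); ++I)
          for (std::size_t J = 0; J != B.size(); ++J)
            (*Slot)[I][J] = (A[I] == B[J]) ? 1 : 0;  // placeholder interference cost
      }
      return *Slot;
    }
  };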



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220688 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 17:44:25 +00:00
NAKAMURA Takumi
af628cc0b8 Prune CRLF.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220678 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 12:37:26 +00:00
Oliver Stannard
b1d8e7e77c [ARM] Select VMAXNM and VMINNM regardless of operand order
Currently, the ARM backend will select VMAXNM and VMINNM for these C
expressions:
  (a < b) ? a : b
  (a > b) ? a : b
but not these expressions:
  (a > b) ? b : a
  (a < b) ? b : a

This patch allows all of these expressions to be matched.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220671 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 09:23:02 +00:00
Yuri Gorshenin
75bb472c06 [asan-asm-instrumentation] Added comment describing how asm instrumentation works.
Summary: [asan-asm-instrumentation] Added comment describing how asm instrumentation works.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220670 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 08:38:54 +00:00
Elena Demikhovsky
9e19cf1ffd AVX-512: Fixed encoding of VPBROADCASTM and added SKX forms of this instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-26 09:52:24 +00:00
Simon Pilgrim
c31aaa5a3f [X86][SSE] Vector integer/float conversion memory folding
Tidied up some entries in the folding tables so that they are under the correct comment section (they were categorised as AVX2 instructions when they're AVX1).

Minor patch agreed with qcolombet.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220613 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-25 08:11:20 +00:00
Jingyue Wu
1d1d705a95 [NVPTX] aligned byte-buffers for vector return types
Summary:
Fixes PR21100, which is caused by an inconsistency between the declared return
type and the expected return type at the call site. The new behavior is consistent
with nvcc and the NVPTXTargetLowering::getPrototype function.

Test Plan: test/Codegen/NVPTX/vector-return.ll

Reviewers: jholewinski

Reviewed By: jholewinski

Subscribers: llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5612

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220607 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-25 03:46:16 +00:00