4636 Commits

Author SHA1 Message Date
Cameron McInally
febc28b529 Update AVX512 vector blend intrinsic names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196581 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 13:35:35 +00:00
Juergen Ributzka
fca7695903 [Stackmap] Update stackmap unit test to use AnyRegCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196552 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 00:28:54 +00:00
Alp Toker
087ab613f4 Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:44:44 +00:00
Rafael Espindola
9155b17815 Hide the stub created for MO_ExternalSymbol too.
given

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
  call void @foo()
  call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
  ret void
}

We used to produce

L_foo$stub:
        .indirect_symbol        _foo
        .ascii  "\364\364\364\364\364"

_memset$stub:
        .indirect_symbol        _memset
        .ascii  "\364\364\364\364\364"

We now produce a private stub for memset too.

Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196468 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:19:12 +00:00
Cameron McInally
f6770bcee8 Add FileCheck statements for r196435.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196449 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 01:20:36 +00:00
Cameron McInally
6c8faddaf5 Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load.
Patch by Aleksey Bader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196435 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 00:11:25 +00:00
Cameron McInally
6d3d93c40b Fix assembly syntax for AVX512 vector blend instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196393 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 18:05:36 +00:00
Cameron McInally
80955805e4 Suppress '(x < y) ? a : 0 -> (x < y) & a' transform on X86 architectures with dedicated mask registers.
Patch by Aleksey Bader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196386 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 14:52:33 +00:00
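The transform named in 80955805e4 above only holds when the compare result is a lane-wide all-ones or all-zeros mask. A minimal Python sketch of that equivalence on one 32-bit lane follows; the helper names are hypothetical and this is not LLVM code.

```python
MASK32 = 0xFFFFFFFF  # models one 32-bit lane


def cmp_lt_mask(x, y):
    """Model a compare that yields all-ones when x < y, else all-zeros."""
    return MASK32 if x < y else 0


def select_form(x, y, a):
    """(x < y) ? a : 0"""
    return (a & MASK32) if x < y else 0


def and_form(x, y, a):
    """(x < y) & a, valid only when the compare result is a full-width mask."""
    return cmp_lt_mask(x, y) & (a & MASK32)
```

On targets with dedicated mask registers the compare result is one bit per lane rather than a lane-wide mask, which is presumably why the AND form is suppressed there.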
Rafael Espindola
a61f9456a0 Add -mcpu=core2 to all llc invocations in this test.
Should fix the atom buildbot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196340 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 01:25:24 +00:00
Juergen Ributzka
6abfcbdfc8 [Stackmap] Emit multi-byte nops for X86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196334 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 00:39:08 +00:00
Rafael Espindola
b972f33cdd Use CHECK-LABEL to make this test more strict.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196321 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 21:12:36 +00:00
Rafael Espindola
21a9fd247e Fix mingw32 thiscall + sret.
Unlike msvc, when handling a thiscall + sret gcc will
* Put the sret in %ecx
* Put the this pointer in (%esp)

This fixes, for example, calling stringstream::str.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196312 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 20:51:23 +00:00
Michael Liao
239ffb30b0 Enhance the fix of PR17631
- The fix to PR17631 fixes part of the cases where 'vzeroupper' should
  not be issued before 'call' insn. There are other cases where helper
  calls will be inserted, not limited to the epilog. These helper calls do
  not follow the standard calling convention and won't clobber any YMM
  registers. (So far, all call conventions will clobber any or part of
  YMM registers.)
  This patch enhances the previous fix to cover more cases where 'vzeroupper'
  should not be inserted, by checking whether the function call won't clobber
  any YMM registers and skipping it if so.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196261 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 09:17:32 +00:00
Rafael Espindola
ba7cb02009 Also test the created stubs on 32 bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196052 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-01 21:24:30 +00:00
Andrew Trick
1561e8381f Add -mcpu to stackmap.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196051 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-01 18:17:05 +00:00
Juergen Ributzka
1baf0c0924 Force CPU type to unbreak unit tests on Haswell machines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195971 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-30 03:07:16 +00:00
Rafael Espindola
ef8a810cd7 Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses in a register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195824 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 06:53:13 +00:00
Michael Liao
fd115c47a2 Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195779 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 20:31:31 +00:00
Andrew Trick
501aeea325 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of a frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address of an alloca, while IndirectMemRefOp
locations handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195712 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:25 +00:00
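The Direct versus Indirect distinction described in 501aeea325 above can be sketched from the runtime's point of view. This is a minimal Python illustration under stated assumptions: the `Location` record, its `kind` strings, and the dictionary standing in for stack memory are all hypothetical and do not reflect the actual stack map encoding.

```python
from dataclasses import dataclass


@dataclass
class Location:
    kind: str    # "Direct" or "Indirect" (hypothetical encoding)
    offset: int  # byte offset from the frame base register


def resolve(loc, frame_base, memory):
    """Return the value the runtime asked for at this location."""
    address = frame_base + loc.offset
    if loc.kind == "Direct":
        # The address itself is the requested value (e.g. an alloca).
        return address
    if loc.kind == "Indirect":
        # The requested value was spilled; load it from the stack slot.
        return memory[address]
    raise ValueError(f"unsupported location kind: {loc.kind}")
```

A Direct location needs no memory read, which is consistent with the message's point that the alloca's relative position can be determined right after compilation.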
Cameron McInally
0e6ec124d5 Add an intrinsic for the SSE2 PAUSE instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195697 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 00:20:43 +00:00
Bill Wendling
5df09f0367 Unrevert r195599 with testcase fix.
I'm not sure how it was checking for the wrong values...
PR18023.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195670 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 18:05:22 +00:00
Amara Emerson
99812474c3 Revert r195599 as it broke the builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195636 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 11:24:18 +00:00
Bill Wendling
dfc615f284 Don't look past volatile loads.
A volatile load should block us from trying to coalesce stores.
PR18023

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195599 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 05:01:21 +00:00
Manman Ren
bc8569d0c0 Debug Info: update testing cases to specify the debug info version number.
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.

Make tests more robust by removing hard-coded metadata numbers in CHECK lines.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195535 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-23 01:16:29 +00:00
Manman Ren
bec50063a5 Debug Info: update testing cases to specify the debug info version number.
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195504 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 21:49:45 +00:00
Jim Grosbach
e1af5f6ad1 X86: Perform integer comparisons at i32 or larger.
Utilizing the 8 and 16 bit comparison instructions, even when an input can
be folded into the comparison instruction itself, is typically not worth it.
There are too many partial register stalls as a result, leading to significant
slowdowns. By always performing comparisons on at least 32-bit
registers, performance of the calculation chain leading to the
comparison improves. Continue to use the smaller comparisons when
minimizing size, as that allows better folding of loads into the
comparison instructions.

rdar://15386341

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195496 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:57:47 +00:00
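Widening a comparison as in e1af5f6ad1 above is only safe if extension preserves the ordering of the narrow values. A small Python sketch of the signed case follows (i8 sign-extended to i32); the helper names are hypothetical and the code models bit patterns, not the actual instruction selection.

```python
def to_signed(v, bits):
    """Interpret a bit pattern as a two's-complement integer."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def sext8_to_32(v):
    """Sign-extend an 8-bit pattern to a 32-bit pattern."""
    v &= 0xFF
    return (v | 0xFFFFFF00) if v & 0x80 else v


def signed_lt8(a, b):
    """Signed less-than performed on the 8-bit patterns."""
    return to_signed(a, 8) < to_signed(b, 8)


def signed_lt32_after_sext(a, b):
    """Signed less-than performed after widening both operands to 32 bits."""
    return to_signed(sext8_to_32(a), 32) < to_signed(sext8_to_32(b), 32)
```

The same argument applies to unsigned comparisons with zero extension in place of sign extension.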
Paul Robinson
16c7e0b48c Teach ISel not to optimize 'optnone' functions (revised).
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
  SelectAllBasicBlocks; the flag is checked in various places, and
  FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
  something more subtle that doesn't work everywhere.

Based on work by Andrea Di Biagio.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195491 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:11:24 +00:00
Andrew Trick
ed20bf5ef8 patchpoint: factor SD builder code for live vars. Plain stackmap also optimizes Constant values now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195488 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:07:36 +00:00
Michael Liao
0894438912 Fix PR18014
- When simplifying the mask generation for BLEND, check whether that mask is
  also consumed by other non-BLEND insns. If true, skip that simplification.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195476 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 17:56:57 +00:00
Rafael Espindola
9519b689c8 Don't produce tail calls when the caller is x86_thiscallcc.
The callee will not pop the stack for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195467 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 15:18:28 +00:00
Kostya Serebryany
a7e8d6581f Revert r195318 as it causes miscompilation (PR18029)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195439 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 10:30:39 +00:00
NAKAMURA Takumi
ca3c03a167 Tweak 3 tests in llvm/test/CodeGen/X86 to add -mcpu=generic since r195383.
They failed on bdver2 buildslave.

FIXME: FileCheck-ize them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195407 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 02:28:04 +00:00
Ekaterina Romanova
46f7257ed1 SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, code optimized for speed on these platforms should instead use alternative sequences composed of add, adc, shr, shl, or and lea, which are DirectPath instructions. These alternative instructions not only have lower latency, they also increase decode bandwidth by allowing simultaneous decoding of a third DirectPath instruction.
AMD processor families K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.

It might be beneficial to disable this folding for some of Intel's processors as well. However, since I couldn't find specific recommendations regarding the use of SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195383 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 23:21:26 +00:00
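The pattern that 46f7257ed1 above stops folding, (x << c) | (y >> (64 - c)), computes the same result as a 64-bit double shift left. A minimal Python sketch of that equivalence follows, assuming 0 < c < 64; `shld64` is a hypothetical model of the double shift, not the instruction's full semantics (flags and out-of-range counts are ignored).

```python
MASK64 = (1 << 64) - 1


def shld64(x, y, c):
    """Model a 64-bit double shift left: shift x left by c, filling from the top bits of y."""
    assert 0 < c < 64
    concat = ((x & MASK64) << 64) | (y & MASK64)  # 128-bit value x:y
    return (concat << c >> 64) & MASK64


def shift_or_form(x, y, c):
    """The pattern from the commit message: (x << c) | (y >> (64 - c))."""
    assert 0 < c < 64
    return ((x << c) & MASK64) | ((y & MASK64) >> (64 - c))
```

Leaving the pattern unfolded lets the backend emit the separate shift and or instructions that the message describes as faster on the listed AMD families.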
Bill Wendling
072ebe59e2 The basic problem is that some mainstream programs cannot deal with the way
clang optimizes tail calls, as in this example:

int foo(void);
int bar(void) {
 return foo();
}

where the call is transformed to:

  calll .L0$pb
.L0$pb:
  popl  %eax
.Ltmp0:
  addl  $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
  movl  foo@GOT(%eax), %eax
  popl  %ebp
  jmpl  *%eax                   # TAILCALL

However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.

This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.

Patch by Dimitry Andric!

PR15086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195318 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 07:04:30 +00:00
Benjamin Kramer
16e2f0ef1a MachineBlockPlacement: Strengthen the source order bias when picking an exit block.
We now only allow breaking source order if the exit block frequency is
significantly higher than the other exit block. The actual bias is
currently under a flag so the best cut-off can be found; the flag
defaults to the old behavior. The idea is to get some benchmark coverage
over different values for the flag and pick the best one.

When we require the new frequency to be at least 20% higher than the old
frequency I see a 5% speedup on zlib's deflate when compressing a random
file on x86_64/westmere. Hal reported a small speedup on Fhourstones on
a BG/Q and no regressions in the test suite.

The test case is the full long_match function from zlib's deflate. I was
reluctant to add it for previous tweaks to branch probabilities because
it's large and potentially fragile, but changed my mind since it's an
important use case and more likely to break with all the current work
going into the PGO infrastructure.

Differential Revision: http://llvm-reviews.chandlerc.com/D2202

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195265 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 19:08:44 +00:00
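The bias described in 16e2f0ef1a above amounts to a threshold test on block frequencies. A hedged Python sketch follows; the function name and the percentage parameter are hypothetical, and the 20% figure is only the value the message reports experimenting with, not a default.

```python
def may_break_source_order(source_order_freq, candidate_freq, bias_percent):
    """Return True if the candidate exit block may be preferred over the
    exit block that source order would pick.

    The candidate must be at least bias_percent percent more frequent.
    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return candidate_freq * 100 >= source_order_freq * (100 + bias_percent)
```

With `bias_percent` set to 20, a candidate at frequency 119 against 100 keeps source order, while 120 against 100 allows the reordering.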
Elena Demikhovsky
5cd32afac4 AVX-512: Concat 4 128-bit vectors in one 512-bit vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195229 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 09:10:40 +00:00
Cameron McInally
c5a925c198 Fix assembly operands for the SSE2 cvtsd2ss instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195129 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 14:36:00 +00:00
Andrew Trick
d73d4f4ef2 Use symbolic operands in the patchpoint folding routine and fix a spilling bug.
Fixes <rdar://15487687> [JS] AnyRegCC argument ends up being spilled

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195094 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 03:29:59 +00:00
Bill Wendling
281e8eb9fd Testcase for PR17964
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194961 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 10:53:19 +00:00
Benjamin Kramer
d5ae5b0186 DAGCombiner: Partially revert r192795, getNOT was fixed not to create illegal constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194959 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 10:40:03 +00:00
Andrew Trick
bb756ca244 Added a size field to the stack map record to handle subregister spills.
Implementing this on big-endian platforms could get strange. I added a
target hook, getStackSlotRange, per Jakob's recommendation to make
this as explicit as possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194942 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 01:36:23 +00:00
Bob Wilson
cc7052343e Avoid illegal integer promotion in fastisel
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow.  Results are different when:

    sext(a) + sext(b) != sext(a + b)

Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.

<rdar://problem/15292280>

Patch by Duncan Exon Smith!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194840 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 19:09:27 +00:00
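The inequality in cc7052343e above can be reproduced with two i8 values whose sum overflows the narrow type. This is a minimal Python sketch with hypothetical helpers `wrap` and `sext`; it models two's-complement arithmetic rather than FastISel itself.

```python
def wrap(v, bits):
    """Truncate to `bits` bits and reinterpret as a signed two's-complement value."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def sext(v, from_bits, to_bits):
    """Sign-extend: the numeric value is preserved in the wider type."""
    return wrap(wrap(v, from_bits), to_bits)


a, b = 100, 100  # both fit in i8, but their sum does not

# Add in i8 (wraps around), then extend to i32.
narrow_then_extend = sext(wrap(a + b, 8), 8, 32)

# Extend each operand to i32 first, then add (no wrap at this width).
extend_then_add = wrap(sext(a, 8, 32) + sext(b, 8, 32), 32)
```

Here the narrow add wraps to a negative value while the widened add does not, so folding the add into a wider address computation changes the result.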
Cameron McInally
28e12e9f02 Add AVX512 unmasked FMA intrinsics and support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194824 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 17:01:14 +00:00
Alexey Samsonov
ec3ce8a99d Redirect unused test case output to /dev/null
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194798 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 09:36:58 +00:00
Andrew Trick
728eb5fbea Platform proof a test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194788 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 05:52:56 +00:00
Matt Arsenault
59d3ae6cdc Add addrspacecast instruction.
Patch by Michele Scandale!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194760 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 01:34:59 +00:00
Eric Christopher
5a5d116d58 Simplify testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194748 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 23:43:10 +00:00
Rafael Espindola
5e9f8c3948 Add a triple and switch test to FileCheck.
On Windows we don't print .weak for function definitions, so count was only
finding 1 'weak'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194713 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 17:12:32 +00:00
Rafael Espindola
3d47402f2e Error if we see an alias to a declaration.
In ELF and COFF an alias is just another offset in a section. There is no way
to represent an alias to something in another file.

In MachO, the spec has the N_INDR type which should allow for exactly that, but
is not currently implemented. Given that it is specified but not implemented,
we error in codegen to avoid miscompiling but don't reject aliases to
declarations in the verifier to leave the option open of implementing it.

In the past we have used alias to declarations as a way of implementing
weakref, which is why it exists in some old tests which this patch updates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194705 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 13:58:06 +00:00
Elena Demikhovsky
f58e414405 AVX-512: Handled extractelement from mask vector;
Added VMOVSHDUP/VMOVSLDUP shuffle instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194691 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 11:29:27 +00:00