9662 Commits

Author SHA1 Message Date
NAKAMURA Takumi
ef70d2a393 [CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.
I think, in principle, intrinsics_gen may be added explicitly.
That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen.
Almost all source files depend on both CommonTaleGen and intrinsics_gen.

Explicit add_dependencies() have been pruned under lib/Target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195929 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 17:04:31 +00:00
NAKAMURA Takumi
ad363187c4 [CMake] Let add_public_tablegen_target responsible to provide dependency to CommonTableGen.
add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS.
LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195927 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 17:04:04 +00:00
Rafael Espindola
4ca0ef70cd The global prefix is always one char. Don't use a string for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195926 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 17:00:49 +00:00
NAKAMURA Takumi
98bb341955 [CMake] Prune include_directories() in llvm/lib/Target, take #2.
I forgot to commit them. They were staging in my local repo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195924 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 15:30:37 +00:00
Rafael Espindola
60f6083a36 Use the mangler consistently instead of using getGlobalPrefix directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195911 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 08:59:52 +00:00
Rafael Espindola
825dfc8cba Remove dead code.
MO_ExternalSymbol and MO_JumpTableIndex don't show up in inline asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195861 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 18:38:14 +00:00
Rafael Espindola
4635dbb8bc Convert two if sequences to switches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195859 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 18:26:51 +00:00
Rafael Espindola
b7e71e35a9 Use a switch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195857 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 18:18:24 +00:00
Rafael Espindola
ed3eb50482 Remove more dead code now that this is only used for inline asm.
MO_ConstantPoolIndex is handled in printLeaMemReference.
MO_JumpTableIndex and MO_ExternalSymbol don't show up in inline asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195847 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 15:13:06 +00:00
Rafael Espindola
3b818b481f Convert more methods in static helpers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195826 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 07:34:09 +00:00
Rafael Espindola
81e995dc91 Convert these methods into static functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195825 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 07:14:26 +00:00
Rafael Espindola
ef8a810cd7 Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses on register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195824 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 06:53:13 +00:00
Michael Liao
fd115c47a2 Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195779 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 20:31:31 +00:00
Andrew Trick
501aeea325 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195712 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:25 +00:00
Andrew Trick
151ed66489 whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195711 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:20 +00:00
Cameron McInally
0e6ec124d5 Add an intrinsic for the SSE2 PAUSE instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195697 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 00:20:43 +00:00
Rafael Espindola
02ddf4abc2 Do the string comparison in the constructor instead of once per nop.
Thanks to Roman Divacky for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195684 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 20:50:03 +00:00
Rafael Espindola
8f6631cdb6 Don't use nopl in cpus that don't support it.
Patch by Mikulas Patocka. I added the test. I checked that for cpu names that
gas knows about, it also doesn't generate nopl.

The modified cpus:
i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta
        Crusoe, Microsoft VirtualBox - see
        https://bbs.archlinux.org/viewtopic.php?pid=775414
k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs
via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that
        Via c3 and c3-Nehemiah don't have nopl

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195679 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 20:15:14 +00:00
Tim Northover
8a6c627fd0 X86: enable AVX2 under Haswell native compilation
Patch by Adam Strzelecki

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195632 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 09:52:59 +00:00
Jim Grosbach
e1af5f6ad1 X86: Perform integer comparisons at i32 or larger.
Utilizing the 8 and 16 bit comparison instructions, even when an input can
be folded into the comparison instruction itself, is typically not worth it.
There are too many partial register stalls as a result, leading to significant
slowdowns. By always performing comparisons on at least 32-bit
registers, performance of the calculation chain leading to the
comparison improves. Continue to use the smaller comparisons when
minimizing size, as that allows better folding of loads into the
comparison instructions.

rdar://15386341

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195496 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:57:47 +00:00
Michael Liao
0894438912 Fix PR18014
- When simplifying the mask generation for BLEND, check whether that mask is
  also consumed by other non-BLEND insns. If true, skip that simplification.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195476 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 17:56:57 +00:00
Rafael Espindola
9519b689c8 Don't produce tail calls when the caller is x86_thiscallcc.
The callee will not pop the stack for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195467 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 15:18:28 +00:00
Kostya Serebryany
a7e8d6581f Revert r195318 as it causes miscompilation (PR18029)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195439 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 10:30:39 +00:00
Ekaterina Romanova
46f7257ed1 SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction.
AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.

It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195383 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 23:21:26 +00:00
Bill Wendling
072ebe59e2 The basic problem is that some mainstream programs cannot deal with the way
clang optimizes tail calls, as in this example:

int foo(void);
int bar(void) {
 return foo();
}

where the call is transformed to:

  calll .L0$pb
.L0$pb:
  popl  %eax
.Ltmp0:
  addl  $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
  movl  foo@GOT(%eax), %eax
  popl  %ebp
  jmpl  *%eax                   # TAILCALL

However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.

This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.

Patch by Dimitry Andric!

PR15086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195318 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 07:04:30 +00:00
NAKAMURA Takumi
56b09220a3 X86ISelLowering.cpp: Mark a variable VT as LLVM_ATTRIBUTE_UNUSED. [-Wunused-variable]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195238 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 10:55:22 +00:00
NAKAMURA Takumi
6bee54e30a Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195237 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 10:55:15 +00:00
Elena Demikhovsky
29086acc6e Fixed compilation error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195230 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 09:23:22 +00:00
Elena Demikhovsky
5cd32afac4 AVX-512: Concat 4 128-bit vectors in one 512-bit vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195229 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 09:10:40 +00:00
Cameron McInally
c5a925c198 Fix assembly operands for the SSE2 cvtsd2ss instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195129 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 14:36:00 +00:00
Andrew Trick
d73d4f4ef2 Use symbolic operands in the patchpoint folding routine and fix a spilling bug.
Fixes <rdar://15487687> [JS] AnyRegCC argument ends up being spilled

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195094 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 03:29:59 +00:00
Andrew Trick
8ddf988ef4 Add an abstraction to handle patchpoint operands.
Hard-coded operand indices were scattered throughout lowering stages
and layers. It was super bug prone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195093 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 03:29:56 +00:00
Juergen Ributzka
354362524a [weak vtables] Remove a bunch of weak vtables
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.

Differential Revision: http://llvm-reviews.chandlerc.com/D2068

Reviewed by Andy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195064 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 00:57:56 +00:00
Reid Kleckner
a7b7a7d629 Revert "COFF: Emit all MCSymbols rather than filtering out some of them"
This reverts commit r190888, to fix PR17967.  The original change wasn't
the right way to get @feat.00 into the object file.  The right fix is to
make @feat.00 be a global symbol.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195053 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-18 23:08:12 +00:00
Alexey Samsonov
b21ab43cfc Revert r194865 and r194874.
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
  Base *foo = new Child();
  delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194997 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-18 09:31:53 +00:00
Andrew Trick
bb756ca244 Added a size field to the stack map record to handle subregister spills.
Implementing this on bigendian platforms could get strange. I added a
target hook, getStackSlotRange, per Jakob's recommendation to make
this as explicit as possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194942 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 01:36:23 +00:00
Juergen Ributzka
0ccb37a733 The WebKit_JS CC preserves the same registers as the C CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194936 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-16 22:08:58 +00:00
Jim Grosbach
35de9946d5 X86: Encode the 'h' cpu subtype in the MachO header for x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194906 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-16 00:52:57 +00:00
Lang Hames
445fd04f53 Remove unused arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194882 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 23:19:01 +00:00
Lang Hames
8c66df2c7a During folding for patchpoint/stackmap instructions, defer creation of new MIs
until we know that folding will be successful.

No functional change.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194880 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 23:13:21 +00:00
Juergen Ributzka
5a364c5561 [weak vtables] Remove a bunch of weak vtables
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.

Differential Revision: http://llvm-reviews.chandlerc.com/D2068

Reviewed by Andy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194865 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 22:34:48 +00:00
Bob Wilson
cc7052343e Avoid illegal integer promotion in fastisel
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow.  Results are different when:

    sext(a) + sext(b) != sext(a + b)

Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.

<rdar://problem/15292280>

Patch by Duncan Exon Smith!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194840 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 19:09:27 +00:00
Cameron McInally
28e12e9f02 Add AVX512 unmasked FMA intrinsics and support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194824 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 17:01:14 +00:00
Matt Arsenault
59d3ae6cdc Add addrspacecast instruction.
Patch by Michele Scandale!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194760 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 01:34:59 +00:00
Elena Demikhovsky
f58e414405 AVX-512: Handled extractelement from mask vector;
Added VMOSHDUP/VMOVSLDUP shuffle instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194691 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 11:29:27 +00:00
Andrew Trick
72cf01cc7c Minor extension to llvm.experimental.patchpoint: don't require a call.
If a null call target is provided, don't emit a dummy call. This
allows the runtime to reserve as little nop space as it needs without
the requirement of emitting a call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194676 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 06:54:10 +00:00
Juergen Ributzka
c7e77f91fe SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194542 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-13 01:57:54 +00:00
Andrew Trick
7107aded17 Cleanup the stackmap operand folding code and fix a corner case.
I still don't know how to refer to the fixed operands symbolically. I
plan to look into it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194529 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-12 22:58:39 +00:00
Eric Christopher
2a499a7313 Add a FIXME for 32-bit q modifiers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194515 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-12 21:47:44 +00:00
Andrew Trick
0085d5e5ae Simplify operand folding when rematerializing a load.
We already know how to fold a reload from a frameindex without
analyzing the load instruction. Generalize this to handle any
frameindex load. This streamlines the logic for rematerializing loads
from stack arguments. As a side effect, it allows stackmaps to record
a stack argument location without spilling it.

Verified no effect on codegen for llvm test-suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194497 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-12 18:06:12 +00:00