Commit Graph

9684 Commits

Author SHA1 Message Date
Reid Kleckner
cc8d39acf5 Revert "Fix miscompile of MS inline assembly with stack realignment"
This reverts commit r196876.  Its tests failed on the bots, so I'll
figure it out tomorrow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196879 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 05:31:27 +00:00
Reid Kleckner
ec4d326aad Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196876 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 05:12:23 +00:00
Rafael Espindola
5201d61654 Don't add suffixes for stdcall/fastcall on 64 coff.
This matches the behavior of both msvc and mingw.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196814 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 20:44:48 +00:00
Cameron McInally
febc28b529 Update AVX512 vector blend intrinsic names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196581 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 13:35:35 +00:00
Rafael Espindola
fac7a9e644 Remove the isImplicitlyPrivate argument of getNameWithPrefix.
getSymbolWithGlobalValueBase use is to create a name of a new symbol based
on the name of an existing GV. Assert that and then remove the last call
to pass true to isImplicitlyPrivate.

This gives the mangler API a 1:1 mapping from GV to names, which is what we
need to drop the mangler dependency on the target (and use an extended
datalayout instead).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196472 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:53:12 +00:00
Alp Toker
087ab613f4 Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:44:44 +00:00
Rafael Espindola
9155b17815 Hide the stub created for MO_ExternalSymbol too.
given

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
  call void @foo()
  call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
  ret void
}

We used to produce

L_foo$stub:
        .indirect_symbol        _foo
        .ascii  "\364\364\364\364\364"

_memset$stub:
        .indirect_symbol        _memset
        .ascii  "\364\364\364\364\364"

We not produce a private stub for memset too.

Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196468 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:19:12 +00:00
Cameron McInally
6c8faddaf5 Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load.
Patch by Aleksey Bader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196435 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 00:11:25 +00:00
Kevin Enderby
f50f3a3bb9 Fix a bug in darwin's 32-bit X86 handling of evaluating fixups.
Where it would use a scattered relocation entry but falls back to a
normal relocation entry because the FixupOffset is more than 24-bits.

The bug is in the X86MachObjectWriter::RecordScatteredRelocation() where
it changes reference parameter FixedValue but then returns false to indicate
it did not create a scattered relocation entry.  The fix is simply to save the
original value of the parameter FixedValue at the start of the method and
restore it if we are returning false in that case.

rdar://15526046


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196432 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 23:36:24 +00:00
Cameron McInally
6d3d93c40b Fix assembly syntax for AVX512 vector blend instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196393 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 18:05:36 +00:00
Michael Liao
a2f6f94b74 [X86] Check YMM31/ZMM31 as well
- No test case as there's no calling convention preserve YMM31/ZMM31 only



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196391 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 17:44:22 +00:00
Cameron McInally
80955805e4 Suppress '(x < y) ? a : 0 -> (x < y) & a' transform on X86 architectures with dedicated mask registers.
Patch by Aleksey Bader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196386 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 14:52:33 +00:00
Juergen Ributzka
6abfcbdfc8 [Stackmap] Emit multi-byte nops for X86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196334 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 00:39:08 +00:00
Rafael Espindola
21a9fd247e Fix mingw32 thiscall + sret.
Unlike msvc, when handling a thiscall + sret gcc will
* Put the sret in %ecx
* Put the this pointer is (%esp)

This fixes, for example, calling stringstream::str.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196312 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 20:51:23 +00:00
Michael Liao
239ffb30b0 Enhance the fix of PR17631
- The fix to PR17631 fixes part of the cases where 'vzeroupper' should
  not be issued before 'call' insn. There're other cases where helper
  calls will be inserted not limited to epilog. These helper calls do
  not follow the standard calling convention and won't clobber any YMM
  registers. (So far, all call conventions will clobber any or part of
  YMM registers.)
  This patch enhances the previous fix to cover more cases 'vzerosupper' should
  not be inserted by checking if that function call won't clobber any YMM
  registers and skipping it if so.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196261 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 09:17:32 +00:00
Rafael Espindola
9472fd7403 Refactor the setting of PrivateGlobalPrefix.
No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196170 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 23:39:26 +00:00
Rafael Espindola
cce5873de3 Move getSymbolWithGlobalValueBase to TargetLoweringObjectFile.
This allows it to be used in TargetLoweringObjectFileImpl.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196117 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 16:25:47 +00:00
Alp Toker
4f9dd99c28 Introduce poor man's consumeToken() in X86AsmParser
This makes the code a little more idiomatic.

No change in behaviour.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196113 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 16:06:06 +00:00
Rafael Espindola
4a6855441c Change the default of AsmWriterClassName and isMCAsmWriter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196065 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 04:55:42 +00:00
Benjamin Kramer
e1139818e8 Revamp error checking in the ms inline asm parser.
- Actually abort when an error occurred.
- Check that the frontend lookup worked when parsing length/size/type operators.

Tested by a clang test. PR18096.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196044 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-01 11:47:42 +00:00
Lang Hames
1cbca515b6 Refactor a lot of patchpoint/stackmap related code to simplify and make it
target independent.

Most of the x86 specific stackmap/patchpoint handling was necessitated by the
use of the native address-mode format for frame index operands. PEI has now
been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing
us to use a simple, platform independent register/offset pair for frame
indexes on stackmap/patchpoints.

Notes:
  - Folding is now platform independent and automatically supported.
  - Emiting patchpoints with direct memory references now just involves calling
    the TargetLoweringBase::emitPatchPoint utility method from the target's
    XXXTargetLowering::EmitInstrWithCustomInserter method. (See
    X86TargetLowering for an example).
  - No more ugly platform-specific operand parsers.

This patch shouldn't change the generated output for X86. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195944 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-29 03:07:54 +00:00
Rafael Espindola
88ccad035e Refactor to remove a bit of duplication. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195933 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 20:12:44 +00:00
NAKAMURA Takumi
ef70d2a393 [CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.
I think, in principle, intrinsics_gen may be added explicitly.
That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen.
Almost all source files depend on both CommonTaleGen and intrinsics_gen.

Explicit add_dependencies() have been pruned under lib/Target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195929 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 17:04:31 +00:00
NAKAMURA Takumi
ad363187c4 [CMake] Let add_public_tablegen_target responsible to provide dependency to CommonTableGen.
add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS.
LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195927 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 17:04:04 +00:00
Rafael Espindola
4ca0ef70cd The global prefix is always one char. Don't use a string for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195926 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 17:00:49 +00:00
NAKAMURA Takumi
98bb341955 [CMake] Prune include_directories() in llvm/lib/Target, take #2.
I forgot to commit them. They were staging in my local repo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195924 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 15:30:37 +00:00
Rafael Espindola
60f6083a36 Use the mangler consistently instead of using getGlobalPrefix directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195911 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 08:59:52 +00:00
Rafael Espindola
825dfc8cba Remove dead code.
MO_ExternalSymbol and MO_JumpTableIndex don't show up in inline asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195861 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 18:38:14 +00:00
Rafael Espindola
4635dbb8bc Convert two if sequences to switches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195859 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 18:26:51 +00:00
Rafael Espindola
b7e71e35a9 Use a switch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195857 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 18:18:24 +00:00
Rafael Espindola
ed3eb50482 Remove more dead code now that this is only used for inline asm.
MO_ConstantPoolIndex is handled in printLeaMemReference.
MO_JumpTableIndex and MO_ExternalSymbol don't show up in inline asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195847 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 15:13:06 +00:00
Rafael Espindola
3b818b481f Convert more methods in static helpers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195826 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 07:34:09 +00:00
Rafael Espindola
81e995dc91 Convert these methods into static functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195825 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 07:14:26 +00:00
Rafael Espindola
ef8a810cd7 Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses on register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195824 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 06:53:13 +00:00
Michael Liao
fd115c47a2 Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195779 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 20:31:31 +00:00
Andrew Trick
501aeea325 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195712 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:25 +00:00
Andrew Trick
151ed66489 whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195711 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:20 +00:00
Cameron McInally
0e6ec124d5 Add an intrinsic for the SSE2 PAUSE instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195697 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 00:20:43 +00:00
Rafael Espindola
02ddf4abc2 Do the string comparison in the constructor instead of once per nop.
Thanks to Roman Divacky for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195684 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 20:50:03 +00:00
Rafael Espindola
8f6631cdb6 Don't use nopl in cpus that don't support it.
Patch by Mikulas Patocka. I added the test. I checked that for cpu names that
gas knows about, it also doesn't generate nopl.

The modified cpus:
i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta
        Crusoe, Microsoft VirtualBox - see
        https://bbs.archlinux.org/viewtopic.php?pid=775414
k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs
via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that
        Via c3 and c3-Nehemiah don't have nopl

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195679 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 20:15:14 +00:00
Tim Northover
8a6c627fd0 X86: enable AVX2 under Haswell native compilation
Patch by Adam Strzelecki

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195632 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 09:52:59 +00:00
Jim Grosbach
e1af5f6ad1 X86: Perform integer comparisons at i32 or larger.
Utilizing the 8 and 16 bit comparison instructions, even when an input can
be folded into the comparison instruction itself, is typically not worth it.
There are too many partial register stalls as a result, leading to significant
slowdowns. By always performing comparisons on at least 32-bit
registers, performance of the calculation chain leading to the
comparison improves. Continue to use the smaller comparisons when
minimizing size, as that allows better folding of loads into the
comparison instructions.

rdar://15386341

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195496 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:57:47 +00:00
Michael Liao
0894438912 Fix PR18014
- When simplifying the mask generation for BLEND, check whether that mask is
  also consumed by other non-BLEND insns. If true, skip that simplification.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195476 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 17:56:57 +00:00
Rafael Espindola
9519b689c8 Don't produce tail calls when the caller is x86_thiscallcc.
The callee will not pop the stack for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195467 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 15:18:28 +00:00
Kostya Serebryany
a7e8d6581f Revert r195318 as it causes miscompilation (PR18029)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195439 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 10:30:39 +00:00
Ekaterina Romanova
46f7257ed1 SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction.
AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.

It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195383 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 23:21:26 +00:00
Bill Wendling
072ebe59e2 The basic problem is that some mainstream programs cannot deal with the way
clang optimizes tail calls, as in this example:

int foo(void);
int bar(void) {
 return foo();
}

where the call is transformed to:

  calll .L0$pb
.L0$pb:
  popl  %eax
.Ltmp0:
  addl  $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
  movl  foo@GOT(%eax), %eax
  popl  %ebp
  jmpl  *%eax                   # TAILCALL

However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.

This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.

Patch by Dimitry Andric!

PR15086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195318 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 07:04:30 +00:00
NAKAMURA Takumi
56b09220a3 X86ISelLowering.cpp: Mark a variable VT as LLVM_ATTRIBUTE_UNUSED. [-Wunused-variable]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195238 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 10:55:22 +00:00
NAKAMURA Takumi
6bee54e30a Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195237 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 10:55:15 +00:00
Elena Demikhovsky
29086acc6e Fixed compilation error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195230 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 09:23:22 +00:00