Commit Graph

3642 Commits

Author SHA1 Message Date
Craig Topper
57ae246a6a Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157895 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 01:40:43 +00:00
Manman Ren
73c2f7f5ed X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 19:49:33 +00:00
Chris Lattner
2b76473929 testcase for PR13006, thanks to Duncan for filing it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 18:19:46 +00:00
Hans Wennborg
f0234fcbc9 Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 16:27:21 +00:00
Craig Topper
3a8172ad8d Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157804 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 06:07:48 +00:00
Chris Lattner
f59e4e3452 enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157800 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:29:15 +00:00
Chris Lattner
5b0d946537 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157798 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:16:33 +00:00
Chris Lattner
09c14c0836 add some simple 64-bit tail call tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157797 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:03:31 +00:00
Chris Lattner
e8ea60b8ba merge some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157795 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:00:54 +00:00
Chris Lattner
e109648880 rename test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157794 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 04:58:50 +00:00
Manman Ren
91c5346d91 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 17:20:29 +00:00
Elena Demikhovsky
177cf1e1a3 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157737 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 09:20:20 +00:00
Craig Topper
0559a2f8ae Add intrinsic for pclmulqdq instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157731 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 04:37:40 +00:00
Jakob Stoklund Olesen
9cda1be0aa Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157715 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 21:46:58 +00:00
Chris Lattner
f186df0d3e it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 18:08:02 +00:00
Chris Lattner
5aaabbfe62 Extend the (abi-irrelevant) return convention to be able to return more than two values in
integer registers.  This is already supported by the fastcc convention, but it doesn't
hurt to support it in the standard conventions as well.

In cases where we can cheat at the calling convention, this allows us to avoid returning
things through memory in more cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 17:50:14 +00:00
Benjamin Kramer
1386e9b7b1 Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157634 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-29 19:05:25 +00:00
Chris Lattner
c32cef6aa1 These tests used intrinsics with the wrong prototype. They weren't caught because
the old verifier just checked that something "was a pointer", but not that the pointee
was correct.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-27 19:35:41 +00:00
Benjamin Kramer
c511b2a5a1 SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157521 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-26 20:01:32 +00:00
NAKAMURA Takumi
f755e0001a test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157474 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 15:40:54 +00:00
NAKAMURA Takumi
a389a23bbb test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157471 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 15:12:21 +00:00
Eli Friedman
2db0e9ebb6 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157446 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 00:09:29 +00:00
David Blaikie
28a5ab2fb4 Fix for CHECK-NOT misspelling.
Patch by Nicklas Bo Jensen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157421 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-24 22:08:29 +00:00
Jakob Stoklund Olesen
e3b548219f Correctly deal with identity copies in RegisterCoalescer.
Now that the coalescer keeps live intervals and machine code in sync at
all times, it needs to deal with identity copies differently.

When merging two virtual registers, all identity copies are removed
right away. This means that other identity copies must come from
somewhere else, and they are going to have a value number.

Deal with such copies by merging the value numbers before erasing the
copy instruction. Otherwise, we leave dangling value numbers in the live
interval.

This fixes PR12927.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157340 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-23 20:21:06 +00:00
Nuno Lopes
23e75da7e0 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157255 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen
76ff741836 Only erase virtregs with no uses left.
Also make sure registers aren't erased twice if the dead def mentions
the register twice.

This fixes PR12911.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157254 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-22 14:52:12 +00:00
Craig Topper
8ae97baef2 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157175 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-21 06:40:16 +00:00
Peter Collingbourne
92d63ccfc7 When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157162 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 18:36:15 +00:00
Jakob Stoklund Olesen
ee0d5d4398 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen
8e86929e3c Properly constrain register classes in 2-addr.
X86 has 2-addr instructions with different constraints on the tied def
and use operands. One is GR32, one is GR32_NOSP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157149 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:32 +00:00
Jakob Stoklund Olesen
7ebed91fdd Fix 12892.
Dead code elimination during coalescing could cause a virtual register
to be split into connected components. The following rewriting would be
confused about the already joined copies present in the code, but
without a corresponding value number in the live range.

Erase all joined copies instantly when joining intervals such that the
MI and LiveInterval representations are always in sync.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157135 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 23:34:59 +00:00
Jakob Stoklund Olesen
ccce1233a2 Erase joined copies immediately.
The late dead code elimination is no longer necessary.

The test changes are cause by a register hint that can be either %rdi or
%rax. The choice depends on the use list order, which this patch changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157131 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 20:54:07 +00:00
Nadav Rotem
87d35e8c71 On Haswell, perfer storing YMM registers using a single instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157129 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 20:30:08 +00:00
Nadav Rotem
4fc8a5de44 Add support for additional in-reg vbroadcast patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157127 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 19:57:37 +00:00
Craig Topper
b82b5abf78 Simplify handling of v16i8 shuffles and fix a missed optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157043 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 06:42:06 +00:00
Evan Cheng
ad75364815 Teach two-address pass to update the "source" map so it doesn't perform a
non-profitable commute using outdated info. The test case would still fail
because of poor pre-RA schedule. That will be fixed by MI scheduler.

rdar://11472010


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157038 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 01:33:51 +00:00
Jakob Stoklund Olesen
0e5e821a69 Remove a test that was only testing for physreg joining.
This is the same as the other tests: Clever tricks are required to make
the arguments and return value line up in a single-instruction function.
It rarely happens in real life.

We have plenty other examples of this behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157030 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen
ed18a3e6b2 Remove -join-physregs from the test suite.
This option has been disabled for a while, and it is going away so I can
clean up the coalescer code.

The tests that required physreg joining to be enabled were almost all of
the form "tiny function with interference between arguments and return
value". Such functions are usually inlined in the real world.

The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly
rare.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157027 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-17 23:44:19 +00:00
Evan Cheng
6100366c2f Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156896 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen
4d10829e12 Fix PR12821.
RAFast must add an <imp-def> operand when it is rewriting a sub-register
def that isn't a read-modify-write.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156777 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 21:10:25 +00:00
Dan Gohman
a6063c6e29 Rename @llvm.debugger to @llvm.debugtrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156774 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 18:58:10 +00:00
Hans Wennborg
12447575dc Fix test/CodeGen/X86/tls-pie.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156612 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 10:19:54 +00:00
Hans Wennborg
228756c744 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156611 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 10:11:01 +00:00
Dan Gohman
d4347e1af9 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156593 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 00:19:32 +00:00
Nadav Rotem
b210651654 AVX2: Add an additional broadcast idiom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156540 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 12:39:13 +00:00
Nadav Rotem
5fc2187a02 Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.

Fix PR11900.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156539 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 12:22:05 +00:00
Nuno Lopes
30759542aa change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 15:52:43 +00:00
Craig Topper
189bce48c7 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156375 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 06:58:15 +00:00
Chad Rosier
42726835e3 Fix a regression from r147481. This combine should only happen if there is a
single use.
rdar://11360370


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 18:47:44 +00:00
Manman Ren
ed57984483 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 18:06:23 +00:00