Commit Graph

4206 Commits

Author SHA1 Message Date
Andrew Trick
4e1fb18940 MIsched: Improve the interface to SchedDFS analysis (subtrees).
Allow the strategy to select SchedDFS. Allow the results of SchedDFS
to affect initialization of the scheduler state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173425 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 06:33:57 +00:00
Andrew Trick
178f7d08a4 MISched: Add SchedDFSResult to ScheduleDAGMI to formalize the
interface and allow other strategies to select it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173413 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 04:01:04 +00:00
Bill Wendling
e4957fb9b7 Add the heuristic to differentiate SSPStrong from SSPRequired.
The requirements of the strong heuristic are:

* A Protector is required for functions which contain an array, regardless of
  type or length.

* A Protector is required for functions which contain a structure/union which
  contains an array, regardless of type or length.  Note, there is no limit to
  the depth of nesting.

* A protector is required when the address of a local variable (i.e., stack
  based variable) is exposed. (E.g., such as through a local whose address is
  taken as part of the RHS of an assignment or a local whose address is taken as
  part of a function argument.)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173231 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-23 06:43:53 +00:00
Bill Wendling
114baee1fa Add the IR attribute 'sspstrong'.
SSPStrong applies a heuristic to insert stack protectors in these situations:

* A Protector is required for functions which contain an array, regardless of
  type or length.

* A Protector is required for functions which contain a structure/union which
  contains an array, regardless of type or length.  Note, there is no limit to
  the depth of nesting.

* A protector is required when the address of a local variable (i.e., stack
  based variable) is exposed. (E.g., such as through a local whose address is
  taken as part of the RHS of an assignment or a local whose address is taken as
  part of a function argument.)

This patch implements the SSPString attribute to be equivalent to
SSPRequired. This will change in a subsequent patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173230 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-23 06:41:41 +00:00
Michael Liao
13d08bf415 Fix an issue of pseudo atomic instruction DAG schedule
- Add list of physical registers clobbered in pseudo atomic insts
  Physical registers are clobbered when pseudo atomic instructions are
  expanded. Add them in clobber list to prevent DAG scheduler to
  mis-schedule them after these insns are declared side-effect free.
- Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173200 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-22 21:47:38 +00:00
NAKAMURA Takumi
1340833d7c llvm/test/CodeGen/X86/win_ftol2.ll: Add -cpu=generic to appease valgrind.
On valgrind the processor is reported;
  Host CPU: athlon-fx

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172983 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-20 15:40:02 +00:00
Nadav Rotem
0c8607ba6a Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172968 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-20 08:35:56 +00:00
Nadav Rotem
ba95865441 On Sandybridge split unaligned 256bit stores into two xmm-sized stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172894 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-19 08:38:41 +00:00
Nadav Rotem
48177ac90f On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172868 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-18 23:10:30 +00:00
NAKAMURA Takumi
ca81374e32 llvm/test/CodeGen/X86/Atomics-64.ll: Tweak for 2nd RUN not to overwrite %t. It sometimes causes spurious failure on lit win32.
Feel free to prune or suppress each output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172823 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-18 14:52:02 +00:00
Elena Demikhovsky
6c327f92a5 Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172708 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-17 09:59:53 +00:00
Benjamin Kramer
08219ea2b4 X86: Add patterns for X86ISD::VSEXT in registers.
Those can occur when something between the sextload and the store is on the same
chain and blocks isel. Fixes PR14887.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172353 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-13 11:37:04 +00:00
Preston Gurd
1452d46e0b Update patch for the pad short functions pass for Intel Atom (only).
Adds a check for -Oz, changes the code to not re-visit BBs,
and skips over DBG_VALUE instrs.

Patch by Andy Zhang.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172258 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11 22:06:56 +00:00
Tim Northover
5f2801bd65 Simplify writing floating types to assembly.
This removes previous special cases for each floating-point type in favour of a
shared codepath.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172189 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11 10:36:13 +00:00
NAKAMURA Takumi
0e4776ce61 llvm/test/CodeGen/X86/ms-inline-asm.ll: Fixup; Globals doesn't have leading underscore in symbol on linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172139 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 23:02:48 +00:00
Evan Cheng
4ff23d09fa PR14896: Handle memcpy from constant string where the memcpy size is larger than the string size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172124 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 22:13:27 +00:00
Chad Rosier
c1ec207b61 [ms-inline asm] Add support for calling functions from inline assembly.
Part of rdar://12991541

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172121 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 22:10:27 +00:00
Evan Cheng
78ec0255d9 Fix a DAG combine bug visitBRCOND() is transforming br(xor(x, y)) to br(x != y).
It cahced XOR's operands before calling visitXOR() but failed to update the
operands when visitXOR changed the XOR node.

rdar://12968664


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171999 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 20:56:40 +00:00
Nadav Rotem
1977675a50 add -march to the test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 07:04:23 +00:00
Nadav Rotem
13f8cf55d4 Efficient lowering of vector sdiv when the divisor is a splatted power of two constant.
PR 14848. The lowered sequence is based on the existing sequence the target-independent
DAG Combiner creates for the scalar case.

Patch by Zvi Rackover.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171953 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 05:14:33 +00:00
Preston Gurd
c7b902e7fe Pad Short Functions for Intel Atom
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171879 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08 18:27:24 +00:00
Craig Topper
f564a9389d Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior.
cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or not suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.

Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171668 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-06 20:39:29 +00:00
Evan Cheng
700843ec2c Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171665 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-06 19:00:15 +00:00
Craig Topper
835e7bc48e Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171608 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05 07:39:25 +00:00
Nadav Rotem
5d1f5c1737 Revert revision 171524. Original message:
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171603 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05 05:42:48 +00:00
Preston Gurd
dd30b47175 The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171524 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04 20:54:54 +00:00
Nadav Rotem
e12bf18754 Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message:
Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171468 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04 17:35:21 +00:00
Elena Demikhovsky
ab70320908 Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171467 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03 08:48:33 +00:00
Michael Gottesman
e33a8b8c2f Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks."
This reverts commit r171461 since it breaks the following tests:

Clang :: Analysis/outofbound-notwork.c
Clang :: Analysis/string-fail.c
Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp
Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp
Clang :: CXX/temp/temp.param/p14.cpp
Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp
Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c
Clang :: CodeGen/blocks-2.c
Clang :: CodeGen/libcalls-d.c
Clang :: CodeGen/libcalls-ld.c
Clang :: CodeGenCXX/conversion-function.cpp
Clang :: CodeGenCXX/debug-info-limit-type.cpp
Clang :: CodeGenCXX/inheriting-constructor.cpp
Clang :: FixIt/fixit-errors.c
Clang :: FixIt/fixit-pmem.cpp
Clang :: Modules/namespaces.cpp
Clang :: PCH/changed-files.c
Clang :: PCH/pr4489.c
Clang :: PCH/source-manager-stack.c
Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp
Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp
Clang :: SemaTemplate/instantiate-function-1.mm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171466 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03 08:18:30 +00:00
Craig Topper
56bc0ab095 Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171461 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03 06:40:20 +00:00
Jakob Stoklund Olesen
251ed7f3e5 Fix PR14732 by handling all kinds of IMPLICIT_DEF live ranges.
Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs
pass, and a few are reinserted by PHIElimination when a PHI argument is
<undef>.

RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look
like those created by PHIElimination, and that their live range never
leaves the basic block.

The PR14732 test case does tricks with PHI nodes that causes a longer
IMPLICIT_DEF live range to appear. This happens very rarely, but
RegisterCoalescer should be able to handle it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171435 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03 00:47:51 +00:00
Tom Stellard
d40758b24e DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:

1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.

2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171420 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 22:13:01 +00:00
Nadav Rotem
d570f59048 AVX: Fix a bug in WidenMaskArithmetic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171397 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 17:40:39 +00:00
Dmitri Gribenko
a6542923b8 Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171250 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-30 02:33:22 +00:00
Nadav Rotem
0509db2738 AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-28 05:45:24 +00:00
Nadav Rotem
d6fb53adb1 On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized
register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.

PR14657.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171148 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-27 08:15:45 +00:00
NAKAMURA Takumi
b1a3bafbf1 llvm/test/CodeGen/X86: FileCheck-ize two tests in r171083.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171084 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-26 03:19:30 +00:00
NAKAMURA Takumi
05c8fd9ee2 llvm/test/CodeGen/X86: Disable avx in two tests corresponding to r171082.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171083 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-26 03:08:55 +00:00
Benjamin Kramer
50ec431c9f Harden test so it's not affected by changes to compare lowering.
This only failed on hosts that don't have SSE41.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171066 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25 13:23:23 +00:00
Benjamin Kramer
99f78061e0 X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25 13:09:08 +00:00
Benjamin Kramer
382ed78d3f X86: Custom lower <2 x i64> eq and ne when SSE41 is not available.
pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack.
Small speedup on loop-vectorized viterbi (-march=core2).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25 12:54:19 +00:00
NAKAMURA Takumi
34cb54bea8 llvm/test/CodeGen/X86/fold-vex.ll: Add explicit triple.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-24 11:14:06 +00:00
Nadav Rotem
ace0c2fad7 Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned.
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-24 09:40:33 +00:00
Benjamin Kramer
2f8a6cdfa3 X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available.
pmuludq is slow, but it turns out that all the unpacking and packing of the
scalarized mul is even slower. 10% speedup on loop-vectorized paq8p.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170985 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-22 16:07:56 +00:00
Benjamin Kramer
17347912b4 X86: Emit vector sext as shuffle + sra if vpmovsx is not available.
Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better
than scalarized loads. Fixes PR14590.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-22 11:34:28 +00:00
Nadav Rotem
d0696ef8c3 In some cases, due to scheduling constraints we copy the EFLAGS.
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.

rdar://12896831



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 23:48:49 +00:00
Benjamin Kramer
4716cf4981 try to unbreak ppc buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170913 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 18:11:45 +00:00
Benjamin Kramer
2556c6b4b6 X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization.
Part of PR14667.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170908 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 17:46:58 +00:00
Eric Christopher
71a9c2137b Move these files over to the debug info directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170810 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 00:03:42 +00:00
Bob Wilson
99d8e76d44 Do not introduce vector operations in functions marked with noimplicitfloat.
<rdar://problem/12879313>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170630 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 01:36:20 +00:00
Elena Demikhovsky
4b977312c7 Optimized load + SIGN_EXTEND patterns in the X86 backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170506 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 07:50:20 +00:00
Craig Topper
40b4a81ab0 Teach SimplifySetCC that comparing AssertZext i1 against a constant 1 can be rewritten as a compare against a constant 0 with the opposite condition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170495 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 06:12:28 +00:00
Craig Topper
b72ae70036 Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-17 05:02:29 +00:00
Benjamin Kramer
388fc6a988 X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-15 16:47:44 +00:00
Nadav Rotem
0a1e914f8f TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170245 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-14 21:20:37 +00:00
Evan Cheng
9a65a01eeb Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170078 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-13 01:34:32 +00:00
NAKAMURA Takumi
bd85f1004d llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Fix possible typo(s) in CHECK-NOT lines.
Found by Alexander Zinenko, thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169978 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 13:34:20 +00:00
NAKAMURA Takumi
1a7b4a967d llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Rename symbols, s/test_/Test/g, not to mismatch "CHECK(-NOT): test".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169977 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 13:34:14 +00:00
NAKAMURA Takumi
2ab2421a4e llvm/test/CodeGen/X86/store_op_load_fold.ll: Fix typo, s/CHECK_NEXT/CHECK-NEXT/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169957 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 01:41:01 +00:00
NAKAMURA Takumi
87de1e72cb llvm/test/CodeGen/X86/store_op_load_fold.ll: Add explicit triple.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 01:40:56 +00:00
Manman Ren
981b96376a DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion
rdar://12838504


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169951 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 01:13:50 +00:00
Evan Cheng
61f4dfe369 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 00:42:09 +00:00
Chad Rosier
1ad9253c9d Add a triple to this test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169803 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 00:51:36 +00:00
Chandler Carruth
1c49fda408 Fix a miscompile in the DAG combiner. Previously, we would incorrectly
try to reduce the width of this load, and would end up transforming:

  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)

We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:

  movswl 6(...),%eax

Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:

  movl 6(...), %eax

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 00:36:57 +00:00
Paul Redmond
0a0990af1c move X86-specific test
This test case uses -mcpu=corei7 so it belongs in CodeGen/X86

Reviewed by: Nadav


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169801 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 00:36:43 +00:00
Chad Rosier
425e951734 Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169796 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 00:18:02 +00:00
Evan Cheng
376642ed62 Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 23:21:26 +00:00
Craig Topper
48b509c773 Teach DAG combine to handle vector add/sub with vectors of all 0s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169727 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 08:12:29 +00:00
Craig Topper
9472b4fbf9 Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169684 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-08 22:49:19 +00:00
Nadav Rotem
af59e9adbd When we use the BLEND instruction that uses the MSB as a mask, we can remove
the VSRI instruction before it since it does not affect the MSB.

Thanks Craig Topper for suggesting this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 21:43:11 +00:00
Nadav Rotem
e4ccfef809 X86: Prefer using VPSHUFD over VPERMIL because it has better throughput.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169624 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 19:01:13 +00:00
Nadav Rotem
dde785cd70 Fix a bug in the code that merges consecutive stores. Previously we did not
check if loads that happen in between stores alias with the first store in the
chain, only with the second store onwards.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 17:34:13 +00:00
Craig Topper
da92646875 Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169482 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 07:31:16 +00:00
Andrew Trick
f3329c419b RegisterPressureTracker: fix findUseBetween to handle DebugValue
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169427 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 21:37:50 +00:00
Andrew Trick
553c42cefc RegisterPresssureTracker: Track live physical register by unit.
This is much simpler to reason about, more efficient, and
fixes some corner cases involving implicit super-register defs.
Fixed rdar://12797931.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 21:37:42 +00:00
Elena Demikhovsky
226e0e6264 Simplified BLEND pattern matching for shuffles.
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 09:24:57 +00:00
Evan Cheng
4e54480531 Add x86 isel lowering logic to form bit test with inverted condition. e.g.
x ^ -1.

Patch by David Majnemer.
rdar://12755626


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169339 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 00:10:38 +00:00
Bill Wendling
9493dae613 Use the 'count' attribute to calculate the upper bound of an array.
The count attribute is more accurate with regards to the size of an array. It
also obviates the upper bound attribute in the subrange. We can also better
handle an unbound array by setting the count to -1 instead of the lower bound to
1 and upper bound to 0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 21:34:03 +00:00
Bill Wendling
a7645a3c66 Add a 'count' field to the DWARF subrange.
The count field is necessary because there isn't a difference between the 'lo'
and 'hi' attributes for a one-element array and a zero-element array. When the
count is '0', we know that this is a zero-element array. When it's >=1, then
it's a normal constant sized array. When it's -1, then the array is unbounded.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 06:20:49 +00:00
Nadav Rotem
a569a80e58 Allow merging multiple store sequences on the same chain.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169111 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-02 17:14:09 +00:00
Eli Bendersky
e469364244 Fix an invalid regex in the test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169108 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-02 15:46:02 +00:00
Andrew Trick
657b75b994 misched: Fix RegisterPressureTracker handling of DebugVals.
Assertion failed: (TopRPTracker.getPos() == RegionBegin && "bad initial Top tracker").
rdar://12790302.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169072 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:49 +00:00
Andrew Trick
177d87ac8d misched: Fix the DAG builder to handle an undef operand at ExitSU.
Assertion failed: (VNI && "No value to read by operand")
rdar://12790267.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169071 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:44 +00:00
Andrew Trick
30fe61aa35 misched: Fix LiveInterval update to better handle DebugVal.
Assertion failed: (itr != mi2iMap.end() && "Instruction not found in maps.")
rdar://12777252.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169070 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:41 +00:00
Andrew Trick
67bdd42d1e misched: fix RegionBegin when DebugValues get shuffled to the top.
assert (RemainingInstrs == 0 && "Instruction count mismatch!")

rdar://12776937.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169069 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-01 01:22:38 +00:00
Nadav Rotem
90e11dc8ad When combining consecutive stores allow loads in between the stores, if the loads do not alias.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168832 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 00:00:08 +00:00
Andrew Trick
8b1496c922 misched: Analysis that partitions the DAG into subtrees.
This is a simple, cheap infrastructure for analyzing the shape of a
DAG. It recognizes uniform DAGs that take the shape of bottom-up
subtrees, such as the included matrix multiplication example. This is
useful for heuristics that balance register pressure with ILP. Two
canonical expressions of the heuristic are implemented in scheduling
modes: -misched-ilpmin and -misched-ilpmax.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168773 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 05:13:28 +00:00
Andrew Trick
8f82a08673 misched: better alias analysis.
This fixes a hole in the "cheap" alias analysis logic implemented within
the DAG builder itself, regardless of whether proper alias analysis is
enabled. It now handles this pattern produced by LSR+CodeGenPrepare.

%sunkaddr1 = ptrtoint * %obj to i64
%sunkaddr2 = add i64 %sunkaddr1, %lsr.iv
%sunkaddr3 = inttoptr i64 %sunkaddr2 to i32*
store i32 %v, i32* %sunkaddr3

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168768 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 03:42:49 +00:00
Manman Ren
f365d3984e X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions
when the destination register is wider than the memory load.

These load instructions load from m32 or m64 and set the upper bits to zero,
while the folded instructions may accept m128.

rdar://12721174


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168710 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 18:09:26 +00:00
Craig Topper
020669d53f Revert accidental commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 08:17:04 +00:00
Craig Topper
af87dae12c Make PrintReg constructor explicit to prevent weird implicit conversions from accidentally being triggered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168686 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 08:14:24 +00:00
Craig Topper
2cf4fb4884 Add test cases for r168417.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168681 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 07:19:54 +00:00
NAKAMURA Takumi
cb84142195 llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Loosen expression corresponding to r168627. Win32 and *bsd were affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168651 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-27 00:48:27 +00:00
Chad Rosier
1243922fc1 Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary.
This pass was conservative in that it always reserved the FP to enable dynamic
stack realignment, which allowed the RA to use aligned spills for vector
registers.  This happens even when spills were not necessary.  The RA has 
since been improved to use unaligned spills when necessary.

The new behavior is to realign the stack if the frame pointer was already
reserved for some other reason, but don't reserve the frame pointer just
because a function contains vector virtual registers.

Part of rdar://12719844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168627 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-26 22:55:05 +00:00
Jakub Staszak
d642baf4be Normalize splat 256bit vectors with 8 elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168600 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-26 19:24:31 +00:00
Elena Demikhovsky
4fe5405bdd Intel OCL built-ins calling conventions now support MacOS 32-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168359 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-20 09:37:57 +00:00
Jakob Stoklund Olesen
e42561ad0c Handle mixed normal and early-clobber defs on inline asm.
PR14376.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168320 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-19 19:31:10 +00:00
NAKAMURA Takumi
e0827d8880 llvm/test/CodeGen/X86/hipe-cc*.ll: Add explicit -mcpu, or they don't expect to pass on Atom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168171 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 16:07:37 +00:00
Duncan Sands
dc7f174b5e Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 12:36:39 +00:00
Craig Topper
d577552c66 Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 06:37:56 +00:00
Jakub Staszak
0e52f46e48 Make sure to not get AVX code on an AVX-capable host. Revealed in r167967.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 22:24:01 +00:00
NAKAMURA Takumi
9292136787 llvm/test/CodeGen/X86/memset.ll: FileCheck-ize, and add another case on +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167975 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 21:01:40 +00:00
Benjamin Kramer
7c6e8cd7cc Force CPU in test so we don't accidentally get AVX code on an AVX-capable host.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167973 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 20:31:42 +00:00
Benjamin Kramer
2dbe929685 X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes.
The stack realignment code was fixed to work when there is stack realignment and
a dynamic alloca is present so this shouldn't cause correctness issues anymore.

Note that this also enables generation of AVX instructions for memset
under the assumptions:
- Unaligned loads/stores are always fast on CPUs supporting AVX
- AVX is not slower than SSE
We may need some tweaked heuristics if one of those assumptions turns out not to
be true.

Effectively reverts r58317. Part of PR2962.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 20:08:40 +00:00
Rafael Espindola
8e2b8ae3b1 Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 05:08:56 +00:00
Eric Christopher
242343d1ab Revert "Use the 'count' attribute instead of the 'upper_bound' attribute."
temporarily as it is breaking the gdb bots.

This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167886 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 23:30:43 +00:00
Manman Ren
2adc503f29 X86: when constructing VZEXT_LOAD from other loads, makes sure its output
chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://12684358


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167859 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 19:13:05 +00:00
Bill Wendling
e7ff4c14b1 Use the 'count' attribute instead of the 'upper_bound' attribute.
If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the
same for both of them because we use the 'upper_bound' attribute. Instead use
the 'count' attrbute, which gives the correct number of elements in the array.
<rdar://problem/12566646>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167806 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 02:31:47 +00:00
Michael Liao
01c6de341c Fix test case added in patch fixing PR14314
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167769 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 22:33:18 +00:00
Michael Liao
dd3383fd09 Fix PR14314
- Fix operand order for atomic sub, where the minuend is the value
  loaded from memory and the subtrahend is the parameter specified.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167718 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-12 06:49:17 +00:00
Craig Topper
9c7ae01f39 Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167652 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 01:23:36 +00:00
Michael Liao
be02a90de1 Add support of RTM from TSX extension
- Add RTM code generation support throught 3 X86 intrinsics:
  xbegin()/xend() to start/end a transaction region, and xabort() to abort a
  tranaction region



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167573 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-08 07:28:54 +00:00
Andrew Trick
3b87f6204f misched: Heuristics based on the machine model.
misched is disabled by default. With -enable-misched, these heuristics
balance the schedule to simultaneously avoid saturating processor
resources, expose ILP, and minimize register pressure. I've been
analyzing the performance of these heuristics on everything in the
llvm test suite in addition to a few other benchmarks. I would like
each heuristic check to be verified by a unit test, but I'm still
trying to figure out the best way to do that. The heuristics are still
in considerable flux, but as they are refined we should be rigorous
about unit testing the improvements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167527 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-07 07:05:09 +00:00
NAKAMURA Takumi
b75111f1ef test/CodeGen/X86/fp-fast.ll: Add +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-01 02:13:45 +00:00
Owen Anderson
607ebde651 Add a few more simple fast-math constant propagations and cancellations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-01 02:00:53 +00:00
Shuxin Yang
a5526a9bff (For X86) Enhancement to add-carray/sub-borrow (adc/sbb) optimization.
The adc/sbb optimization is to able to convert following expression
into a single adc/sbb instruction:
  (ult) ... = x + 1 // where the ult is unsigned-less-than comparison
  (ult) ... = x - 1

  This change is to flip the "x >u y" (i.e. ugt comparison) in order 
to expose the adc/sbb opportunity.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167180 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 23:11:48 +00:00
Manman Ren
dfd0b9b460 X86 SSE: update rsqrtss and rcpss to use two source operands and
the first source operand is tied to the destination operand.

This is to accurately model the corresponding instructions where the upper
bits are unmodified.

rdar://12558838
PR14221


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 23:53:59 +00:00
Manman Ren
4c74a956b2 X86 MMX: optimize transfer from mmx to i32
We used to generate a store (movq) + a load.
Now we use movd.

rdar://9946746


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 22:15:38 +00:00
Jakub Staszak
a24262a0f5 Re-commit r166971. I reverted it to quickly, when buildbots didn't have a chance
to test it with chapni's fix (-mattr=+avx).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166985 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 00:01:57 +00:00
Jakub Staszak
c1ed096b6b Revert r166971. It causes buildbot failure. To be investigated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166979 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 23:13:50 +00:00
NAKAMURA Takumi
926dd447f1 llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166974 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 22:45:18 +00:00
Jakub Staszak
6d317824a5 Allow to fold vector load if there is more than one bitcast, so in the case:
%0 = load <8 x i16>* %dest
%1 = shufflevector <8 x i16> %0, <8 x i16> %in,
      <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14>
store <8 x i16> %1, <8 x i16>* %dest

We get:
  vmovlpd (%eax), %xmm0, %xmm0

instead of:
  vmovaps (%eax), %xmm1
  vmovsd  %xmm1, %xmm0, %xmm0

No extra test-case is added. I just fixed the existing one
(also it uses FileCheck now).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166971 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 21:56:35 +00:00
Chad Rosier
53e216b304 Remove redundant test case from r166949, per Eli's suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166953 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:18:26 +00:00
Chad Rosier
2fbc239e4f [ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is
equivalent to [expr1 + expr2].  See test cases for more examples.
rdar://12470392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166949 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 18:01:54 +00:00
Michael Liao
2a2263e744 Fix PR14204
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166947 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 17:57:12 +00:00
Preston Gurd
a836563e32 This patch addresses a problem with the Post RA scheduler generating an
incorrect instruction sequence due to it not being aware that an
inline assembly instruction may reference memory.

This patch fixes the problem by causing the scheduler to always assume that any
inline assembly code instruction could access memory. This is necessary because
the internal representation of the inline instruction does not include
any information about memory accesses.
 
This should fix PR13504.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-29 15:01:23 +00:00
Jakob Stoklund Olesen
163f67f4d9 Never attempt to join an early-clobber def with a regular kill.
This fixes PR14194.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166880 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 17:41:27 +00:00
Michael Liao
8d7cd1d8fc Add test for ATOM ISA SSSE3
- Remove SSE4.1 feature in other ATOM-based test cases



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 17:50:05 +00:00
Elena Demikhovsky
c0cd72204d The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, may be it will fix the problem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166669 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 08:38:42 +00:00
Chad Rosier
b3009eec47 [ms-inline asm] Add back-end test case for r166632. Make sure we emit the
correct .s output as well as get the correct encoding by the integrated
assembler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 23:10:28 +00:00
Elena Demikhovsky
3575222175 Special calling conventions for Intel OpenCL built-in library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166566 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 14:46:16 +00:00
Michael Liao
1a5cc710ee Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166546 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:14:18 +00:00
Michael Liao
991b6a22b6 Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166545 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 04:09:32 +00:00
Rafael Espindola
847a9c6d77 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166536 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 01:58:48 +00:00
Michael Liao
0787274b70 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 21:40:15 +00:00
Michael Liao
d9d09600ee Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:34:00 +00:00
Michael Liao
facace808c Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 17:15:18 +00:00
Sebastian Pop
83ba06afa8 Clear unknown mem ops when merging stack slots (pr14090)
When merging stack slots, if StackColoring::remapInstructions gets a
value back from GetUnderlyingObject that it does not know about or is
not itself a stack slot, clear the memory operand in case it aliases
the merged slot. This prevents the introduction of incorrect aliasing
information.

Author:    Matthew Curtis <mcurtis@codeaurora.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166216 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 19:53:48 +00:00
Nadav Rotem
1c5bf3f429 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 18:06:48 +00:00
Michael Liao
07edaf3801 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:45:54 +00:00
Michael Liao
6e0c2b36d6 Disable extract-concat test case temporarily
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 23:08:19 +00:00
Michael Liao
4031e9018b Revert r166049
- In general, it's unsafe for this transformation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166135 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 22:41:15 +00:00
Michael Liao
13429e224c Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166125 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 20:48:33 +00:00
Michael Liao
281ae5abf5 Fix setjmp on models with non-Small code model nor non-Static relocation model
- MBB address is only valid as an immediate value in Small & Static
  code/relocation models. On other models, LEA is needed to load IP address of
  the restore MBB.
- A minor fix of MBB in MC lowering is added as well to enable target
  relocation flag being propagated into MC.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166084 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-17 02:22:27 +00:00
Jakob Stoklund Olesen
320db3f805 Avoid rematerializing a redef immediately after the old def.
PR14098 contains an example where we would rematerialize a MOV8ri
immediately after the original instruction:

  %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7
  %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7

Besides being pointless, it is also wrong since the original instruction
only redefines part of the register, and the value read by the new
instruction is wrong.

The problem was the LiveRangeEdit::allUsesAvailableAt() didn't
special-case OrigIdx == UseIdx and found the wrong SSA value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166068 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen
cdcdfd2cab Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"
A fix for PR14098, including the test case is in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 22:51:55 +00:00
Michael Liao
272ea03239 Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166049 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:38:35 +00:00
Rafael Espindola
6f7cccd2e2 Switch back to the old coalescer for now to fix the 32 bit bit
llvm+clang+compiler-rt bootstrap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 19:34:06 +00:00
NAKAMURA Takumi
e26874556b Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".

Original message since r165661:

My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the *ORIGINAL* condition code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166017 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 06:28:34 +00:00
Rafael Espindola
0cead1cfef Fix the cpu name and add -verify-machineinstrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166003 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 01:13:06 +00:00
Andrew Trick
27c28cef11 misched: Added handleMove support for updating all kill flags, not just for allocatable regs.
This is a medium term workaround until we have a more robust solution
in the form of a register liveness utility for postRA passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 00:22:51 +00:00
Michael Liao
6c0e04c823 Add __builtin_setjmp/_longjmp supprt in X86 backend
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
  used as a light-weight replacement of setjmp/longjmp which are used to
  implementation continuation, user-level threading, and etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling as zero-cost DWARF exception handling is used by default
  in X86.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 22:39:43 +00:00
Andrew Trick
874c0a6ec7 Check output of the misched unit tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 20:33:14 +00:00
Rafael Espindola
32522201a8 Add a cpu to try to fix the atom builder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:25:43 +00:00
Rafael Espindola
3da6f0392a Add testcase for pr14088.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 19:00:10 +00:00
Andrew Trick
bb20b24224 misched tests: add a triple to speculatively fix windows builders.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165952 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:21:08 +00:00
Andrew Trick
1e94e98b0e misched: ILP scheduler for experimental heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165950 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 18:02:27 +00:00
Benjamin Kramer
f8b65aaf39 X86: Fix accidentally swapped operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165871 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 12:50:19 +00:00
Benjamin Kramer
444dccecfc X86: Promote i8 cmov when both operands are coming from truncates of the same width.
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.

I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165868 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-13 10:39:49 +00:00
Jakob Stoklund Olesen
2bbb07d13c Fix buildbots: -misched=shuffle is only available in +Asserts builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165846 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:01:33 +00:00
Jakob Stoklund Olesen
ad5e969ba9 Use a transposed algorithm for handleMove().
Completely update one interval at a time instead of collecting live
range fragments to be updated. This avoids building data structures,
except for a single SmallPtrSet of updated intervals.

Also share code between handleMove() and handleMoveIntoBundle().

Add support for moving dead defs across other live values in the
interval. The MI scheduler can do that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen
795f951c6d Fix coalescing with IMPLICIT_DEF values.
PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all
PHI predecessors have a live-out value. These IMPLICIT_DEF values are
not considered to be real interference when coalescing virtual
registers:

  %vreg1 = IMPLICIT_DEF
  %vreg2 = MOV32r0

When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its
value number should simply be erased since the %vreg2 value number now
provides a live-out value for the PHI predecesor block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165813 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 18:03:04 +00:00
Jakob Stoklund Olesen
ebba49395c Pass an explicit operand number to addLiveIns.
Not all instructions define a virtual register in their first operand.
Specifically, INLINEASM has a different format.

<rdar://problem/12472811>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 16:46:07 +00:00
NAKAMURA Takumi
e0297196ed Revert r165661, "Patch by Shuxin Yang <shuxin.llvm@gmail.com>."
It broke stage2 clang and test-suite/MultiSource/Benchmarks/mediabench/g721/g721encode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165692 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 02:02:05 +00:00
Nadav Rotem
87255a431b Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165661 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 21:31:55 +00:00
Michael Liao
4e2c56bdcb Specify CPU model to avoid breaking ATOM builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 18:04:52 +00:00
Michael Liao
44c2d61b67 Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165631 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 16:53:28 +00:00
Evan Cheng
e61e516a51 When expanding atomic load arith instructions, do not lose target flags. rdar://12453106
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165568 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-09 23:48:33 +00:00
Jakob Stoklund Olesen
6be75ae196 Don't crash on extra evil irreducible control flow.
When the CFG contains a loop with multiple entry blocks, the traces
computed by MachineTraceMetrics don't always have the same nice
properties. Loop back-edges are normally excluded from traces, but
MachineLoopInfo doesn't recognize loops with multiple entry blocks, so
those back-edges may be included.

Avoid asserting when that happens by adding an isEarlierInSameTrace()
function that accurately determines if a dominating block is part of the
same trace AND is above the currrent block in the trace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165434 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 22:06:44 +00:00
Benjamin Kramer
dcf2420b07 X86: fcmov doesn't handle all possible EFLAGS, fall back to a branch for the others.
Otherwise it will try to use SSE patterns and fail horribly if sse is disabled.
Fixes PR14035.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165377 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-07 15:34:27 +00:00
Evan Cheng
2a2947885a Follow up to r165072. Try a different approach: only move the load when it's going to be folded into the call. rdar://12437604
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165287 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-05 01:48:22 +00:00
Nadav Rotem
ea2c50c041 When merging connsecutive stores, use vectors to store the constant zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165267 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-04 22:35:15 +00:00
Chad Rosier
34448ae393 [ms-inline asm] Add support in the X86AsmPrinter for printing memory references
in the Intel syntax.

The MC layer supports emitting in the Intel syntax, but this would require the
inline assembly MachineInstr to be lowered to an MCInst before emission.  This
is potential future work, but for now emitting directly from the MachineInstr
suffices.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165173 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 22:06:44 +00:00
Nadav Rotem
2e7d38192d Fix a cycle in the DAG. In this code we replace multiple loads with a single load and
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165148 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 19:30:31 +00:00
Nadav Rotem
c653de6c0f A DAGCombine optimization for mergeing consecutive stores to memory. The optimization
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:

1. Store of multiple consecutive constants:
  q->a = 3;
  q->4 = 5;
In this case we store a single legal wide integer.

2. Store of multiple consecutive loads:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165125 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 16:11:15 +00:00
Jakob Stoklund Olesen
0d141f867d The early if conversion pass is ready to be used as an opt-in.
Enable the pass by default for targets that request it, and change the
-enable-early-ifcvt to the opposite -disable-early-ifcvt.

There are still some x86 regressions when enabling early if-conversion
because of the missing machine models. Disable the pass for x86 until
machine models are added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165075 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-03 00:51:32 +00:00
Evan Cheng
2b87e06d26 Fix a serious X86 instruction selection bug. In
X86DAGToDAGISel::PreprocessISelDAG(), isel is moving load inside
callseq_start / callseq_end so it can be folded into a call. This can
create a cycle in the DAG when the call is glued to a copytoreg. We
have been lucky this hasn't caused too many issues because the pre-ra
scheduler has special handling of call sequences. However, it has
caused a crash in a specific tailcall case.

rdar://12393897


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165072 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 23:49:13 +00:00
Nick Lewycky
b09649b335 Make sure to put our sret argument into %rax on x86-64. Fixes PR13563!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 22:45:06 +00:00
Duncan Sands
48da0be8b5 Fix PR13991: legalizing an overflowing multiplication operation is harder than
the add/sub case since in the case of multiplication you also have to check that
the operation in the larger type did not overflow.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165017 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 15:03:49 +00:00
NAKAMURA Takumi
3fc42fd77c test/CodeGen/X86/red-zone2.ll: Add -mtriple=x86_64-linux, and FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164975 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-01 22:48:07 +00:00
Michael Liao
3118937305 Fix PR13899
- Update maximal stack alignment when stack arguments are prepared before a
  call.
- Test cases are enhanced to show it's not a Win32 specific issue but a generic
  one.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164946 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-01 16:44:04 +00:00
Nadav Rotem
73fab91f2c Revert r164910 because it causes failures to several phase2 builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164911 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-30 07:17:56 +00:00
Nadav Rotem
e5f163a3b9 A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
 int a = p->a;
 int b = p->b;
 q->a = a;
 q->b = b;

2. Consecutive stores where the values are constants. Foe example:
 q->a = 4;
 q->b = 5;



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164910 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-30 06:24:14 +00:00
Duncan Sands
454627252b Speculatively revert commit 164885 (nadav) in the hope of ressurecting a pile of
buildbots.  Original commit message:

A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;

2. Consecutive stores where the values are constants. Foe example:
  q->a = 4;
  q->b = 5;



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164890 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-29 10:25:35 +00:00
Nadav Rotem
72f7b0811e A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;

2. Consecutive stores where the values are constants. Foe example:
  q->a = 4;
  q->b = 5;



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164885 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-29 06:33:25 +00:00
Evan Cheng
465970736b Do not delete BBs if their addresses are taken. rdar://12396696
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164866 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-28 23:58:57 +00:00
Manman Ren
284c1004f7 Testcase for r164835
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164842 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-28 20:26:33 +00:00
Jakob Stoklund Olesen
ddc26d8936 Avoid dereferencing a NULL pointer.
Fixes PR13943.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-27 16:34:19 +00:00
NAKAMURA Takumi
e6232ed86d llvm/test/CodeGen/X86/mulx*.ll: Fix copypasto.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164681 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 09:24:12 +00:00
Michael Liao
4fa2ddbb94 Add SARX/SHRX/SHLX code generation support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164675 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 08:26:25 +00:00
Michael Liao
6bcdb5b903 Add RORX code generation support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164674 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 08:24:51 +00:00
Michael Liao
0832a72a66 Add MULX code generation support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164673 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 08:22:37 +00:00
Michael Liao
e5e8f7656a Add missing i64 max/min/umax/umin on 32-bit target
- Turn on atomic6432.ll and add specific test case as well



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-25 18:08:13 +00:00
Evan Cheng
b1cacc7423 Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164588 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-25 05:32:34 +00:00
Jim Grosbach
e9525d8624 Mark jump tables in code sections with DataRegion directives.
Even out-of-line jump tables can be in the code section, so mark them
as data-regions for those targets which support the directives.

rdar://12362871&12362974

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164571 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-24 23:06:27 +00:00
Michael Liao
90d9e9405e Revise test to avoid using of 'grep'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164472 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-23 02:41:47 +00:00
Michael Liao
4d6b84028a Enhance test case of atomic16 to verify inst encoding fixed in r164453.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164465 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-22 21:07:59 +00:00
NAKAMURA Takumi
f17422181f llvm/test/CodeGen/X86/pr5145.ll: Tweak expressions to match for darwin target.
.LBB0_1: # Linux
LBB0_1:  # Darwin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164362 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 05:19:19 +00:00
Michael Liao
fe87c302aa Add missing i8 max/min/umax/umin support
- Fix PR5145 and turn on test 8-bit atomic ops



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164358 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-21 03:18:52 +00:00
Benjamin Kramer
902ba576fd Fix broken check lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164317 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 19:54:13 +00:00