Commit Graph

6360 Commits

Author SHA1 Message Date
Evan Cheng
e9c1e07c5f Add test for r146163.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146167 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-08 19:21:39 +00:00
Daniel Dunbar
3b0887e291 Revert r146143, "Fix bug 9905: Failure in code selection for llvm intrinsics
sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP,
FEXP2).", it is failing tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146157 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-08 17:32:18 +00:00
NAKAMURA Takumi
e4472726b5 test/CodeGen/X86/vec_compare-2.ll: Add explicit -mtriple=i686-linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146152 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-08 15:24:09 +00:00
Nadav Rotem
44bac7cd65 Fix a bug in the integer-promotion of bitcast operations on vector types.
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146150 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-08 13:10:01 +00:00
Stepan Dyatkovskiy
72590c9738 Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146143 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-08 07:55:03 +00:00
Akira Hatanaka
0a18cdc372 32 to 64-bit zext pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146096 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-07 23:14:41 +00:00
Akira Hatanaka
2c78be01f6 64-bit WrapperPICPat patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146086 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-07 22:11:43 +00:00
Akira Hatanaka
7398bf01c2 Modify LowerFCOPYSIGN to handle Mips64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146080 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-07 21:48:50 +00:00
Akira Hatanaka
4d0eb637f0 Fix 64-bit immediate patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146059 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-07 20:10:24 +00:00
Eli Friedman
f91abd22be Support vector bitcasts in the AsmPrinter. PR11495.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146001 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-07 00:50:54 +00:00
Eli Friedman
26323442d5 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145996 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-07 00:11:56 +00:00
Hal Finkel
099730dfb7 delaying restore-cr changed assigned registers in some tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145963 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 20:55:46 +00:00
Hal Finkel
327ca3a753 add a test case that uses RESTORE_CR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145962 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 20:55:41 +00:00
Justin Holewinski
4c7ffb6a7e PTX: Continue to fix up the register mess.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145947 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 17:39:48 +00:00
Craig Topper
cb6bd11bd6 Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145927 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 09:04:59 +00:00
Craig Topper
1ff73d7a67 Merge isSHUFPMask and isCommutedSHUFPMask into single function that can do both. Do the same for the 256-bit version. Use loops to reduce size of isVSHUFPYMask. Fix test cases that were incorrectly passing due to isCommutedSHUFPMask not checking for the vector being 128-bit. This caused some 256-bit shuffles to be incorrectly commuted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145921 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 04:59:07 +00:00
Chad Rosier
ed42c5f778 [arm-fast-isel] Doublewords only require word-alignment.
rdar://10528060

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145891 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 01:44:17 +00:00
Jakob Stoklund Olesen
3e572ac2fb Align ARM constant pool islands via their basic block.
Previously, all ARM::CONSTPOOL_ENTRY instructions had a hardwired
alignment of 4 bytes emitted by ARMAsmPrinter.  Now the same alignment
is set on the basic block.

This is in preparation of supporting ARM constant pool islands with
different alignments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145890 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 01:43:02 +00:00
Akira Hatanaka
d6bc5237d8 Add definitions of 64-bit extract and insert instrucions and make
PerformANDCombine and PerformOrCombine aware of them. Test cases are included
too.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145853 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 21:26:34 +00:00
Akira Hatanaka
2bf08ec854 Have LowerJumpTable support Mips64. Modify 2010-07-20-Switch.ll to test N64 and
O32 with relocation-model=pic too.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145850 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 21:03:03 +00:00
Hal Finkel
fef3f9aed3 Add test case - this input used to crash because of duplicate generation of SPILL_CRs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145820 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 17:55:22 +00:00
Hal Finkel
3fd0018af1 enable PPC register scavenging by default (update tests and remove some FIXMEs)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145819 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 17:55:17 +00:00
Hal Finkel
c4785181a1 remove wasted space for extra bit copies of CR2 subregs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145817 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 17:55:06 +00:00
NAKAMURA Takumi
27de2a54f3 test/CodeGen/X86/pointer-vector.ll: Add explicit -mtriple=i686-linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145805 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 07:54:57 +00:00
Nadav Rotem
1608769abe Add support for vectors of pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145801 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-05 06:29:09 +00:00
Anton Korobeynikov
0cb2a45cce Emit the ctors in the proper order on ARM/EABI.
Maybe some targets should use this as well.

Patch by Evgeniy Stepanov!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145781 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-03 23:49:37 +00:00
Venkatraman Govindaraju
80b1ae9292 Sparc CodeGen: Fix AnalyzeBranch for PR 10282. Removing addSuccessor() since
AnalyzeBranch doesn't change the successor, just the order.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145779 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-03 21:24:48 +00:00
Sanjoy Das
199ce33b3b Check for stack space more intelligently.
libgcc sets the stack limit field in TCB to 256 bytes above the actual
allocated stack limit.  This means if the function's stack frame needs
less than 256 bytes, we can just compare the stack pointer with the
stack limit.  This should result in lesser calls to __morestack.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145766 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-03 09:32:07 +00:00
Sanjoy Das
40f8222e1e Fix a bug in the x86-32 code generated for segmented stacks.
Currently LLVM pads the call to __morestack with a add and sub of 8
bytes to esp.  This isn't correct since __morestack expects the call
to be followed directly by a ret.

This commit also adjusts the relevant test-case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145765 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-03 09:21:07 +00:00
Chad Rosier
9eff1e33f6 [arm-fast-isel] Unaligned stores of floats require special care.
rdar://10510150

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145742 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-03 02:21:57 +00:00
Akira Hatanaka
99f50fb3ee Test cases for 64-bit multiplication and division.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145717 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-02 22:31:36 +00:00
Akira Hatanaka
fa341d919f Fix test cases to use FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145716 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-02 22:28:09 +00:00
Chad Rosier
b74c865841 [arm-fast-isel] After promoting a function parameter be sure to update the
argument value type.  Otherwise, the sign/zero-extend has no effect on arguments
passed via the stack (i.e., undefined high-order bits).
rdar://10515467

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145701 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-02 20:25:18 +00:00
Hal Finkel
427876757f specify cpu for test to fix failure on some darwin systems with a g4+ cpu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145699 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-02 19:38:17 +00:00
Craig Topper
138a5c66b9 Add instruction selection support for horizontal add/sub of 256-bit floating point vectors. Also add the test case for 256-bit integer vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145680 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-02 07:16:01 +00:00
Hal Finkel
2457544630 adjust the instruction ordering in some PPC tests: changes due to postRA haz. rec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145678 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-02 04:58:12 +00:00
Eric Christopher
7d5a61e975 For 64-bit the rest of the general regs are ok for the q constraint. Make
sure we can emit both the high and low versions of those registers.

Fixes rdar://10392864

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145579 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-01 08:12:41 +00:00
Eli Friedman
522fb8cc01 Pass AVX vectors which are arguments to varargs functions on the stack. <rdar://problem/10463281>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145573 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-01 04:49:21 +00:00
Jan Sjödin
dd649e35e5 Support for encoding all FMA4 instructions and tablegen patterns for all
remaining FMA4 instructions and intrinsics with tests.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145525 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 22:09:42 +00:00
Eli Friedman
3dad610aaa Make GlobalMerge honor the preferred alignment on globals without an explicitly specified alignment.
<rdar://problem/10497732>.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145523 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 21:54:15 +00:00
Nadav Rotem
78647434ea Add test arch to make it pass on non x86 targets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 17:34:28 +00:00
Nadav Rotem
f3993125b1 Add a tripple to the test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145489 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 11:20:56 +00:00
Nadav Rotem
18197d7425 X86: PerformOrCombine introduced a vselect node with a wrong order of operands. This bug was introduced when a dedicated blend sdnode was replaced with the vselect node (in 139479).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145488 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 10:13:37 +00:00
Jakob Stoklund Olesen
7c6b2c9a70 FileCheckize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145452 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 23:09:16 +00:00
Akira Hatanaka
ed2a7d2780 Change names for MIPS "generic" processors defined in Mips.td to match what GNU
tools use. Patch by Simon Atanasyan.

"mips32r1" => "mips32"
"4ke" => mips32r2"
"mips64r1" => "mips64"



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145451 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 23:08:41 +00:00
Evan Cheng
a3438cf48b Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145448 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 22:48:34 +00:00
Jakob Stoklund Olesen
0edd83bfff Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.

This also makes the AVX variants redundant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145440 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 22:27:25 +00:00
Chad Rosier
ae6f2cb1fc If fast-isel fails, remove dead instructions generated during the failed
attempt.  

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145425 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 19:40:47 +00:00
Elena Demikhovsky
f68b214e2d Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
Added a test.
Thanks Bruno for reviewing the patch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145403 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 15:00:45 +00:00
Craig Topper
f267972d28 Fix shuffle decoding for memory forms for (V)SHUFPS/D.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145392 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 07:58:09 +00:00
Craig Topper
36e36ace77 Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145390 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 07:49:05 +00:00
Craig Topper
fe2a6c584a Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145376 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 05:37:58 +00:00
Craig Topper
108126cfc6 Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145370 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 03:57:34 +00:00
Evan Cheng
ed1c0c7f58 Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively returns zero when the GV does not specify an alignment nor is it
initialized. Previously it returns ABI alignment for type of the GV. However, if
the type is a "packed" type, then the under-specified alignments is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145300 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-28 22:37:34 +00:00
Evan Cheng
1c487869f5 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145273 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-28 20:42:56 +00:00
Craig Topper
70b883b3a7 Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145238 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-28 10:14:51 +00:00
Chandler Carruth
fac1305da1 Take two on rotating the block ordering of loops. My previous attempt
was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.

The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.

This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145183 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 13:34:33 +00:00
Chandler Carruth
2eb5a744b1 Rework a bit of the implementation of loop block rotation to not rely so
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.

The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 09:22:53 +00:00
Chris Lattner
3211c6e31b remove autoupgrade support for old forms of llvm.prefetch and the old
trampoline forms.  Both of these were correct in LLVM 3.0, and we don't
need to support LLVM 2.9 and earlier in mainline.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145174 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 07:42:04 +00:00
Chris Lattner
d2bf432b2b Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145171 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 06:54:59 +00:00
Chris Lattner
663aebf8d6 remove some old autoupgrade logic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145167 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 06:10:54 +00:00
Chandler Carruth
2e38cf961d Introduce a loop block rotation optimization to the new block placement
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.

This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145158 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 00:38:03 +00:00
Eli Friedman
4455142a95 Fix APFloat::convert so that it handles narrowing conversions correctly; it
was returning incorrect values in rare cases, and incorrectly marking
exact conversions as inexact in some more common cases. Fixes PR11406, and a
missed optimization in test/CodeGen/X86/fp-stack-O0.ll.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145141 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-26 03:38:02 +00:00
Bruno Cardoso Lopes
1b9b377975 This patch contains support for encoding FMA4 instructions and
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.

Patch by Jan Sjodin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145133 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-25 19:33:42 +00:00
Craig Topper
705f2431a0 Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type disinquish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145126 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 22:57:10 +00:00
Chandler Carruth
4aae4f9007 Fix a silly use-after-free issue. A much earlier version of this code
need lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.

Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145120 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 11:23:15 +00:00
Chandler Carruth
a2deea1dcf When adding blocks to the list of those which no longer have any CFG
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.

I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145119 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 08:46:04 +00:00
Benjamin Kramer
f238f50aaf X86: Use btq for bit tests if the immediate can't be encoded in 32 bits.
Before:
	movabsq	$4294967296, %rax       ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00]
	testq	%rax, %rdi              ## encoding: [0x48,0x85,0xf8]
	jne	LBB0_2                  ## encoding: [0x75,A]

After:
	btq	$32, %rdi               ## encoding: [0x48,0x0f,0xba,0xe7,0x20]
	jb	LBB0_2                  ## encoding: [0x72,A]

btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off
saving one register and a giant movabsq.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145103 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 13:54:17 +00:00
NAKAMURA Takumi
e4513b1fc5 test/CodeGen/X86/block-placement.ll: Add explicit -mtriple=i686-linux. X86 Win32 CodeGen does not support EH yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145101 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 12:18:22 +00:00
Chandler Carruth
598894ff25 Relax an invariant that block placement was trying to assert a bit
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resillient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.

This was found during a bootstrap with block placement turned on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145100 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 10:35:36 +00:00
Elena Demikhovsky
52a35a89e6 I added several lines in X86 code generator that allow to choose
VSHUFPS/VSHUFPD instructions while lowering VECTOR_SHUFFLE node. I check a commuted VSHUFP mask.

The patch was reviewed by Bruno.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145099 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 10:23:16 +00:00
Chandler Carruth
521fc5bcd7 Handle the case of a no-return invoke correctly. It actually still has
successors, they just are all landing pad successors. We handle this the
same way as no successors. Comments attached for the next person to wade
through here and another lovely test case courtesy of Benjamin Kramer's
bugpoint reduction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145098 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 08:23:54 +00:00
Bob Wilson
23d66a58b7 Enable stack protectors for all arrays, not just char arrays. rdar://5875909
Patch by Bill Wendling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145097 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen
7f5e43f61d Fix PR11422.
This was a bug in keeping track of the available domains when merging
domain values.

The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.

Also add an assertion to catch future attempts at emitting AVX2
instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145096 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 04:03:08 +00:00
Chandler Carruth
47fb954f74 Fix a crash in block placement due to an inner loop that happened to be
reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145094 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-23 03:03:21 +00:00
Hal Finkel
768c65f677 add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145065 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-22 16:21:04 +00:00
Chandler Carruth
3b7b209bf8 Fix a devilish miscompile exposed by block placement. The
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad accessor, and to assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145062 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-22 13:13:16 +00:00
Rafael Espindola
fdb00a9bdb Add triple to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145057 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-22 06:36:25 +00:00
Rafael Espindola
254a13282c If a register is both an early clobber and part of a tied use, handle the use
before the clobber so that we copy the value if needed.

Fixes pr11415.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145056 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-22 06:27:18 +00:00
Craig Topper
6fa583d787 Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145028 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 08:26:50 +00:00
Craig Topper
3b73312020 Test case for r145026
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145027 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 06:58:09 +00:00
Craig Topper
a124f94952 Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145022 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 01:12:36 +00:00
NAKAMURA Takumi
742e5cf612 test/CodeGen/X86/block-placement.ll: Relax expressions for Win32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145011 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-20 12:49:45 +00:00
Chandler Carruth
b0dadb9dd5 The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145010 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-20 11:22:06 +00:00
Chandler Carruth
2901243fda Add some comments to the latest test case I added here to document what
is actually being tested. Also add some FileCheck goodness to much more
carefully ensure that the result is the desired result. Before this test
would only have failed through an assert failure if the underlying fix
were reverted.

Also, add some weight metadata and a comment explaining exactly what is
going on to a trick section of the test case. Originally, we were
getting very unlucky and trying to form a block chain that isn't
actually profitable. I'm working on a fix to avoid forming these
unprofitable chains, and that would also have masked any failure from
this test case. The easy solution is to add some metadata that makes it
*really* profitable to form the bad chain here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145006 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-20 09:30:40 +00:00
Craig Topper
0d86d462f8 Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145005 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-20 00:12:05 +00:00
Craig Topper
745a86bac9 Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145004 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 22:34:59 +00:00
Chandler Carruth
03300ecaee Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can caused major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144994 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 10:26:02 +00:00
Craig Topper
6bf57b0272 Test cases for SSSE3/AVX integer horizontal add/sub.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144990 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 09:03:33 +00:00
Craig Topper
1666cb6d63 Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144987 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:07:26 +00:00
Nadav Rotem
cbbe33fde4 Add AVX2 vpbroadcast support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144967 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-18 02:49:55 +00:00
Devang Patel
ce35d8b5a1 DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or ad DISignedSubrange.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144937 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-17 23:43:15 +00:00
Chad Rosier
478b06c980 When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144886 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-17 07:15:58 +00:00
Eli Friedman
4db4addcd4 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144863 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 23:50:22 +00:00
Evan Cheng
2b89498979 Another missing X86ISD::MOVLPD pattern. rdar://10450317
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144839 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 22:24:44 +00:00
Evan Cheng
c3aa7c5c5a Disable expensive two-address optimizations at -O0. rdar://10453055
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144806 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 18:44:48 +00:00
Eli Friedman
ee94dc212e Fix testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144769 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 03:03:52 +00:00
Eli Friedman
d577df8e5a CONCAT_VECTORS can have more than two operands. PR11389.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144768 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 02:52:39 +00:00
Nadav Rotem
f8c10e5cb1 AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144720 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 22:50:37 +00:00
NAKAMURA Takumi
ec0af2f4e1 test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144714 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 22:30:37 +00:00
Pete Cooper
2d49689793 Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144705 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 21:57:53 +00:00
Rafael Espindola
6c5b2dcd83 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144674 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 19:08:46 +00:00
Jakob Stoklund Olesen
f805a7c25c Revert r144611 and r144613.
These tests are actually correct, clang was miscompiling ExeDepsFix::processUses.

Evan fixed the miscompilation in r144628.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144630 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 07:13:03 +00:00
Chandler Carruth
3273c8937b Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144627 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 06:26:43 +00:00
Craig Topper
4c077a1f04 Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144622 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 05:55:35 +00:00
Jakob Stoklund Olesen
ff70467aa2 Really fix test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144613 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 03:17:01 +00:00
Jakob Stoklund Olesen
3c84ec070a Allow for depencendy-breaking instructions before cvt*.
This should unbreak clang-x86_64-darwin10-RA, but I can't actually
reproduce the failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144611 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 02:29:48 +00:00
Evan Cheng
eaa192af18 Add vmov.f32 to materialize f32 immediate splats which cannot be handled by
integer variants. rdar://10437054


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144608 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 02:12:34 +00:00
Jakob Stoklund Olesen
c2ecf3efbf Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previoius N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144602 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 01:15:30 +00:00
Jim Grosbach
ffc658b056 ARM VLDR/VSTR instructions don't need a size suffix.
Canonicallize on the non-suffixed form, but continue to accept assembly that
has any correctly sized type suffix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144583 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 23:03:21 +00:00
Chad Rosier
e91da1baa1 Add newline to end of file. Thanks, Eli.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144579 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 22:48:33 +00:00
Chad Rosier
909cb4f2f2 Add support for inlining small memcpys.
rdar://10412592


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144578 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 22:46:17 +00:00
Chad Rosier
e489af8dce Fix a performance regression from r144565. Positive offsets were being lowered
into registers, rather then encoded directly in the load/store.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144576 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 22:34:48 +00:00
Evan Cheng
76c8f08567 Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144566 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 20:35:52 +00:00
Chad Rosier
57b2997966 Add support for Thumb load/stores with negative offsets.
rdar://10412592



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144565 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 20:22:27 +00:00
Evan Cheng
2a4410df44 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144559 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 19:48:55 +00:00
Pete Cooper
a77214a4c4 Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered
Constant idx case is still done in tablegen but other cases are then expanded

Fixes <rdar://problem/10435460>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144557 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 19:38:42 +00:00
Jakob Stoklund Olesen
f054e19819 Fix early-clobber handling in shrinkToUses.
I broke this in r144515, it affected most ARM testers.

<rdar://problem/10441389>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144547 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 18:45:38 +00:00
Jakob Stoklund Olesen
4a9b615f3e Delete stale comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144542 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 18:03:05 +00:00
Chandler Carruth
2770c14185 Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortuately, the single-source GCC build is
good at this. The solution isn't very pretty, but its no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll colapse them into
a single implementation in a subsequent commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144526 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 08:50:16 +00:00
Chad Rosier
dc9205d9c2 Add support for ARM halfword load/stores and signed byte loads with negative
offsets.
rdar://10412592



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144518 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 04:09:28 +00:00
Chandler Carruth
b5856c83ff Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate, it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144516 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 00:00:35 +00:00
Chandler Carruth
df234353fb Rewrite #3 of machine block placement. This is based somewhat on the
second algorithm, but only loosely. It is more heavily based on the last
discussion I had with Andy. It continues to walk from the inner-most
loop outward, but there is a key difference. With this algorithm we
ensure that as we visit each loop, the entire loop is merged into
a single chain. At the end, the entire function is treated as a "loop",
and merged into a single chain. This chain forms the desired sequence of
blocks within the function. Switching to a single algorithm removes my
biggest problem with the previous approaches -- they had different
behavior depending on which system triggered the layout. Now there is
exactly one algorithm and one basis for the decision making.

The other key difference is how the chain is formed. This is based
heavily on the idea Andy mentioned of keeping a worklist of blocks that
are viable layout successors based on the CFG. Having this set allows us
to consistently select the best layout successor for each block. It is
expensive though.

The code here remains very rough. There is a lot that needs to be done
to clean up the code, and to make the runtime cost of this pass much
lower. Very much WIP, but this was a giant chunk of code and I'd rather
folks see it sooner than later. Everything remains behind a flag of
course.

I've added a couple of tests to exercise the issues that this iteration
was motivated by: loop structure preservation. I've also fixed one test
that was exhibiting the broken behavior of the previous version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144495 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 11:20:44 +00:00
Chad Rosier
9eb674880b The order in which the predicate is added differs between Thumb and ARM mode. Fix predicate when in ARM mode and restore SelectIntrinsicCall.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144494 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 09:44:21 +00:00
Chad Rosier
a517ab155b Temporarily disable SelectIntrinsicCall when in ARM mode. This is causing failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144492 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 05:14:43 +00:00
Chad Rosier
b29b950bf2 Add support for emitting both signed- and zero-extend loads. Fix
SimplifyAddress to handle either a 12-bit unsigned offset or the ARM +/-imm8
offsets (addressing mode 3).  This enables a load followed by an integer 
extend to be folded into a single load.

For example:
ldrb r1, [r0]       ldrb r1, [r0]
uxtb r2, r1     =>
mov  r3, r2         mov  r3, r1


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144488 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 02:23:59 +00:00
Jakob Stoklund Olesen
334575e79b Remove the -color-ss-with-regs option.
It was off by default.

The new register allocators don't have the problems that made it
necessary to reallocate registers during stack slot coloring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144481 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen
5d9b109181 Delete the 'standard' spiller with used the old spilling framework.
The current register allocators all use the inline spiller.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144477 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 23:29:02 +00:00
Jakob Stoklund Olesen
fe9dd87783 Remove histogram tests.
Counting the number of occurences of each opcode is not a useful test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144474 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:40 +00:00
Jakob Stoklund Olesen
56ad83d47c RAGreedy is better about hinting now.
Or maybe we are just getting lucky.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144473 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:37 +00:00
Jakob Stoklund Olesen
7f67091259 Linear scan is going away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144472 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:34 +00:00
Jakob Stoklund Olesen
2eda9458ea XFAIL test that depends on linear scan to remove dead code.
Filed PR11364 to track the problem.  Should the register allocator
eliminate dead code?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144471 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:30 +00:00
Jakob Stoklund Olesen
bf27b61593 Remove obsolete test.
This test was committed with a bugfix to RemoveCopyByCommutingDef, but
that optimization is no longer triggered by this test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144470 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:27 +00:00
Jakob Stoklund Olesen
55adef0c43 Remove obsolete test.
This test is for a very specific LocalRewriter bug.  LocalRewriter is
going away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144469 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:24 +00:00
Jakob Stoklund Olesen
bb2fdd63c6 Remove obsolete test.
I don't think this test does what is was supposed to do, and
LocalRewriter is going away anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144463 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 20:37:57 +00:00
Jakob Stoklund Olesen
d211e731aa Eliminate more linear scan tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144462 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 20:35:26 +00:00
Jakob Stoklund Olesen
7d7d569cbb Switch a couple -O0 tests to RABasic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144461 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 20:11:04 +00:00
Jakob Stoklund Olesen
097d277ef0 Switch a few tests off linearscan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144460 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 19:53:52 +00:00
Jakob Stoklund Olesen
4ee1aa7020 Delete old test of a VirtRegRewriter feature.
This test doesn't expose the issue with RAGreedy.

I filed PR11363 to track the missing InlineSpiller feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144459 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 19:53:48 +00:00
Jakob Stoklund Olesen
8658c51c1b Remove old test that doesn't make sense.
The test is checking that the output doesn't contains any 'mov '
strings. It does contain movl, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144458 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 19:53:45 +00:00
Craig Topper
7be5dfd1a1 Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144457 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 09:58:49 +00:00
Eli Friedman
501852423d Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144438 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 00:35:34 +00:00
Chad Rosier
11add26ec2 Add support in fast-isel for selecting memset/memcpy/memmove intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144426 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 23:31:03 +00:00
Chad Rosier
6d267449ac Loosen test by using REs. Approved by Devang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144425 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 23:25:38 +00:00
Andrew Trick
95bc85e4ee Preserve MachineMemOperands in ARMLoadStoreOptimizer.
Fixes PR8113.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144409 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 22:18:09 +00:00
Dan Bailey
96e6458903 allow non-device function calls in PTX when natively handling device-side printf
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144388 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 14:45:12 +00:00
Craig Topper
46154eb6fd Add lowering for AVX2 shift instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144380 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 07:39:23 +00:00
Chad Rosier
a07d3fc693 Add support for using immediates with select instructions.
rdar://10412592


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144376 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 06:20:39 +00:00
Eli Friedman
15f58c56e9 Make sure to expand SIGN_EXTEND_INREG for NEON vectors. PR11319, round 3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144361 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 03:16:38 +00:00
Chad Rosier
4e89d97e3a Add support for using MVN to materialize negative constants.
rdar://10412592

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144348 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 00:36:21 +00:00
Chad Rosier
16455ce1a4 When in ARM mode, LDRH/STRH require special handling of negative offsets.
For correctness, disable this for now.
rdar://10418009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144316 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 21:09:49 +00:00
NAKAMURA Takumi
bd165eac9d test/CodeGen/X86/lsr-loop-exit-cond.ll: Try to appease linux and freebsd bots to specify explicit -mtriple=x86_64-darwin.
I guess it expects -relocation-model=pic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144290 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 14:18:59 +00:00
Evan Cheng
623a7e146b Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
instruction lower optimization" in the pre-RA scheduler.

The optimization, rather the hack, was done before MI use-list was available.
Now we should be able to implement it in a better way, perhaps in the
two-address pass until a MI scheduler is available.

Now that the scheduler has to backtrack to handle call sequences. Adding
artificial scheduling constraints is just not safe. Furthermore, the hack
is not taking all the other scheduling decisions into consideration so it's just
as likely to pessimize code. So I view disabling this optimization goodness
regardless of PR11314.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144267 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 07:43:16 +00:00
Chad Rosier
6cba97c555 For immediate encodings of icmp, zero or sign extend first. Then
determine if the value is negative and flip the sign accordingly.
rdar://10422026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144258 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 01:30:39 +00:00
Jakob Stoklund Olesen
17afb06648 Strip old implicit operands after foldMemoryOperand.
The TII.foldMemoryOperand hook preserves implicit operands from the
original instruction.  This is not what we want when those implicit
operands refer to the register being spilled.

Implicit operands referring to other registers are preserved.

This fixes PR11347.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144247 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 00:17:03 +00:00
Eli Friedman
14e809c872 Make sure we correctly unroll conversions between v2f64 and v2i32 on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144241 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 23:36:02 +00:00
Eli Friedman
0948f0acca Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144216 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 22:25:12 +00:00
Nadav Rotem
c6c7e85a71 AVX2: Add patterns for variable shift operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144212 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 21:22:13 +00:00
Chad Rosier
a7a996b98d Use REs to remove dependencies on the register allocation order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144209 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 20:06:13 +00:00
Duncan Sands
ef0b3ca3a8 Speculatively revert commit 144124 (djg) in the hope that the 32 bit
dragonegg self-host buildbot will recover (it is complaining about object
files differing between different build stages).  Original commit message:

Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144188 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 14:20:48 +00:00
Nadav Rotem
bb539bf973 Add AVX2 support for vselect of v32i8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144187 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 13:21:28 +00:00
Craig Topper
b80ada98c5 Enable execution dependency fix pass for YMM registers when AVX2 is enabled. Add AVX2 logical operations to list of replaceable instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 09:37:21 +00:00
Craig Topper
0a15035f52 Add instruction selection for AVX2 integer comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144176 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 08:06:13 +00:00
Craig Topper
aaa643c70e Add AVX2 instruction lowering for add, sub, and mul.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144174 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 07:28:55 +00:00
Chad Rosier
2f2fe417f9 Add support for encoding immediates in icmp and fcmp. Hopefully, this will
remove a fair number of unnecessary materialized constants.
rdar://10412592


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144163 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 03:22:02 +00:00
Jakob Stoklund Olesen
f4c4768fb2 Collapse DomainValues across loop back-edges.
During the initial RPO traversal of the basic blocks, remember the ones
that are incomplete because of back-edges from predecessors that haven't
been visited yet.

After the initial RPO, revisit all those loop headers so the incoming
DomainValues on the back-edges can be properly collapsed.

This will properly fix execution domains on software pipelined code,
like the included test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144151 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 01:06:56 +00:00
Dan Gohman
9cae2d2225 Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144124 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 21:29:06 +00:00
Evan Cheng
3568a1051e Add workaround for Cortex-M3 errata 602117 by replacing ldrd x, y, [x] with ldm or ldr pairs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144123 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 21:21:09 +00:00
Pete Cooper
d9eb920aa4 Adding test for machine-licm operating on invariant load instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144104 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 19:06:53 +00:00
Lang Hames
5207bf2177 Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported.
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144102 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 18:56:23 +00:00
NAKAMURA Takumi
a422294ab1 test/CodeGen/X86/vec_shuffle-39.ll: Add explicit -mtriple=x86_64-linux. Passing packed value is not compatible on Win32 x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144068 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 03:46:39 +00:00
NAKAMURA Takumi
916d6441e1 test/CodeGen/X86/vec_shuffle-38.ll: Relax expression for Win32 x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144067 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 03:46:32 +00:00
NAKAMURA Takumi
5fb870d861 test/CodeGen/X86/vec_shuffle.ll: Add explicit -mtriple=i686-linux. We may see some suboptimal frame (%ebp) emission on certain hosts. Possible [PR11031]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144066 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 03:46:25 +00:00
Eli Friedman
9f1f26aefa Make sure to mark vector extload's as expand on ARM. Fixes PR11319.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144057 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 01:43:53 +00:00
Eli Friedman
2efa35f779 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144055 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 01:25:24 +00:00
Evan Cheng
7bc389b6b0 Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144052 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:31:58 +00:00
Bill Wendling
8b7d76990c Convert to the new EH model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144049 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:17:28 +00:00
Bill Wendling
30ceba32b2 Convert tests to the new EH model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:09:27 +00:00
Chad Rosier
0eff39f2e2 Enable support for returning i1, i8, and i16. Nothing special todo as it's the
callee's responsibility to sign or zero-extend the return value.  The additional
test case just checks to make sure the calls are selected (i.e., -fast-isel-abort
doesn't assert).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144047 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:03:32 +00:00
Pete Cooper
02e5fb0f58 Added missing newline
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144046 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:03:24 +00:00
Eli Friedman
58dd0fec4d Revert r144034 while I try to track down a crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144044 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:53:20 +00:00
Jakob Stoklund Olesen
61f46de349 Fix test for Windows as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144038 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:10:43 +00:00
Jakob Stoklund Olesen
b26c7727c9 Kill and collapse outstanding DomainValues.
DomainValues that are only used by "don't care" instructions are now
collapsed to the first possible execution domain after all basic blocks
have been processed.  This typically means the PS domain on x86.

For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are
completely collapsed to the PS domain instead of containing a mix of
execution domains created by isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144037 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:08:21 +00:00
Pete Cooper
a29fc806fe InstCombine now optimizes vector udiv by power of 2 to shifts
Fixes r8429


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144036 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:04:49 +00:00
Eli Friedman
1b4f6f2532 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144034 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 22:51:10 +00:00
Benjamin Kramer
70be28a5ad Simplify some uses of utohexstr.
As a side effect hex is printed lowercase instead of uppercase now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144013 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 21:00:59 +00:00
Jakob Stoklund Olesen
32dd4eb204 Fix test for Linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144003 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 20:47:23 +00:00
Jakob Stoklund Olesen
3e5d5c53a0 Expand V_SET0 to xorps by default.
The xorps instruction is smaller than pxor, so prefer that encoding.

The ExecutionDepsFix pass will switch the encoding to pxor and xorpd
when appropriate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143996 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 19:15:58 +00:00
Craig Topper
4c763ee613 Add AVX2 variable shift instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143915 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 08:26:24 +00:00
Craig Topper
28692044db Add AVX2 VPMOVMASK instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143904 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 03:20:35 +00:00
Craig Topper
69f5df7778 Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143902 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 02:00:04 +00:00
Craig Topper
c8eb880a7f More AVX2 instructions and their intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143895 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-06 23:04:08 +00:00
Craig Topper
27e5d0c72a Add more AVX2 instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143861 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-06 06:12:20 +00:00
Chad Rosier
42536af5ce Add support for passing i1, i8, and i16 call parameters. Also, be sure to
zero-extend the constant integer encoding.  Test case provides testing for
both call parameters and materialization of i1, i8, and i16 types.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143821 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-05 20:16:15 +00:00
Benjamin Kramer
c25c908977 Add an option to pad an uleb128 to MCObjectWriter and remove the uleb128 encoding from the DWARF asm printer.
As a side effect we now print dwarf ulebs with .ascii directives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143809 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-05 11:52:44 +00:00
Eli Friedman
bd00a934c6 Enhanced vzeroupper insertion pass that avoids inserting vzeroupper where it is unnecessary through local analysis. Patch from Bruno Cardoso Lopes, with some additional changes.
I'm going to wait for any review comments and perform some additional testing before turning this on by default.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143750 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-04 23:46:11 +00:00
Craig Topper
517497cce0 Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143682 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-04 06:59:21 +00:00
Chad Rosier
f470cbbad2 Add fast-isel support for returning i1, i8, and i16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143669 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-04 00:50:21 +00:00
Dan Gohman
65fd6564b8 Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143660 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-03 21:49:52 +00:00
Pete Cooper
71fccadbed Reverted r143600 - selector reference change
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143646 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-03 20:47:50 +00:00