Commit Graph

6342 Commits

Author SHA1 Message Date
Nadav Rotem
c8d12eee12 On AVX, we can load v8i32 at a time. The bug happens when two uneven loads are used.
When we load the v12i32 type, the GenWidenVectorLoads method generates two loads: v8i32 and v4i32 
and attempts to use CONCAT_VECTORS to join them. In this fix I concat undef values to widen 
the smaller value. The test "widen_load-2.ll" also exposes this bug on AVX.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147964 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 20:19:17 +00:00
Bill Wendling
3bf052b76c Check to make sure that the CFString's back store ends up in the correct section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 19:33:37 +00:00
Rafael Espindola
2028b793e1 Support segmented stacks on mac.
This uses TLS slot 90, which actually belongs to JavaScriptCore. We only support
frames with static size
Patch by Brian Anderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 19:00:37 +00:00
Rafael Espindola
7692ce9e81 Split segmented stacks tests into tests for static- and dynamic-size frames.
Patch by Brian Anderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 18:51:03 +00:00
Rafael Espindola
25cd4ff97e Generate the segmented stack prologue for fastcc too.
Patch by Brian Anderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147958 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 18:41:19 +00:00
Chandler Carruth
11f0e7b158 Revert r147945 which disabled an addressing mode transformation. I had
hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't so it doesn't appear that my transform is the culprit.

If anyone else is seeing failures, please let me know!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147957 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 18:36:12 +00:00
Rafael Espindola
313c703831 Use unsigned comparison in segmented stack prologue.
This is a comparison of two addresses, and GCC does the comparison unsigned.

Patch by Brian Anderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 18:23:35 +00:00
Rafael Espindola
014f7a3b37 Explicitly set the scale to 1 on some segstack prologue instrs.
Patch by Brian Anderson.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147952 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 18:14:03 +00:00
Jan Sjödin
46df3adb4e Add XOP Intrinsics and tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147949 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 15:20:20 +00:00
Nadav Rotem
394a1f53b9 Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not zero untouched elements. Use INSERT_VECTOR_ELT instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147948 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 14:07:51 +00:00
Chandler Carruth
e4bc80a14b Disable the transformation I added in r147936 to see if it fixes some
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147945 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 12:17:47 +00:00
Jakob Stoklund Olesen
dec1f99615 Fix undefined code and reenable test case.
I don't think the compact encoding code is right, but at least is has
defined behavior now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147938 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 09:08:04 +00:00
Chandler Carruth
f103b3d1b9 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147936 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 08:41:08 +00:00
NAKAMURA Takumi
69b5df8bf6 llvm/test/CodeGen/X86/zext-fold.ll: Relax an expression in stack offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147928 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 07:34:22 +00:00
NAKAMURA Takumi
29cc410e6d llvm/test/CodeGen/X86/sub-with-overflow.ll: Add explicit -mtriple=i686-linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147927 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 07:34:14 +00:00
Andrew Trick
08c66642d7 ARM Ld/St Optimizer fix.
Allow LDRD to be formed from pairs with different LDR encodings. This was the original intention of the pass. Somewhere along the way, the LDR opcodes were refined which broke the optimization. We really don't care what the original opcodes are as long as they both map to the same LDRD and the immediate still fits.

Fixes rdar://10435045 ARMLoadStoreOptimization cannot handle mixed LDRi8/LDRi12


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147922 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 03:56:08 +00:00
Jakob Stoklund Olesen
bc9beda433 Disable test that seems to expose an unrelated Linux issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147921 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 03:42:27 +00:00
Jakob Stoklund Olesen
2aad2f6e60 Detect when a value is undefined on an edge to a landing pad.
Consider this code:

int h() {
  int x;
  try {
    x = f();
    g();
  } catch (...) {
    return x+1;
  }
  return x;
}

The variable x is undefined on the first edge to the landing pad, but it
has the f() return value on the second edge to the landing pad.

SplitAnalysis::getLastSplitPoint() would assume that the return value
from f() was live into the landing pad when f() throws, which is of
course impossible.

Detect these cases, and treat them as if the landing pad wasn't there.
This allows spill code to be inserted after the function call to f().

<rdar://problem/10664933>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-11 02:07:05 +00:00
Chad Rosier
eab30279f7 Add test case for r147881.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147891 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 23:09:53 +00:00
Joerg Sonnenberger
216f63702f Default stack alignment for 32bit x86 should be 4 Bytes, not 8 Bytes.
Add a test that checks the stack alignment of a simple function for
Darwin, Linux and NetBSD for 32bit and 64bit mode.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147888 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 22:43:53 +00:00
Jakob Stoklund Olesen
19d0bf3a92 Consider unknown alignment caused by OptimizeThumb2Instructions().
This function runs after all constant islands have been placed, and may
shrink some instructions to their 2-byte forms.  This can actually cause
some constant pool entries to move out of range because of growing
alignment padding.

Treat instructions that may be shrunk the same as inline asm - they
erode the known alignment bits.

Also reinstate an old assertion in verify(). It is correct now that
basic block offsets include alignments.

Add a single large test case that will hopefully exercise many parts of
the constant island pass.

<rdar://problem/10670199>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147885 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 22:32:14 +00:00
Jim Grosbach
f1f16c832f ARM updating VST2 pseudo-lowering fixed vs. register update.
rdar://10663487

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147876 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 21:11:12 +00:00
Nadav Rotem
6c0366cb25 Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147851 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 14:28:46 +00:00
Craig Topper
a937633893 Fix a crash in AVX2 when trying to broadcast a double into a 128-bit vector. There is no vbroadcastsd xmm, but we do need to support 64-bit integers broadcasted into xmm. Also factor the AVX check into the isVectorBroadcast function. This makes more sense since the AVX2 check was already inside.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147844 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 08:23:59 +00:00
Evan Cheng
97b5beb7fe Allow machine-cse to look across MBB boundary when cse'ing instructions that
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.

rdar://10660865


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147827 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-10 02:02:58 +00:00
Chandler Carruth
a7cb699251 Cleanup and FileCheck-ize a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147772 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-09 09:44:26 +00:00
Craig Topper
5feb5dae93 Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147767 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-09 06:52:46 +00:00
Rafael Espindola
1fe9737eb4 Don't print an unused label before .cfi_endproc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147763 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-09 00:17:29 +00:00
Craig Topper
39f227e4dd Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147762 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-09 00:11:29 +00:00
Victor Umansky
435d0bd09d Reverted commit #147601 upon Evan's request.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-08 17:20:33 +00:00
Rafael Espindola
547be2699c Don't print a label before .cfi_startproc when we don't need to. This makes
the produce assembly when using CFI just a bit more readable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-07 22:42:19 +00:00
Jakob Stoklund Olesen
4964ba01f9 Use getRegForValue() to materialize the address of ARM globals.
This enables basic local CSE, giving us 20% smaller code for
consumer-typeset in -O0 builds.

<rdar://problem/10658692>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147720 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-07 04:07:22 +00:00
Evan Cheng
977679d603 Added a late machine instruction copy propagation pass. This catches
opportunities that only present themselves after late optimizations
such as tail duplication .e.g.
## BB#1:
        movl    %eax, %ecx
        movl    %ecx, %eax
        ret

The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)

This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They were before branch folding before because it did
not always update block livein's. That's fixed now. The pass change makes
independently since we want to properly schedule instructions after
branch folding / tail duplication.

rdar://10428165
rdar://10640363



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147716 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-07 03:02:36 +00:00
Jakob Stoklund Olesen
45ca7c6336 Use movw+movt in ARMFastISel::ARMMaterializeGV.
This eliminates a lot of constant pool entries for -O0 builds of code
with many global variable accesses.

This speeds up -O0 codegen of consumer-typeset by 2x because the
constant island pass no longer has to look at thousands of constant pool
entries.

<rdar://problem/10629774>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147712 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-07 01:47:05 +00:00
Eric Christopher
5548755201 Make the 'x' constraint work for AVX registers as well.
Fixes rdar://10614894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-07 01:02:09 +00:00
Jakob Stoklund Olesen
bad1e6b8e0 Enable aligned NEON spilling by default.
Experiments show this to be a small speedup for modern ARM cores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147689 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-06 22:19:37 +00:00
Chandler Carruth
62dfc51152 Prevent a DAGCombine from firing where there are two uses of
a combined-away node and the result of the combine isn't substantially
smaller than the input, it's just canonicalized. This is the first part
of a significant (7%) performance gain for Snappy's hot decompression
loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147604 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-05 11:05:55 +00:00
Chandler Carruth
1e141a854e Cleanup and FileCheck-ize a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147603 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-05 11:05:47 +00:00
Victor Umansky
19d8559019 Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.
Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147601 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-05 08:46:19 +00:00
Benjamin Kramer
44aac553f6 FileCheck hygiene.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147580 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-05 00:43:34 +00:00
Jakob Stoklund Olesen
7255a4e133 Reapply r146997, "Heed spill slot alignment on ARM."
Now that canRealignStack() understands frozen reserved registers, it is
safe to use it for aligned spill instructions.

It will only return true if the registers reserved at the beginning of
register allocation allow for dynamic stack realignment.

<rdar://problem/10625436>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147579 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-05 00:26:57 +00:00
NAKAMURA Takumi
7d34b92993 test/CodeGen/X86/jump_sign.ll: Add -mcpu=pentiumpro for non-x86 hosts. It uses "cmov".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147521 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-04 03:52:23 +00:00
Akira Hatanaka
cb9dd72fdc Have getRegForInlineAsmConstraint return the correct register class when target
is Mips64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-04 02:45:01 +00:00
Evan Cheng
afad0fe59a Fix more places which should be checking for iOS, not darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147513 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-04 01:55:04 +00:00
Evan Cheng
56f582d664 For x86, canonicalize max
(x > y) ? x : y
=>
(x >= y) ? x : y

So for something like
(x - y) > 0 : (x - y) ? 0
It will be
(x - y) >= 0 : (x - y) ? 0

This makes is possible to test sign-bit and eliminate a comparison against
zero. e.g.
subl   %esi, %edi
testl  %edi, %edi
movl   $0, %eax
cmovgl %edi, %eax
=>
xorl   %eax, %eax
subl   %esi, $edi
cmovsl %eax, %edi

rdar://10633221


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147512 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-04 01:41:39 +00:00
Jakob Stoklund Olesen
6d5b7cc235 Revert r146997, "Heed spill slot alignment on ARM."
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.

<rdar://problem/10625436>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147487 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-03 22:34:35 +00:00
Nadav Rotem
c2d064f028 Revert 147426 because it caused pr11696.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147485 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-03 22:19:42 +00:00
Nadav Rotem
316477dd54 Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147484 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-03 22:12:28 +00:00
Chad Rosier
3d1161e9ae Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147481 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-03 21:05:52 +00:00
Elena Demikhovsky
ce58a03587 Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147445 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-03 11:59:04 +00:00