Commit Graph

3836 Commits

Author SHA1 Message Date
Chad Rosier
ef68cfa8b8 Formatting. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163627 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 16:33:10 +00:00
Nadav Rotem
8754bbbe67 Stack Coloring: Don't crash on dbg values which use stack frames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 12:34:27 +00:00
NAKAMURA Takumi
985dcfc351 test/CodeGen/X86/ms-inline-asm.ll: Relax for non-darwin x86 targets. '##InlineAsm' could not be seen in other hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163554 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 22:04:54 +00:00
Chad Rosier
24f5fddbdf [ms-inline asm] Properly emit the asm directives when the AsmPrinterVariant
and InlineAsmVariant don't match.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163550 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:36:05 +00:00
Chad Rosier
5c3dcb7bc0 Update test case for Release builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163549 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:31:43 +00:00
Chad Rosier
3b132fab0b [ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function
and update the printOperand() function accordingly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:10:49 +00:00
Nadav Rotem
6165dba25f Stack Coloring: Handle the case where END markers come before BEGIN markers properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163530 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:51:09 +00:00
Michael Liao
b8150d8523 Enhance PR11334 fix to support extload from v2f32/v4f32
- Fix a remaining issue of PR11674 as well



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163528 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:33:51 +00:00
Michael Liao
7fdc66bf73 Add boolean simplification support from CMOV
- If a boolean value is generated from CMOV and tested as a boolean value,
  simplify the use of the test result by referencing the original condition.
  The RDRAND intrinsic is one such case.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 16:36:16 +00:00
Nadav Rotem
e47feeb823 Stack Coloring: Add support for multiple regions of the same slot, within a single basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163507 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 12:39:35 +00:00
Elena Demikhovsky
8100d244ff The VPSHUFB 256-bit instruction may be generated when one of the input vectors is undefined or zeroinitializer.
This patch adds the "zeroinitializer" case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163506 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 12:13:11 +00:00
Nadav Rotem
9a2ae00c85 Teach the DAGBuilder about lifetime markers which are generated from PHINodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163494 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 08:43:23 +00:00
Craig Topper
956342b210 Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-09 22:58:45 +00:00
Craig Topper
12fb5c667f Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 17:42:27 +00:00
Craig Topper
4362067d7c Add support for lowering FABS of vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163461 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 07:31:51 +00:00
Jakob Stoklund Olesen
45c5c57179 Allow overlaps between virtreg and physreg live ranges.
The RegisterCoalescer understands overlapping live ranges where one
register is defined as a copy of the other. With this change, register
allocators using LiveRegMatrix can do the same, at least for copies
between physical and virtual registers.

When a physreg is defined by a copy from a virtreg, allow those live
ranges to overlap:

  %CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11
  %vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill>

We can assign %vreg11 to %ECX, overlapping the live range of %CL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 18:15:23 +00:00
Nadav Rotem
79cb162e5d Disable stack coloring by default in order to resolve the i386 failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:27:06 +00:00
Elena Demikhovsky
4178946afb AVX2 optimization.
Added generation of the VPSHUFB instruction for <32 x i8> vector shuffle when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 12:42:01 +00:00
Nadav Rotem
a76d7d64a4 Fix the test by specifying an exact cpu model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 10:33:33 +00:00
Nadav Rotem
c05d30601c Add a new optimization pass: Stack Coloring, that merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).
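
The merging idea described above can be sketched as interval coloring. This is a simplified illustration, not the pass itself: `color_stack_slots` and the `(begin, end)` lifetime tuples are hypothetical names, each alloca is assumed to have one contiguous lifetime, and sizes and alignment are ignored.

```python
def color_stack_slots(lifetimes):
    """Assign allocas to shared slots when their lifetimes are disjoint.

    lifetimes maps an alloca name to a half-open interval (begin, end),
    standing in for the region between its lifetime start and end markers.
    Returns a dict mapping each alloca name to a slot index.
    """
    slots = []       # slots[i] is the list of (begin, end) already placed in slot i
    assignment = {}
    # Visit allocas in order of lifetime start so the greedy choice is stable.
    for name, (begin, end) in sorted(lifetimes.items(), key=lambda kv: kv[1]):
        for index, occupied in enumerate(slots):
            # Two half-open intervals are disjoint if one ends before the other begins.
            if all(end <= b or e <= begin for b, e in occupied):
                occupied.append((begin, end))
                assignment[name] = index
                break
        else:
            # No existing slot is free for the whole lifetime: open a new one.
            slots.append([(begin, end)])
            assignment[name] = len(slots) - 1
    return assignment


if __name__ == "__main__":
    # "a" and "b" never live at the same time, so they can share a slot;
    # "c" overlaps both and needs its own.
    print(color_stack_slots({"a": (0, 4), "b": (5, 9), "c": (2, 7)}))
```
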



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:17:37 +00:00
Craig Topper
07149fe715 Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 05:15:01 +00:00
Preston Gurd
2e2efd9600 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
either the quotient, remainder, or both.
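
The bypass described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the CodeGenPrepare code: `bypass_divrem`, `div8` and `trunc_divmod` are hypothetical names, and the fast-path test here checks both operands, which is an assumption about how "the value" is tested.

```python
def trunc_divmod(a, b):
    """Full-width signed divide that truncates toward zero, as IDIV does."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b


def div8(a, b):
    """Stand-in for the cheap 8-bit unsigned divide (DIVB)."""
    assert 0 <= a < 256 and 0 <= b < 256
    return a // b, a % b


def bypass_divrem(a, b):
    """Return (quotient, remainder, path) using the cheap divide when possible.

    Quotient and remainder are computed together on whichever path is taken,
    so a later DIV or REM with the same operands can reuse the result.
    """
    if 0 <= a < 256 and 0 <= b < 256:
        return div8(a, b) + ("fast",)
    return trunc_divmod(a, b) + ("slow",)


if __name__ == "__main__":
    print(bypass_divrem(200, 7))   # small non-negative operands take the fast path
    print(bypass_divrem(-7, 2))    # a negative operand takes the slow path
```
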

Patch by Tyler Nowicki!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:22:17 +00:00
Elena Demikhovsky
3251020738 This patch optimizes a shuffle instruction: it generates 2 instructions instead of 4.
Since this specific shuffle is widely used in many workloads, we see a ~10% performance gain on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 12:49:02 +00:00
Pete Cooper
0fc44aba18 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 17:37:55 +00:00
NAKAMURA Takumi
5cf8bac4cc llvm/test/CodeGen/X86/fp-fast.ll: Suppress FMA4 on AMD Bulldozer host, corresponding to r162999.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163041 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:26:28 +00:00
Manman Ren
c11b7193a7 Fix Atom bots for r163036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163040 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:17:06 +00:00
Manman Ren
2b7a2e8833 SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its
output chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://11457792


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163036 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 23:16:57 +00:00
Craig Topper
dfb1e4babd Mark FMA4 instructions as commutable and add them to the folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163035 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 23:10:34 +00:00
Michael Liao
265bcb1e5b Fix PR12359
- In addition to undefined, if V2 is a zero vector, skip the 2nd PSHUFB and POR
  as well, since PSHUFB zeroes elements with negative indices.
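
Why one PSHUFB is enough when V2 is zero can be sketched with a byte-level model of the instruction. The helper names (`pshufb`, `shuffle_two_inputs`, `control_for_zero_v2`) are hypothetical and the model covers only the 128-bit form.

```python
def pshufb(src, control):
    """Model of 128-bit PSHUFB: a control byte with its high bit set yields 0,
    otherwise its low four bits select a byte of src."""
    assert len(src) == 16 and len(control) == 16
    return [0 if c & 0x80 else src[c & 0x0F] for c in control]


def shuffle_two_inputs(v1, v2, mask):
    """Reference two-input shuffle: indices 0..15 pick from v1, 16..31 from v2."""
    both = v1 + v2
    return [both[m] for m in mask]


def control_for_zero_v2(mask):
    """Build a PSHUFB control for v1 alone, assuming v2 is all zeros.

    Lanes that would read v2 get 0x80 (a negative index as a signed byte),
    so PSHUFB writes the zero directly and no second PSHUFB or POR is needed.
    """
    return [m if m < 16 else 0x80 for m in mask]


if __name__ == "__main__":
    v1 = list(range(1, 17))
    mask = [0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23]
    print(pshufb(v1, control_for_zero_v2(mask)))
```
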

  Patch by Sriram Murali <sriram.murali@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 20:12:31 +00:00
Craig Topper
cb0848696d Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 16:31:13 +00:00
Craig Topper
bf4043768c Add support for converting llvm.fma to fma4 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162999 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 15:40:30 +00:00
Jakob Stoklund Olesen
908c0c01f6 Don't enforce ordered inline asm operands.
I was too optimistic, inline asm can have tied operands that don't
follow the def order.

Fixes PR13742.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162998 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 15:34:59 +00:00
NAKAMURA Takumi
2a1b0e7864 llvm/test/CodeGen/X86/vec_select.ll: Fix failure on xmm-less hosts by adding -mattr=+sse2.
FIXME: Should this be tested with both +avx and -avx,+sse2?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162983 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 10:02:22 +00:00
Pete Cooper
5dd9e214fb Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 23:58:52 +00:00
Owen Anderson
9e3b6dfc2f Try to make this test more generic to unbreak buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162958 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 23:51:20 +00:00
Owen Anderson
43da6c7f13 Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
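
The rounding caveat can be illustrated with ordinary double-precision arithmetic. This is a sketch with hypothetical helper names (`fadd_chain`, `fmul_const`), not the DAG combine itself: the chain rounds after every addition, while the multiply rounds once.

```python
def fadd_chain(x, n):
    """Evaluate x + x + ... + x (n terms) left to right, rounding at each step."""
    acc = x
    for _ in range(n - 1):
        acc = acc + x
    return acc


def fmul_const(x, n):
    """Evaluate the rewritten form n * x, which rounds only once."""
    return float(n) * x


if __name__ == "__main__":
    # For 0.1 and ten terms the two forms disagree in the last bit,
    # which is why the rewrite is restricted to unsafe FP math.
    print(fadd_chain(0.1, 10), fmul_const(0.1, 10))
    # For values and constants that stay exact, the two forms agree.
    print(fadd_chain(0.5, 4), fmul_const(0.5, 4))
```
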
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 23:35:16 +00:00
Michael Liao
a03c44117b Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX is
  enabled.

  Because of the penalty for intermixing SSE and AVX instructions, we need
  to prevent SSE legacy insns from being generated unless explicitly
  requested through some intrinsics. For patterns supported by both
  SSE and AVX, so far we have forced AVX insns to be tried first by relying
  on AddedComplexity or position in the td file. That is error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from the SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to its complicated pattern, so fast-isel
   falls back to materializing it from the constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from the SSE one after fixing SSE/AVX
   intermixing. Exec-domain fixing prefers 'vmovapd' over
   'vmovaps'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 16:54:46 +00:00
Michael Liao
b6efbd2145 Should put test case under test/ExecutionEngine/MCJIT/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162885 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 00:43:57 +00:00
Michael Liao
faa1159a69 Fix PR13727
- The root cause is that target constant materialization in X86 fast-isel
  creates a PC-rel addressing which may overflow 32-bit range in non-Small code
  model if .rodata section is allocated too far away from code segment in
  MCJIT, which uses Large code model so far.
- Follow the similar logic to fix non-Small code model in fast-isel by skipping
  non-Small code model.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162881 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 00:30:16 +00:00
Bill Wendling
eeba6e8317 The commutative flag is already correctly set within the multiclass. If we set
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162741 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 07:36:46 +00:00
Craig Topper
13897fb263 Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162738 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 07:05:28 +00:00
NAKAMURA Takumi
2f820a5d64 llvm/test/CodeGen/X86/pr12312.ll: Add -mtriple=x86_64-unknown-unknown.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162736 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 04:04:29 +00:00
Michael Liao
dbf8b5be97 Fix PR12312
- Add a target-specific DAG optimization to recognize a PTEST-able pattern.
  Such a pattern is an OR'd tree with X86ISD::OR as the root node. When
  X86ISD::OR node has only its flag result being used as a boolean value and
  all its leaves are extracted from the same vector, it could be folded into an
  X86ISD::PTEST node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162735 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 03:34:40 +00:00
NAKAMURA Takumi
1ba64859f5 llvm/test/CodeGen/X86/fma.ll: Add -march=x86, or two tests would fail on non-x86 hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162667 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 11:50:26 +00:00
NAKAMURA Takumi
fdc35405cd llvm/test/CodeGen/X86/fma_patterns.ll: Add -mtriple=x86_64. It was incompatible on i686 and Windows x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162664 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 09:37:54 +00:00
Craig Topper
43e4c62c43 Commit test change for r162658.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162660 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 07:55:50 +00:00
Anitha Boyapati
0aa63fcbcb FMA3 tests on bdver2 target for changes made in rev 162012. Also made
corresponding changes to existing tests for darwin triple to ensure that
same pattern is tested for bdver2 target.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162655 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 06:59:01 +00:00
Craig Topper
e10fa862f8 Make sure that FMA3 is favored even when FMA4 is also enabled. Test case for r162454.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162653 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 03:38:15 +00:00
Michael Liao
24438b8359 fix a case where all operands of BUILD_VECTOR are undefined
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162214 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-20 17:59:18 +00:00
Nadav Rotem
d60cb11afd When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0
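
Why the operands cannot be swapped without unsafe math can be sketched with a scalar model of MINSD, which returns its source operand whenever the comparison fails, including when either input is NaN. The function name `minsd` and its argument order (destination first, then source) are choices made for this illustration.

```python
import math


def minsd(dst, src):
    """Model of x86 MINSD: keep dst only if dst < src, otherwise return src.

    Any comparison involving NaN is false, so a NaN in either operand
    makes the result equal to src.
    """
    return dst if dst < src else src


if __name__ == "__main__":
    # With ordinary numbers the operand order does not matter.
    print(minsd(2.0, 3.0), minsd(3.0, 2.0))
    # With a NaN the result depends on which operand is the source,
    # so swapping operands is only valid when NaNs can be ignored.
    print(minsd(math.nan, 1.0), minsd(1.0, math.nan))
```
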




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162187 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-19 13:06:16 +00:00