Commit Graph

7160 Commits

Author SHA1 Message Date
Nadav Rotem
0cd19b9301 Stack Coloring: We have code that checks that all of the uses of allocas
are within the lifetime zone. Sometime legitimate usages of allocas are
hoisted outside of the lifetime zone. For example, GEPS may calculate the
address of a member of an allocated struct. This commit makes sure that
we only check (abort regions or assert) for instructions that read and write
memory using stack frames directly. Notice that by allowing legitimate
usages outside the lifetime zone we also stop checking for instructions
which use derivatives of allocas. We will catch less bugs in user code
and in the compiler itself.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 12:38:37 +00:00
Jakob Stoklund Olesen
aa0cfea9a4 Don't fold indexed loads into TCRETURNmi64.
We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 00:25:00 +00:00
Michael Liao
6c7ccaa3fd Fix PR11985
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 21:43:09 +00:00
Roman Divacky
9d760ae5c6 This patch corrects logic in PPCFrameLowering for save and restore of
nonvolatile condition register fields across calls under the SVR4 ABIs.                                            
                                                                                                                   
 * With the 64-bit ABI, the save location is at a fixed offset of 8 from                                           
the stack pointer.  The frame pointer cannot be used to access this                                                
portion of the stack frame since the distance from the frame pointer may                                           
change with alloca calls.                                                                                          
                                                                                                                   
 * With the 32-bit ABI, the save location is just below the general
register save area, and is accessed via the frame pointer like the rest
of the save areas.  This is an optional slot, so it must only be created                                           
if any of CR2, CR3, and CR4 were modified.                                                                      
                                                                                                                   
 * For both ABIs, save/restore logic is generated only if one of the     
nonvolatile CR fields were modified.                                   

I also took this opportunity to clean up an extra FIXME in
PPCFrameLowering.h.  Save area offsets for 32-bit GPRs are meaningless
for the 64-bit ABI, so I removed them for correctness and efficiency.


Fixes PR13708 and partially also PR13623. It lets us enable exception handling
on PPC64.

Patch by William J. Schmidt!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163713 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 14:47:47 +00:00
Kristof Beyls
789efbad2a Fix constant folding through bitcasts by no longer relying on undefined behaviour (converting NaN values between float and double).
SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget);
should not be used when Val is not a simple constant (as the comment in
SelectionDAG.h indicates). This patch avoids using this function
when folding an unknown constant through a bitcast, where it cannot be
guaranteed that Val will be a simple constant.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 11:25:02 +00:00
Nadav Rotem
0a16da4457 Stack coloring: remove lifetime intervals which contain escaped allocas.
The input program may contain intructions which are not inside lifetime
markers. This can happen due to a bug in the compiler or due to a bug in
user code (for example, returning a reference to a local variable).
This commit adds checks that all of the instructions in the function and
invalidates lifetime ranges which do not contain all of the instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163678 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 04:57:37 +00:00
Chad Rosier
a7b159ccdd [ms-inline asm] Split the parsing of IR asm strings into GCC and MS variants.
Add support in the EmitMSInlineAsmStr() function for handling integer consts.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163645 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 19:09:56 +00:00
Chad Rosier
ef68cfa8b8 Formatting. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163627 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 16:33:10 +00:00
Nadav Rotem
8754bbbe67 Stack Coloring: Dont crash on dbg values which use stack frames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 12:34:27 +00:00
NAKAMURA Takumi
985dcfc351 test/CodeGen/X86/ms-inline-asm.ll: Relax for non-darwin x86 targets. '##InlineAsm' could not be seen in other hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163554 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 22:04:54 +00:00
Chad Rosier
24f5fddbdf [ms-inline asm] Properly emit the asm directives when the AsmPrinterVariant
and InlineAsmVariant don't match.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163550 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:36:05 +00:00
Chad Rosier
5c3dcb7bc0 Update test case for Release builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163549 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:31:43 +00:00
Chad Rosier
3b132fab0b [ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function
and update the printOperand() function accordingly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:10:49 +00:00
Jakob Stoklund Olesen
519daf5d2d Don't attempt to use flags from predicated instructions.
The ARM backend can eliminate cmp instructions by reusing flags from a
nearby sub instruction with similar arguments.

Don't do that if the sub is predicated - the flags are not written
unconditionally.

<rdar://problem/12263428>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163535 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 19:17:25 +00:00
Nadav Rotem
6165dba25f Stack Coloring: Handle the case where END markers come before BEGIN markers properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163530 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:51:09 +00:00
Michael Liao
b8150d8523 Enhance PR11334 fix to support extload from v2f32/v4f32
- Fix an remaining issue of PR11674 as well



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163528 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:33:51 +00:00
Michael Liao
7fdc66bf73 Add boolean simplification support from CMOV
- If a boolean value is generated from CMOV and tested as boolean value,
  simplify the use of test result by referencing the original condition.
  RDRAND intrinisc is one of such cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 16:36:16 +00:00
James Molloy
8cd08bf4ac Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163511 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 14:01:21 +00:00
Nadav Rotem
e47feeb823 Stack Coloring: Add support for multiple regions of the same slot, within a single basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163507 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 12:39:35 +00:00
Elena Demikhovsky
8100d244ff The VPSHUFB 256-bit instruction may be generated when one of input vector is undefined or zeroinitializer.
I've added the "zeroinitializer" case in this patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163506 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 12:13:11 +00:00
Nadav Rotem
9a2ae00c85 Teach the DAGBuilder about lifetime markers which are generated from PHINodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163494 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 08:43:23 +00:00
Craig Topper
956342b210 Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-09 22:58:45 +00:00
Craig Topper
12fb5c667f Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 17:42:27 +00:00
Craig Topper
4362067d7c Add support for lowering FABS of vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163461 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 07:31:51 +00:00
Craig Topper
a1fb1d2ed7 Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163458 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 04:58:43 +00:00
Jakob Stoklund Olesen
45c5c57179 Allow overlaps between virtreg and physreg live ranges.
The RegisterCoalescer understands overlapping live ranges where one
register is defined as a copy of the other. With this change, register
allocators using LiveRegMatrix can do the same, at least for copies
between physical and virtual registers.

When a physreg is defined by a copy from a virtreg, allow those live
ranges to overlap:

  %CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11
  %vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill>

We can assign %vreg11 to %ECX, overlapping the live range of %CL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 18:15:23 +00:00
Nadav Rotem
79cb162e5d Disable stack coloring by default in order to resolve the i386 failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:27:06 +00:00
Elena Demikhovsky
4178946afb AVX2 optimization.
Added generation of VPSHUB instruction for <32 x i8> vector shuffle when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 12:42:01 +00:00
Nadav Rotem
a76d7d64a4 Fix the test by specifying an exact cpu model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 10:33:33 +00:00
James Molloy
ba8562af44 Improve codegen for BUILD_VECTORs on ARM.
If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:55:02 +00:00
Nadav Rotem
c05d30601c Add a new optimization pass: Stack Coloring, that merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:17:37 +00:00
James Molloy
6c822eea47 Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163298 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:16:01 +00:00
Craig Topper
07149fe715 Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 05:15:01 +00:00
Jakob Stoklund Olesen
098c6a547f Use predication instead of pseudo-opcodes when folding into MOVCC.
Now that it is possible to dynamically tie MachineInstr operands,
predicated instructions are possible in SSA form:

  %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg
  %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR

Becomes a predicated SUBri with a tied imp-use:

  SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>

This means that any instruction that is safe to move can be folded into
a MOVCC, and the *CC pseudo-instructions are no longer needed.

The test case changes reflect that Thumb2SizeReduce recognizes the
predicated instructions. It didn't understand the pseudos.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:58:02 +00:00
Tim Northover
7bebddf55e Strip old MachineInstrs *after* we know we can put them back.
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 18:37:53 +00:00
Pranav Bhandarkar
4c3d3ecdf8 LLVM Bug Fix 13709: Remove needless lsr(Rp, #32) instruction access the
subreg_hireg of register pair Rp.

	* lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New
	 DenseMap similar to PeepholeMap that additionally records subreg info
	 too.
        (runOnMachineFunction): Record information in PeepholeDoubleRegsMap
        and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to
	the instruction Rx = COPY Rp1:logreg_subreg.
	* test/CodeGen/Hexagon/remove_lsr.ll: New test.
	



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163214 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 16:01:40 +00:00
Silviu Baranga
3d5e161fe4 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 08:57:21 +00:00
Logan Chien
fd91d8dd7e Fix UseInitArray option for MIPS target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163193 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 06:17:17 +00:00
Jakob Stoklund Olesen
daddf07497 Move tie checks into MachineVerifier::visitMachineOperand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:38:28 +00:00
Preston Gurd
2e2efd9600 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:22:17 +00:00
Sergei Larin
3e59040810 Porting Hexagon MI Scheduler to the new API.
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:49:56 +00:00
Arnold Schwaighofer
67514e9066 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:37:49 +00:00
Elena Demikhovsky
3251020738 This patch optimizes shuffle instruction - generates 2 instructions instead of 4.
Since this specific shuffle is widely used in many workloads we have ~10% performance on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 12:49:02 +00:00
Nadav Rotem
9f40cb32ac Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 12:10:19 +00:00
Nadav Rotem
f55ef64544 Generate better select code by allowing the target to use scalar select, and not sign-extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163086 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 08:20:07 +00:00
Pete Cooper
0fc44aba18 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 17:37:55 +00:00
Logan Chien
8fccd013d8 Fix Thumb2 fixup kind in the integrated-as.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 15:06:36 +00:00
Owen Anderson
58d5729540 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163051 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 06:04:27 +00:00
NAKAMURA Takumi
5cf8bac4cc llvm/test/CodeGen/X86/fp-fast.ll: Suppress FMA4 on AMD Bulldozer host, corresponding to r162999.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163041 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:26:28 +00:00
Manman Ren
c11b7193a7 Fix Atom bots for r163036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163040 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:17:06 +00:00