Commit Graph

7492 Commits

Author SHA1 Message Date
Roman Divacky
c1611d8d10 Specify cpu to get the correct instruction ordering. Remove XFAIL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164306 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 14:59:42 +00:00
Michael Liao
85fb261a55 Specify CPu to prevent failure on ATOM due to different code scheduling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 03:34:04 +00:00
Michael Liao
b118a073d7 Re-work X86 code generation of atomic ops with spin-loop
- Rewrite/merge pseudo-atomic instruction emitters to address the
  following issue:
  * Reduce one unnecessary load in spin-loop

    previously the spin-loop looks like

        thisMBB:
        newMBB:
          ld  t1 = [bitinstr.addr]
          op  t2 = t1, [bitinstr.val]
          not t3 = t2  (if Invert)
          mov EAX = t1
          lcs dest = [bitinstr.addr], t3  [EAX is implicit]
          bz  newMBB
          fallthrough -->nextMBB

    the 'ld' at the beginning of newMBB should be lift out of the loop
    as lcs (or CMPXCHG on x86) will load the current memory value into
    EAX. This loop is refined as:

        thisMBB:
          EAX = LOAD [MI.addr]
        mainMBB:
          t1 = OP [MI.val], EAX
          LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
          JNE mainMBB
        sinkMBB:

  * Remove immopc as, so far, all pseudo-atomic instructions has
    all-register form only, there is no immedidate operand.

  * Remove unnecessary attributes/modifiers in pseudo-atomic instruction
    td

  * Fix issues in PR13458

- Add comprehensive tests on atomic ops on various data types.
  NOTE: Some of them are turned off due to missing functionality.

- Revise tests due to the new spin-loop generated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 03:06:15 +00:00
Jakob Stoklund Olesen
d40d4c34f7 Resolve conflicts involving dead vector lanes for -new-coalescer.
A common coalescing conflict in vector code is lane insertion:

  %dst = FOO
  %src = BAR
  %dst:ssub0 = COPY %src

The live range of %src interferes with the ssub0 lane of %dst, but that
lane is never read after %src would have clobbered it. That makes it
safe to merge the live ranges and eliminate the COPY:

  %dst = FOO
  %dst:ssub0 = BAR

This patch teaches the new coalescer to resolve conflicts where dead
vector lanes would be clobbered, at least as long as the clobbered
vector lanes don't escape the basic block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164250 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 21:29:18 +00:00
Michael Liao
cd9ede9fc0 Unify the logic in SelectAtomicLoadAdd and SelectAtomicLoadArith
- Merge the processing of LOAD_ADD with other atomic load-arith
  operations
- Separate the logic getting target constant for atomic-load-op and add
  an optimization for atomic-load-add on i16 with negative value
- Optimize a minor case for atomic-fetch-add i16 with negative operand. Test
  case is revised.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164243 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 19:36:58 +00:00
Jordan Rose
3856b07276 Really XFAIL test/CodeGen/PowerPC/structsinregs.ll.
XFAIL needs a trailing colon. Hopefully this will get the buildbots
happy again while Bill works on getting it passing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164237 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 17:03:11 +00:00
Bill Schmidt
1bc12b579d XFAIL test/CodeGen/PowerPC/structsinregs.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164233 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 16:18:23 +00:00
Bill Schmidt
419f376564 Small structs for PPC64 SVR4 must be passed right-justified in registers.
lib/Target/PowerPC/PPCISelLowering.{h,cpp}
 Rename LowerFormalArguments_Darwin to LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerFormalArguments_SVR4 to LowerFormalArguments_32SVR4.
 Receive small structs right-justified in LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerCall_Darwin to LowerCall_Darwin_Or_64SVR4.
 Rename LowerCall_SVR4 to LowerCall_32SVR4.
 Pass small structs right-justified in LowerCall_Darwin_Or_64SVR4.

test/CodeGen/PowerPC/structsinregs.ll
 New test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164228 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 15:42:13 +00:00
Hans Wennborg
c4ba62ca1d Move load_to_switch.ll to test/CodeGen/SPARC/
Because the test invokes llc -march=sparc, it needs to be in a directory
which is only run when the sparc target is built.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164211 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 09:25:03 +00:00
Evan Cheng
b37b6ca4bb MOVi16 (movw) is only legal on cpus with V6T2 support. rdar://12300648
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164169 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 21:24:16 +00:00
Roman Divacky
51ca601977 Add test for r164155 and remove two tests superseded by ppc64-calls.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164162 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 19:51:44 +00:00
Roman Divacky
f145c135f3 Avoid symbol name clash when filling TOC.
Patch by Adhemerval Zanella.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 17:10:37 +00:00
Roman Divacky
4cd56014ae On PPC64 emit the environment pointer. Patch by Adhemerval Zanella.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164139 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 16:55:29 +00:00
Roman Divacky
eb8b7dc536 Optimize local func calls to not emit nop for TOC restoration.
Patch by Adhemerval Zanella.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164138 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 16:47:58 +00:00
James Molloy
97ecb83dff More domain conversion; convert VFP VMOVS to NEON instructions in more cases - when we may clobber the other S-lane by converting an S to a D instruction, make an effort to work out if the S lane is clobberable or not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164114 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 08:31:15 +00:00
Evan Cheng
d10eab0a95 Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte
aligned address. Based on patch by David Peixotto.

Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment
hints. rdar://12090772, rdar://12238782


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164089 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 01:42:45 +00:00
Jakob Stoklund Olesen
87f7864c6d Merge into undefined lanes under -new-coalescer.
Add LIS::pruneValue() and extendToIndices(). These two functions are
used by the register coalescer when merging two live ranges requires
more than a trivial value mapping as supported by LiveInterval::join().

The pruneValue() function can remove the part of a value number that is
going to conflict in join(). Afterwards, extendToIndices can restore the
live range, using any new dominating value numbers and updating the SSA
form.

Use this complex value mapping to support merging a register into a
vector lane that has a conflicting value, but the clobbered lane is
undef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164074 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-17 23:03:25 +00:00
Jan Wen Voung
b024b7014a Add some cases to x86 OptimizeCompare to handle DEC and INC, too.
While we are setting the earlier def to true, also make it live.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-17 22:04:23 +00:00
Michael Liao
bb73002247 Fix PR13859
- Preserve the original NOutVT during casting from vector to integer by
  extracting vector elements.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164042 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-17 18:05:20 +00:00
Silviu Baranga
c8bf0f8662 Removed the VMLxForwarding feature for the Cortex-A15 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164030 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-17 14:10:54 +00:00
Nadav Rotem
6fc671ca63 Fix the testcase to work on all platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163997 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-16 07:58:47 +00:00
Nadav Rotem
638e4c13cb The PMOVZXWD family of functions had patterns extends narrow vector types to wide vector types.
It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast,
and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics.

rdar://11897677



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163995 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-16 07:39:07 +00:00
Benjamin Kramer
562b240fc5 X86: Emitting x87 fsin/fcos for sinf/cosf is not safe without unsafe fp math.
This was only an issue if sse is disabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163967 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-15 12:44:27 +00:00
Akira Hatanaka
f934d159ae Handled unaligned load/stores properly in Mips16
Patch by Reed Kotler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-15 01:02:03 +00:00
Eric Christopher
ffaf69b8b1 Fix both the test for zero and what we do if we have a zero for
umulo legalization.

Fixes PR13839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163856 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 23:24:02 +00:00
Michael Liao
f966e4e5b3 Add wider vector/integer support for PR12312
- Enhance the fix to PR12312 to support wider integer, such as 256-bit
  integer. If more than 1 fully evaluated vectors are found, POR them
  first followed by the final PTEST.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163832 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 20:24:54 +00:00
Michael Liao
092122f124 Enhance type legalization on bitcast from vector to integer
- Find a legal vector type before casting and extracting element from it.
- As the new vector type may have more than 2 elements, build the final
  hi/lo pair by BFS pairing them from bottom to top.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163830 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 19:58:21 +00:00
Jakob Stoklund Olesen
da0e8219b7 Fix test case to avoid PIC magic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163827 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 19:47:45 +00:00
Jakob Stoklund Olesen
7bba7d0efc Fix the TCRETURNmi64 bug differently.
Add a PatFrag to match X86tcret using 6 fixed registers or less. This
avoids folding loads into TCRETURNmi64 using 7 or more volatile
registers.

<rdar://problem/12282281>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163819 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 18:31:27 +00:00
Jakob Stoklund Olesen
0767dc546e Revert r163761 "Don't fold indexed loads into TCRETURNmi64."
The patch caused "Wrong topological sorting" assertions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163810 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 16:52:17 +00:00
Silviu Baranga
616471d4bf This patch introduces A15 as a target in LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163803 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 15:05:10 +00:00
Nadav Rotem
91a7e0184a Fix a dagcombine optimization. The optimization attempts to optimize a bitcast of fneg to integers
by xoring the high-bit. This fails if the source operand is a vector because we need to negate
each of the elements in the vector.

Fix rdar://12281066 PR13813.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 14:54:28 +00:00
Nadav Rotem
0cd19b9301 Stack Coloring: We have code that checks that all of the uses of allocas
are within the lifetime zone. Sometime legitimate usages of allocas are
hoisted outside of the lifetime zone. For example, GEPS may calculate the
address of a member of an allocated struct. This commit makes sure that
we only check (abort regions or assert) for instructions that read and write
memory using stack frames directly. Notice that by allowing legitimate
usages outside the lifetime zone we also stop checking for instructions
which use derivatives of allocas. We will catch less bugs in user code
and in the compiler itself.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 12:38:37 +00:00
Jakob Stoklund Olesen
aa0cfea9a4 Don't fold indexed loads into TCRETURNmi64.
We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 00:25:00 +00:00
Michael Liao
6c7ccaa3fd Fix PR11985
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 21:43:09 +00:00
Roman Divacky
9d760ae5c6 This patch corrects logic in PPCFrameLowering for save and restore of
nonvolatile condition register fields across calls under the SVR4 ABIs.                                            
                                                                                                                   
 * With the 64-bit ABI, the save location is at a fixed offset of 8 from                                           
the stack pointer.  The frame pointer cannot be used to access this                                                
portion of the stack frame since the distance from the frame pointer may                                           
change with alloca calls.                                                                                          
                                                                                                                   
 * With the 32-bit ABI, the save location is just below the general
register save area, and is accessed via the frame pointer like the rest
of the save areas.  This is an optional slot, so it must only be created                                           
if any of CR2, CR3, and CR4 were modified.                                                                      
                                                                                                                   
 * For both ABIs, save/restore logic is generated only if one of the     
nonvolatile CR fields were modified.                                   

I also took this opportunity to clean up an extra FIXME in
PPCFrameLowering.h.  Save area offsets for 32-bit GPRs are meaningless
for the 64-bit ABI, so I removed them for correctness and efficiency.


Fixes PR13708 and partially also PR13623. It lets us enable exception handling
on PPC64.

Patch by William J. Schmidt!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163713 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 14:47:47 +00:00
Kristof Beyls
789efbad2a Fix constant folding through bitcasts by no longer relying on undefined behaviour (converting NaN values between float and double).
SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget);
should not be used when Val is not a simple constant (as the comment in
SelectionDAG.h indicates). This patch avoids using this function
when folding an unknown constant through a bitcast, where it cannot be
guaranteed that Val will be a simple constant.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 11:25:02 +00:00
Nadav Rotem
0a16da4457 Stack coloring: remove lifetime intervals which contain escaped allocas.
The input program may contain intructions which are not inside lifetime
markers. This can happen due to a bug in the compiler or due to a bug in
user code (for example, returning a reference to a local variable).
This commit adds checks that all of the instructions in the function and
invalidates lifetime ranges which do not contain all of the instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163678 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 04:57:37 +00:00
Chad Rosier
a7b159ccdd [ms-inline asm] Split the parsing of IR asm strings into GCC and MS variants.
Add support in the EmitMSInlineAsmStr() function for handling integer consts.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163645 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 19:09:56 +00:00
Chad Rosier
ef68cfa8b8 Formatting. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163627 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 16:33:10 +00:00
Nadav Rotem
8754bbbe67 Stack Coloring: Dont crash on dbg values which use stack frames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 12:34:27 +00:00
NAKAMURA Takumi
985dcfc351 test/CodeGen/X86/ms-inline-asm.ll: Relax for non-darwin x86 targets. '##InlineAsm' could not be seen in other hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163554 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 22:04:54 +00:00
Chad Rosier
24f5fddbdf [ms-inline asm] Properly emit the asm directives when the AsmPrinterVariant
and InlineAsmVariant don't match.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163550 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:36:05 +00:00
Chad Rosier
5c3dcb7bc0 Update test case for Release builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163549 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:31:43 +00:00
Chad Rosier
3b132fab0b [ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function
and update the printOperand() function accordingly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 21:10:49 +00:00
Jakob Stoklund Olesen
519daf5d2d Don't attempt to use flags from predicated instructions.
The ARM backend can eliminate cmp instructions by reusing flags from a
nearby sub instruction with similar arguments.

Don't do that if the sub is predicated - the flags are not written
unconditionally.

<rdar://problem/12263428>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163535 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 19:17:25 +00:00
Nadav Rotem
6165dba25f Stack Coloring: Handle the case where END markers come before BEGIN markers properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163530 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:51:09 +00:00
Michael Liao
b8150d8523 Enhance PR11334 fix to support extload from v2f32/v4f32
- Fix an remaining issue of PR11674 as well



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163528 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:33:51 +00:00
Michael Liao
7fdc66bf73 Add boolean simplification support from CMOV
- If a boolean value is generated from CMOV and tested as boolean value,
  simplify the use of test result by referencing the original condition.
  RDRAND intrinisc is one of such cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 16:36:16 +00:00
James Molloy
8cd08bf4ac Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163511 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 14:01:21 +00:00
Nadav Rotem
e47feeb823 Stack Coloring: Add support for multiple regions of the same slot, within a single basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163507 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 12:39:35 +00:00
Elena Demikhovsky
8100d244ff The VPSHUFB 256-bit instruction may be generated when one of input vector is undefined or zeroinitializer.
I've added the "zeroinitializer" case in this patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163506 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 12:13:11 +00:00
Nadav Rotem
9a2ae00c85 Teach the DAGBuilder about lifetime markers which are generated from PHINodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163494 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 08:43:23 +00:00
Craig Topper
956342b210 Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-09 22:58:45 +00:00
Craig Topper
12fb5c667f Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 17:42:27 +00:00
Craig Topper
4362067d7c Add support for lowering FABS of vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163461 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 07:31:51 +00:00
Craig Topper
a1fb1d2ed7 Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163458 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08 04:58:43 +00:00
Jakob Stoklund Olesen
45c5c57179 Allow overlaps between virtreg and physreg live ranges.
The RegisterCoalescer understands overlapping live ranges where one
register is defined as a copy of the other. With this change, register
allocators using LiveRegMatrix can do the same, at least for copies
between physical and virtual registers.

When a physreg is defined by a copy from a virtreg, allow those live
ranges to overlap:

  %CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11
  %vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill>

We can assign %vreg11 to %ECX, overlapping the live range of %CL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 18:15:23 +00:00
Nadav Rotem
79cb162e5d Disable stack coloring by default in order to resolve the i386 failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:27:06 +00:00
Elena Demikhovsky
4178946afb AVX2 optimization.
Added generation of VPSHUB instruction for <32 x i8> vector shuffle when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 12:42:01 +00:00
Nadav Rotem
a76d7d64a4 Fix the test by specifying an exact cpu model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 10:33:33 +00:00
James Molloy
ba8562af44 Improve codegen for BUILD_VECTORs on ARM.
If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:55:02 +00:00
Nadav Rotem
c05d30601c Add a new optimization pass: Stack Coloring, that merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:17:37 +00:00
James Molloy
6c822eea47 Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163298 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:16:01 +00:00
Craig Topper
07149fe715 Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 05:15:01 +00:00
Jakob Stoklund Olesen
098c6a547f Use predication instead of pseudo-opcodes when folding into MOVCC.
Now that it is possible to dynamically tie MachineInstr operands,
predicated instructions are possible in SSA form:

  %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg
  %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR

Becomes a predicated SUBri with a tied imp-use:

  SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>

This means that any instruction that is safe to move can be folded into
a MOVCC, and the *CC pseudo-instructions are no longer needed.

The test case changes reflect that Thumb2SizeReduce recognizes the
predicated instructions. It didn't understand the pseudos.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:58:02 +00:00
Tim Northover
7bebddf55e Strip old MachineInstrs *after* we know we can put them back.
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 18:37:53 +00:00
Pranav Bhandarkar
4c3d3ecdf8 LLVM Bug Fix 13709: Remove needless lsr(Rp, #32) instruction access the
subreg_hireg of register pair Rp.

	* lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New
	 DenseMap similar to PeepholeMap that additionally records subreg info
	 too.
        (runOnMachineFunction): Record information in PeepholeDoubleRegsMap
        and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to
	the instruction Rx = COPY Rp1:logreg_subreg.
	* test/CodeGen/Hexagon/remove_lsr.ll: New test.
	



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163214 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 16:01:40 +00:00
Silviu Baranga
3d5e161fe4 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 08:57:21 +00:00
Logan Chien
fd91d8dd7e Fix UseInitArray option for MIPS target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163193 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 06:17:17 +00:00
Jakob Stoklund Olesen
daddf07497 Move tie checks into MachineVerifier::visitMachineOperand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:38:28 +00:00
Preston Gurd
2e2efd9600 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:22:17 +00:00
Sergei Larin
3e59040810 Porting Hexagon MI Scheduler to the new API.
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:49:56 +00:00
Arnold Schwaighofer
67514e9066 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:37:49 +00:00
Elena Demikhovsky
3251020738 This patch optimizes shuffle instruction - generates 2 instructions instead of 4.
Since this specific shuffle is widely used in many workloads we have ~10% performance on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 12:49:02 +00:00
Nadav Rotem
9f40cb32ac Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 12:10:19 +00:00
Nadav Rotem
f55ef64544 Generate better select code by allowing the target to use scalar select, and not sign-extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163086 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 08:20:07 +00:00
Pete Cooper
0fc44aba18 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 17:37:55 +00:00
Logan Chien
8fccd013d8 Fix Thumb2 fixup kind in the integrated-as.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 15:06:36 +00:00
Owen Anderson
58d5729540 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163051 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 06:04:27 +00:00
NAKAMURA Takumi
5cf8bac4cc llvm/test/CodeGen/X86/fp-fast.ll: Suppress FMA4 on AMD Bulldozer host, corresponding to r162999.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163041 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:26:28 +00:00
Manman Ren
c11b7193a7 Fix Atom bots for r163036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163040 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:17:06 +00:00
Manman Ren
2b7a2e8833 SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its
output chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://11457792


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163036 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 23:16:57 +00:00
Craig Topper
dfb1e4babd Mark FMA4 instructions as commutable and add them to the folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163035 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 23:10:34 +00:00
Michael Liao
265bcb1e5b Fix PR12359
- In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as
  well as PSHUFB will zero elements with negative indices.

  Patch by Sriram Murali <sriram.murali@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 20:12:31 +00:00
Craig Topper
cb0848696d Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 16:31:13 +00:00
Craig Topper
bf4043768c Add support for converting llvm.fma to fma4 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162999 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 15:40:30 +00:00
Jakob Stoklund Olesen
908c0c01f6 Don't enforce ordered inline asm operands.
I was too optimistic, inline asm can have tied operands that don't
follow the def order.

Fixes PR13742.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162998 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 15:34:59 +00:00
NAKAMURA Takumi
2a1b0e7864 llvm/test/CodeGen/X86/vec_select.ll: Fix failure on xmm-less hosts, to add -mattr=+sse2.
FIXME: Should this be tested with both +avx and -avx,+sse2?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162983 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 10:02:22 +00:00
Jakob Stoklund Olesen
05e80f2714 Fix a couple of typos in EmitAtomic.
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.

rdar://problem/12203728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162968 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 02:08:34 +00:00
Pete Cooper
5dd9e214fb Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 23:58:52 +00:00
Owen Anderson
9e3b6dfc2f Try to make this test more generic to unbreak buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162958 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 23:51:20 +00:00
Owen Anderson
43da6c7f13 Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 23:35:16 +00:00
Nadav Rotem
e757f00446 Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162926 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 19:17:29 +00:00
Michael Liao
a03c44117b Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 16:54:46 +00:00
Tim Northover
c4a32e6596 Add support for moving pure S-register to NEON pipeline if desired
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 10:17:45 +00:00
Michael Liao
b6efbd2145 Should put test case under test/ExecutionEngine/MCJIT/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162885 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 00:43:57 +00:00
Michael Liao
faa1159a69 Fix PR13727
- The root cause is that target constant materialization in X86 fast-isel
  creates a PC-rel addressing which may overflow 32-bit range in non-Small code
  model if .rodata section is allocated too far away from code segment in
  MCJIT, which uses Large code model so far.
- Follow the similar logic to fix non-Small code model in fast-isel by skipping
  non-Small code model.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162881 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 00:30:16 +00:00
Hal Finkel
bbd169b1d9 Reserve space for the mandatory traceback fields on PPC64.
We need to reserve space for the mandatory traceback fields,
though leaving them as zero is appropriate for now.

Although the ABI calls for these fields to be filled in fully, no
compiler on Linux currently does this, and GDB does not read these
fields.  GDB uses the first word of zeroes during exception handling to
find the end of the function and the size field, allowing it to compute
the beginning of the function.  DWARF information is used for everything
else.  We need the extra 8 bytes of pad so the size field is found in
the right place.

As a comparison, GCC fills in a few of the fields -- language, number
of saved registers -- but ignores the rest.  IBM's proprietary OSes do
make use of the full traceback table facility.

Patch by Bill Schmidt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-29 20:22:24 +00:00
Jush Lu
c4dc2490c4 [arm-fast-isel] Add support for ARM PIC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162823 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-29 02:41:21 +00:00
Roman Divacky
c6c2ced384 Emit word of zeroes after the last instruction as a start of the mandatory
traceback table on PowerPC64. This helps gdb handle exceptions. The other
mandatory fields are ignored by gdb and harder to implement so just add
there a FIXME.

Patch by Bill Schmidt. PR13641.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 19:06:55 +00:00
Hal Finkel
621b77ade2 Add PPC Freescale e500mc and e5500 subtargets.
Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to
the PowerPC backend.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162764 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 16:12:39 +00:00
Bill Wendling
eeba6e8317 The commutative flag is already correctly set within the multiclass. If we set
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162741 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 07:36:46 +00:00
Craig Topper
13897fb263 Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162738 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 07:05:28 +00:00
NAKAMURA Takumi
2f820a5d64 llvm/test/CodeGen/X86/pr12312.ll: Add -mtriple=x86_64-unknown-unknown.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162736 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 04:04:29 +00:00
Michael Liao
dbf8b5be97 Fix PR12312
- Add a target-specific DAG optimization to recognize a pattern PTEST-able.
  Such a pattern is a OR'd tree with X86ISD::OR as the root node. When
  X86ISD::OR node has only its flag result being used as a boolean value and
  all its leaves are extracted from the same vector, it could be folded into an
  X86ISD::PTEST node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162735 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 03:34:40 +00:00
Jakob Stoklund Olesen
36d29bc723 Remove extra MayLoad/MayStore flags from atomic_load/store.
These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.

The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.

This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162733 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 03:11:32 +00:00
Akira Hatanaka
273956d8c6 Fix mips' long branch pass.
Instructions emitted to compute branch offsets now use immediate operands
instead of symbolic labels. This change was needed because there were problems
when R_MIPS_HI16/LO16 relocations were used to make shared objects.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162731 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 03:03:05 +00:00
Akira Hatanaka
1d522388bf Fix bug 13532.
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.

Patch by Stefan Kristiansson. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162728 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:12:42 +00:00
Hal Finkel
f3c3828e57 Allow remat of LI on PPC.
Allow load-immediates to be rematerialised in the register coalescer for
PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail,
because it relies on a register move getting emitted. The immediate load is
equivalent, so change this test case.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162727 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:10:33 +00:00
Hal Finkel
82b3821208 Eliminate redundant CR moves on PPC32.
The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and
unset if it doesn't. The solution up to now was to insert a MachineNode to
set/unset the CR bit, which produces a CR vreg. This vreg was then copied
into CR bit 6. When the register allocator saw a bunch of these in the same
function, it allocated the set/unset CR bit in some random CR register (1
extra instruction) and then emitted CR moves before every vararg function
call, rather than just setting and unsetting CR bit 6 directly before every
vararg function call. This patch instead inserts a PPCcrset/PPCcrunset
instruction which are then matched by a dedicated instruction pattern.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162725 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:10:27 +00:00
Hal Finkel
97d047dec7 Optimize zext on PPC64.
The zeroextend IR instruction is lowered to an 'and' node with an immediate
mask operand, which in turn gets legalised to a sequence of ori's & ands.
This can be done more efficiently using the rldicl instruction.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162724 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:10:15 +00:00
Bill Wendling
47aa9a2bb5 Make sure we add the predicate after all of the registers are added.
<rdar://problem/12183003>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 22:12:44 +00:00
NAKAMURA Takumi
1ba64859f5 llvm/test/CodeGen/X86/fma.ll: Add -march=x86, or two tests would fail on non-x86 hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162667 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 11:50:26 +00:00
NAKAMURA Takumi
fdc35405cd llvm/test/CodeGen/X86/fma_patterns.ll: Add -mtriple=x86_64. It was incompatible on i686 and Windows x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162664 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 09:37:54 +00:00
Craig Topper
43e4c62c43 Commit test change for r162658.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162660 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 07:55:50 +00:00
Anitha Boyapati
0aa63fcbcb FMA3 tests on bdver2 target for changes made in rev 162012. Also made
corresponding changes to existing tests for darwin triple to ensure that
same pattern is tested for bdver2 target.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162655 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 06:59:01 +00:00
Craig Topper
e10fa862f8 Make sure that FMA3 is favored even when FMA4 is also enabled. Test case for r162454.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162653 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 03:38:15 +00:00
Jakob Stoklund Olesen
4ad27eda29 Infer instruction properties from single-instruction patterns.
Previously, instructions without a primary patterns wouldn't get their
properties inferred. Now, we use all single-instruction patterns for
inference, including 'def : Pat<>' instances.

This causes a lot of instruction flags to change.

- Many instructions no longer have the UnmodeledSideEffects flag because
  their flags are now inferred from a pattern.

- Instructions with intrinsics will get a mayStore flag if they already
  have UnmodeledSideEffects and a mayLoad flag if they already have
  mayStore. This is because intrinsics properties are linear.

- Instructions with atomic_load patterns get a mayStore flag because
  atomic loads can't be reordered. The correct workaround is to create
  pseudo-instructions instead of using normal loads. PR13693.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162614 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 22:46:53 +00:00
Akira Hatanaka
16865d0612 Disable Mips' delay slot filler when optimization level is O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162589 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 20:40:15 +00:00
Akira Hatanaka
45d8dbc92d In MipsDAGToDAGISel::SelectAddr, fold add node into address operand, if its
second operand is MipsISD::GPRel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162584 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 20:21:49 +00:00
Manman Ren
1a710fdde1 BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle
the case of multiple edges from one block to another.

A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.

Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162572 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 18:14:27 +00:00
Roman Divacky
9fb8b49380 Lower constant pools and jump tables via TOC on PPC64/SVR4.
In collaboration with Adhemerval Zanella.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162562 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 16:26:02 +00:00
Stepan Dyatkovskiy
fdeb9fe5e0 Rejected 169195. As Duncan commented, bitcasting to proper type is wrong approach. We need to insert some valid TRANCATE node here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162354 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-22 09:33:55 +00:00
Akira Hatanaka
e7338cd550 Add register Mips::GP to the list of reserved registers if target is bare-metal
to prevent it from being clobbered. mips uses $gp to access small data section.

This bug was originally reported by Carl Norum.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162340 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-22 03:18:13 +00:00
Akira Hatanaka
6522a9e04b Add option disable-mips-delay-filler. Turn on mips' delay slot filler by
default.

Patch by Carl Norum.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162339 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-22 02:51:28 +00:00
Tim Northover
d43d7fec10 Add correct set of regression tests for r162094 commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162276 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-21 12:43:03 +00:00
Jakob Stoklund Olesen
5379904807 Add a missing def flag.
*** Bad machine code: Explicit definition marked as use ***
- function:    test_cos
- basic block: BB#0 L.entry (0x7ff2a2024fd0)
- instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def>
- operand 0:   %D11

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162247 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen
e7fdef420d Don't add CFG edges for redundant conditional branches.
IR that hasn't been through SimplifyCFG can look like this:

  br i1 %b, label %r, label %r

Make sure we don't create duplicate Machine CFG edges in this case.

Fix the machine code verifier to accept conditional branches with a
single CFG edge.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-20 21:39:52 +00:00
Michael Liao
24438b8359 fix a case where all operands of BUILD_VECTOR are undefined
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162214 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-20 17:59:18 +00:00
Stepan Dyatkovskiy
bee05dc840 Forget to add testcase for r162195. Sorry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-20 08:03:18 +00:00
Nadav Rotem
d60cb11afd When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162187 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-19 13:06:16 +00:00
Jakob Stoklund Olesen
864c8702ba Also combine zext/sext into selects for ARM.
This turns common i1 patterns into predicated instructions:

  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)

For a function like:

  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }

We now produce:

  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1

Instead of:

  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162177 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen
dcd2342d32 Also pass logical ops to combineSelectAndUse.
Add these transformations to the existing add/sub ones:

  (and (select cc, -1, c), x) -> (select cc, x, (and, x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or, x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor, x, c))

The selects can then be transformed to a single predicated instruction
by peephole.

This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162176 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 21:25:16 +00:00
Nadav Rotem
b9d6b8449d Reapply r162160 with a fix: Optimize Arith->Trunc->SETCC sequence to allow better compare/branch code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162172 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 17:53:03 +00:00
Nadav Rotem
d5c66a0b1f Revert r162160 because it made a few buildbots fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162164 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 05:02:36 +00:00
Nadav Rotem
b5838689c6 The X86 backend has a number of optimizations for SETCC nodes which use
arithmetic instructions. However, when small data types are used, a truncate
node appears between the SETCC node and the arithmetic operation. This patch
adds support for this pattern.

Before:
  xorl  %esi, %edi
  testb %dil, %dil
  setne %al
  ret

After:
  xorb  %dil, %sil
  setne %al
  ret

rdar://12081007



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162160 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 02:43:28 +00:00
Eli Friedman
fd45fa1503 Make atomic load and store of pointers work. Tighten verification of atomic operations
so other unexpected operations don't slip through.  Based on patch by Logan Chien.
PR11786/PR13186.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162146 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 23:24:29 +00:00
Jakob Stoklund Olesen
a7fb3f6804 Avoid folding ADD instructions with FI operands.
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162130 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 20:55:34 +00:00
Benjamin Kramer
b97cebdfcc TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162101 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 15:54:21 +00:00
Benjamin Kramer
4e81d40545 Fix broken check lines.
I really need to find a way to automate this, but I can't come up with a regex
that has no false positives while handling tricky cases like custom check
prefixes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162097 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 12:28:26 +00:00
Tim Northover
3c8ad92455 Implement NEON domain switching for scalar <-> S-register vmovs on ARM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162094 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 11:32:52 +00:00
Jakob Stoklund Olesen
083b48af14 Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162061 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 23:21:55 +00:00
Jush Lu
2ff4e9d5af [arm-fast-isel] Add support for fastcc.
Without fastcc support, the caller just falls through to CallingConv::C
for fastcc, but callee still uses fastcc, this inconsistency of calling
convention is a problem, and fastcc support can fix it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162013 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 05:15:53 +00:00
Akira Hatanaka
a75e1a4d36 Test case for r162008.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162009 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 03:48:41 +00:00
Jakob Stoklund Olesen
2860b7ea3a Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161994 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 22:16:39 +00:00
Bill Wendling
1e0b864d23 Rework test so that it reproduces the error without the horrible flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 21:10:18 +00:00
Bill Wendling
9c0f0dc7b2 Remove invalid test. This test requires that dead basic blocks be kept
around. That's not how we do things. Besides, the commit message tells us that
it is covered by the GCC test suite.

------------------------------------------------------------------------
r127497 | zwarich | 2011-03-11 13:51:56 -0800 (Fri, 11 Mar 2011) | 3 lines

Fix the GCC test suite issue exposed by r127477, which was caused by stack
protector insertion not working correctly with unreachable code. Since that
revision was rolled out, this test doesn't actual fail before this fix.
------------------------------------------------------------------------



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161985 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 20:54:09 +00:00
Evan Cheng
a99c508c8d Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161962 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 17:44:53 +00:00
Anton Korobeynikov
652199961a The names of VFP variants of half-to-float conversion instructions were
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161907 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 23:36:01 +00:00
Michael Liao
7091b2451d fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161894 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 21:24:47 +00:00
Nadav Rotem
3e883734fa During the CodeGenPrepare we often lower intrinsics (such as objsize)
and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.

rdar://11973998



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161852 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 05:19:07 +00:00
NAKAMURA Takumi
1251263144 llvm/test/CodeGen/ARM/floorf.ll: Add explicit -mtriple=arm-unknown-unknown, or it fails on msvc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161825 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 00:56:06 +00:00
Owen Anderson
7c626d3097 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161807 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 23:32:49 +00:00
Bill Wendling
6de47d611e Rename test since it's not linux-specific.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161792 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 21:32:42 +00:00
Jakob Stoklund Olesen
bc70ff3cb9 Handle extra Tail predecessors in if-conversion.
It is still possible to if-convert if the tail block has extra
predecessors, but the tail phis must be rewritten instead of being
removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161781 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 20:49:04 +00:00
Arnold Schwaighofer
d252aa43c9 [Hexagon] Don't mark callee saved registers as clobbered by a tail call
This was causing unnecessary spills/restores of callee saved registers.

Fixes PR13572.

Patch by Pranav Bhandarkar!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 19:54:01 +00:00
Manman Ren
2e018f1cf6 Fix failure on Atom bot due to r161769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161777 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 19:34:29 +00:00
Nadav Rotem
df8320313b Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161775 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 18:52:44 +00:00
Manman Ren
c586d26812 X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.

PR13576


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161769 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 18:29:41 +00:00
Eric Christopher
001d219b97 Add support for the %H output modifier.
Patch by Weiming Zhao.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161768 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 18:18:52 +00:00
Tim Northover
88fc697b7c Add test for previous commit correcting NEON load patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161750 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 10:38:45 +00:00
Arnold Schwaighofer
a7016d6fc1 Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161736 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-12 05:11:56 +00:00
Michael Liao
9eac20ac88 fix PR13577, an issue introduced by r161687
- FCMOV only supports a subset of X86 conditions. Skip boolean
  simplification if X86 condition is not valid for FCMOV.
- add a minimal test case for PR13577.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161732 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-11 23:47:06 +00:00
Benjamin Kramer
cfc0ad6e48 PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely.
This is common e.g. when doing rip-relative addressing on x86_64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161728 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-11 19:05:13 +00:00
Manman Ren
743a2cff04 X86: when we are auto-detecting the subtarget features, make sure we turn on
FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.

FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.

rdar: 7252306


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161717 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 23:43:32 +00:00
Eli Friedman
6b951b25c3 The normal edge of an invoke is not allowed to branch to a block with a
landingpad.  Enforce it in the verifier, and fix the regression tests to match.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161697 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 20:55:20 +00:00
Michael Liao
2a33cec66a add X86-specific DAG optimization to simplify boolean test
- if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value
  generated from X86ISD::SETCC, try to simplify the boolean value
  generation and checking by reusing the original EFLAGS with proper
  condition code
- add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places
  consuming EFLAGS

part of patches fixing PR12312



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 19:58:13 +00:00
Jakob Stoklund Olesen
15121ca0d1 Update edge weights correctly in replaceSuccessor().
When replacing Old with New, it can happen that New is already a
successor. Add the old and new edge weights instead of creating a
duplicate edge.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161653 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 03:23:27 +00:00
Jakob Stoklund Olesen
c7908037d8 Reapply r161633-161634 "Partition use lists so defs always come before uses.""
No changes to these patches, MRI needed to be notified when changing
uses into defs and vice versa.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161644 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen
1134aae4e7 Revert r161633-161634 "Partition use lists so defs always come before uses."
These commits broke a number of buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161640 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen
81a6995243 Partition use lists so defs always come before uses.
This makes it possible to speed up def_iterator by stopping at the first
use. This makes def_empty() and getUniqueVRegDef() much faster when
there are many uses.

In a +Asserts build, LiveVariables is 100x faster in one case because
getVRegDef() has an assertion that would scan to the end of a
def_iterator chain.

Spill weight calculation is significantly faster (300x in one case)
because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP)
which calls def_empty(%RIP).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161634 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen
46f4c35372 Don't use pointer-pointers for the register use lists.
Use a more conventional doubly linked list where the Prev pointers form
a cycle. This means it is no longer necessary to adjust the Prev
pointers when reallocating the VRegInfo array.

The test changes are required because the register allocation hint is
using the use-list order to break ties.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161633 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen
69a0aa87f8 Don't modify MO while use_iterator is still pointing to it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161626 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 22:08:24 +00:00
Arnold Schwaighofer
bcc4c1d2d1 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161581 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 15:25:52 +00:00
Nadav Rotem
0b66bd9b07 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161565 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 01:56:44 +00:00
Bob Wilson
5f91a99427 Add test triples to fix win32 failures. Revert workaround from r161292.
I don't have a win32 system to test, so hopefully I got them all fixed here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161519 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-08 20:31:37 +00:00
Manman Ren
39ad568c62 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.

rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161462 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-08 00:51:41 +00:00
Evan Cheng
b64dd5f2b5 X86 cmp lowering is looking past truncate on the condition node. It should only
do so when the high bits are known zero. This caused a subtle miscompilation.

rdar://12027825 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161451 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-07 22:21:00 +00:00
Chandler Carruth
e6450dc2af Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-07 09:45:24 +00:00
Manman Ren
ba86b13ad9 MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-07 06:16:46 +00:00
Hal Finkel
f45717e985 MFTB on PPC64 should really be encoded using MFSPR.
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.

Thanks to Adhemerval Zanella for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-06 21:21:44 +00:00
Craig Topper
4feb647283 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-06 06:22:36 +00:00
Craig Topper
d230d20b73 Update test to check for r161305
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-05 09:06:28 +00:00
Hal Finkel
8cc3474f72 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-04 14:10:46 +00:00
Anton Korobeynikov
b58d7d0312 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-04 13:16:12 +00:00
Bob Wilson
53624a2df5 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161282 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 23:29:17 +00:00
Akira Hatanaka
24e79e55da 1. Redo mips16 instructions to avoid multiple opcodes for same instruction.
Change these to patterns.
2. Add another 16 instructions.

Patch by Reed Kotler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161272 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 22:57:02 +00:00
Bob Wilson
772af92cb1 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161262 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 21:26:18 +00:00
Bob Wilson
d49edb7ab0 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161232 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 04:06:28 +00:00
Jush Lu
2946549a28 [arm-fast-isel] Add support for shl, lshr, and ashr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 02:37:48 +00:00
Manman Ren
127eea87d6 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 19:37:32 +00:00
Akira Hatanaka
bddf83614a Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161189 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 18:15:13 +00:00
NAKAMURA Takumi
ac89c0ddfd llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Make sure this is testing without +avx.
FIXME: Could +avx be checked here too?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161156 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 06:36:56 +00:00
NAKAMURA Takumi
25fa9a4890 llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Rewrite expressions to pass regardless of PR11031.
- Relax to match even if epilogue (pop %ebp) were emitted.
  - Assume the return value is stored to %xmm0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 06:33:58 +00:00
Manman Ren
d7d003c2b7 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 00:56:42 +00:00
Matt Beaumont-Gay
b705e4a8b6 Line endings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 16:42:35 +00:00
Elena Demikhovsky
1503aba4a0 Added FMA functionality to X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 12:06:00 +00:00
Akira Hatanaka
cdb3ba71ce Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 22:50:19 +00:00
Akira Hatanaka
1d53f1bbab Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161078 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 21:28:49 +00:00
Akira Hatanaka
1d165f1c25 Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161076 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 20:54:48 +00:00
Akira Hatanaka
e2d529ac11 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161068 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:46:41 +00:00
Chad Rosier
d97f3a5ab0 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161066 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:29:21 +00:00
Akira Hatanaka
36bcc11236 Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:16:49 +00:00
Manman Ren
53b59d1d97 MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161062 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:10:39 +00:00
Jakob Stoklund Olesen
34af6f597b Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 02:47:24 +00:00
Manman Ren
1123614317 Reverse order of the two branches at end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar: 11393714


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 01:11:07 +00:00
Pete Cooper
32ecfb4158 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160987 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-30 20:23:19 +00:00
Manman Ren
e8b4a4a9d1 Revert r160920 and r160919 due to dragonegg and clang selfhost failure
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160927 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-29 02:44:09 +00:00
Manman Ren
14148c41d9 X86 Peephole: fold loads to the source register operand if possible.
Trying to fix the bot by specifying a triple in the failing testing cases.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160920 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-28 17:51:24 +00:00
Manman Ren
0eb3edea9c X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-28 16:48:01 +00:00
Manman Ren
43d9ab1812 X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-28 03:15:46 +00:00
Evan Cheng
9c777a4844 Teach CodeGenPrep to look past bitcast when it's duplicating return instruction
into predecessor blocks to enable tail call optimization.

rdar://11958338


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160894 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen
72e7dbf88b Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160888 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen
369a4c7759 Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160816 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-26 21:40:42 +00:00
Akira Hatanaka
e11246c64e Fix call setup for PIC.
Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160774 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-26 02:24:43 +00:00
Manman Ren
24182757bf Update testing case for Atom when disabling rematerialization in
TwoAddressInstructionPass.

The generated code for Atom has a different code sequence. This is realted
to commit r160749.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 20:17:14 +00:00
Manman Ren
d68e8cda24 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 18:28:13 +00:00
Rafael Espindola
1cee71099c When a return struct pointer is passed in registers, the called has nothing
to pop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160725 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 13:41:10 +00:00
Akira Hatanaka
de4a127470 Eliminate the stack slot used to save the global base register.
The long branch pass (fixed in r160601) no longer uses the global base register
to compute addresses of branch destinations, so it is not necessary to reserve
a slot on the stack.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 03:16:47 +00:00
Rafael Espindola
d9fb65dee6 Add a cpu to the test. Should fix the atom bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160701 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 22:56:06 +00:00
Rafael Espindola
423ebb1f38 Add a triple to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 21:55:04 +00:00
Rafael Espindola
cde227bc2a In order to correctly compile
struct s {
  double x1;
  float x2;
};
__attribute__((regparm(3))) struct s f(int a, int b, int c);
void g(void) {
  f(41, 42, 43);
}

We need to be able to represent passing the address of s to f (sret) in a
register (inreg). Turns out that all that is needed is to not mark them as
mutually incompatible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 21:40:17 +00:00
David Chisnall
23a62cbaf5 ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 20:04:16 +00:00
Akira Hatanaka
3ee306cbc0 Add basic ability to setup call frame, and make procedure calls.
Hello world will compile and execute with this patch.

Patch by Reed Kotler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160651 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 23:45:54 +00:00
Sylvestre Ledru
c8e41c5917 Fix a typo (the the => the)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160621 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 08:51:15 +00:00
Nadav Rotem
ed1a335ece Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160619 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 07:59:50 +00:00
Akira Hatanaka
60287963c7 Fix Mips long branch pass.
This pass no longer requires that the global pointer value be saved to the
stack or register since it uses bal instruction to compute branch distance.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160601 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-21 03:30:44 +00:00
Jakob Stoklund Olesen
2ec0cda5d5 Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160575 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen
c321a20b2e Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160571 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-20 20:49:53 +00:00
Preston Gurd
5b50701b1c Fix remaining lit tests which were failing when run on an Atom
processor.

Patches by Tyler Nowicki, Andy Zhang, and Preston Gurd!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160520 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 18:53:21 +00:00
Jush Lu
ee649839a2 [arm-fast-isel] Add support for vararg function calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160500 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 09:49:00 +00:00
Manman Ren
62a89f5808 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160454 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 21:40:01 +00:00
Preston Gurd
d4d961615c This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160451 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 20:49:17 +00:00
Chandler Carruth
32d75bec4b Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160443 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 18:58:22 +00:00
Victor Oliveira
26ad67adcd test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160438 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 17:53:05 +00:00
Jack Carter
a0f14afee1 Mips specific inline asm operand modifier 'M':
Print the high order register of a double word register operand.

In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using 
doubles in assembler and continue to control register to variable
relationships.

This patch also fixes a related bug in a previous patch:

    case 'D': // Second part of a double word register operand
    case 'L': // Low order register of a double word register operand
    case 'M': // High order register of a double word register operand

I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endianesses. I had 'L' and 'D'
be the opposite twins when 'L' and 'M' are.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160429 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 06:41:36 +00:00
Joel Jones
7c82e6a32a More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 00:02:16 +00:00
Evan Cheng
afb24cecff Add test case for r160387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160389 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 19:40:05 +00:00
Nadav Rotem
5589a69f0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160357 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 09:07:37 +00:00
Evan Cheng
f5c0539092 Implement r160312 as target indepedenet dag combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160354 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 08:31:11 +00:00
Evan Cheng
70e10d3fe4 This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 06:53:39 +00:00
Akira Hatanaka
1bda7863e3 Fix function select_cc_f32 in test/CodeGen/Mips/selectcc.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160329 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 23:56:51 +00:00
Evan Cheng
98819c9d1e For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 19:35:43 +00:00
Nadav Rotem
7ee0e5ae60 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160305 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 18:34:53 +00:00
Tom Stellard
a4be0720c5 Revert "test/CodeGen/R600: Add some basic tests v6"
This reverts commit 11d3457afcda7848448dd7f11b2ede6552ffb9ea.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160300 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 18:19:43 +00:00
Alexey Samsonov
e0f5aedf97 Fix tests that failed on i686-win32 after r160248:
1. FileCheck-ize epilogue.ll and allow another asm instruction to restore %rsp.
2. Remove check in widen_arith-3.ll that was hitting instruction in epilogue instead of
vector add.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 14:33:36 +00:00
Tom Stellard
2015236dfc test/CodeGen/R600: Add some basic tests v6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 14:17:19 +00:00
Nadav Rotem
d93ea88cde Fix a bug in the 3-address conversion of LEA when one of the operands is an
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def, is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160260 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 10:52:25 +00:00
Alexey Samsonov
99a92f269d This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment.
It is intended to fix PR11468.

Old prologue and epilogue looked like this:
push %rbp
mov %rsp, %rbp
and $alignment, %rsp
push %r14
push %r15
...
pop %r15
pop %r14
mov %rbp, %rsp
pop %rbp

The problem was to reference the locations of callee-saved registers in exception handling:
locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would
take some effort to implement this in LLVM, as currently MachineLocation can only have the form
"Register + Offset". Funciton prologue and epilogue are now changed to:

push %rbp
mov %rsp, %rbp
push %14
push %15
and $alignment, %rsp
...
lea -$size_of_saved_registers(%rbp), %rsp
pop %r15
pop %r14
pop %rbp

Reviewed by Chad Rosier.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160248 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 06:54:09 +00:00
Nadav Rotem
46646572f7 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160235 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 20:39:08 +00:00
Nadav Rotem
d896e24299 Teach getTargetVShiftNode about TargetConstant nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 20:27:43 +00:00
NAKAMURA Takumi
a2a179dd7d llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Rewrite expressions to fit various targets.
- Make sure existence of "barrier".
  - Confirm reload corresponding to spill.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160232 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 14:38:35 +00:00
Nadav Rotem
aec9f382dd Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 12:26:30 +00:00
Nadav Rotem
65f489fd7d AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector.
This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160222 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-14 22:26:05 +00:00
Nadav Rotem
b7e230d999 Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160221 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-14 21:30:27 +00:00
Joel Jones
06a6a300c5 This is one of the first steps at moving to replace target-dependent
intrinsics with target-indepdent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-13 23:25:25 +00:00
Duncan Sands
ab6e47bee6 Restrict this to x86, hopefully fixing ARM buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160163 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-13 07:02:00 +00:00
Benjamin Kramer
feae00a68e Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it.
I already had the necessary things in place for IR-level passes but missed the machine passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 18:14:57 +00:00
Nadav Rotem
4dff258bfb The LIT tests below do not specify the exact cpu model and fail on AVX2 machines, because we select different instructions such as vbroadcast, new shuffles, etc.
Patch by Michael Liao.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160129 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 13:45:15 +00:00
NAKAMURA Takumi
223875e3f8 llvm/test/CodeGen/X86/rdrand.ll: Relax expression corresponding to Win64 CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160124 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 10:22:57 +00:00
Benjamin Kramer
51bf934e4d Use %s instead of the explicit name, the latter doesn't work in out-of-tree builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160120 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:36:29 +00:00
Benjamin Kramer
b9bee04995 Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same that is emitted by both
GCC and ICC.

Fixes PR13284.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:31:43 +00:00
Duncan Sands
4e8982a34d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160116 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:01:35 +00:00
Craig Topper
5aba78bd80 Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 06:52:41 +00:00
Manman Ren
45ed19499b ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 22:51:44 +00:00
Akira Hatanaka
ce66afb22d Test case for r160036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 19:50:46 +00:00
Manman Ren
84ae7e9034 X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair.
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables
removal of Cmp.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160066 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 19:35:12 +00:00
Akira Hatanaka
3fef29d881 Implement MipsTargetLowering::LowerSELECT_CC to custom lower SELECT_CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 19:32:27 +00:00
Benjamin Kramer
597f2950d8 PR13326: Fix a subtle edge case in the udiv -> magic multiply generator.
This caused 6 of 65k possible 8 bit udivs to be wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160058 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 18:31:59 +00:00
Nadav Rotem
5cd95e1478 When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160044 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 13:27:05 +00:00
Akira Hatanaka
ba584fe8fe Lower RETURNADDR node in Mips backend.
Patch by Sasa Stankovic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160031 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 00:53:32 +00:00
Jack Carter
bb78930489 Mips specific inline asm operand modifier 'L'.
Low order register of a double word register operand. Operands 
   are defined by the name of the variable they are marked with in
   the inline assembler code. This is a way to specify that the 
   operand just refers to the low order register for that variable.
   
   It is the opposite of modifier 'D' which specifies the high order
   register.
   
   Example:
   
 main()
{

    long long ll_input = 0x1111222233334444LL;
    long long ll_val = 3;
    int i_result = 0;

    __asm__ __volatile__( 
		   "or	%0, %L1, %2"
	     : "=r" (i_result) 
	     : "r" (ll_input), "r" (ll_val)); 
}

   Which results in:
   
   	lui	$2, %hi(_gp_disp)
	addiu	$2, $2, %lo(_gp_disp)
	addiu	$sp, $sp, -8
	addu	$2, $2, $25
	sw	$2, 0($sp)
	lui	$2, 13107
	ori	$3, $2, 17476     <-- Low 32 bits of ll_input
	lui	$2, 4369
	ori	$4, $2, 8738      <-- High 32 bits of ll_input
	addiu	$5, $zero, 3  <-- Low 32 bits of ll_val
	addiu	$2, $zero, 0  <-- High 32 bits of ll_val
	#APP
	or	$3, $4, $5        <-- or i_result, high 32 ll_input, low 32 of ll_val
	#NO_APP
	addiu	$sp, $sp, 8
	jr	$ra

If not direction is done for the long long for 32 bit variables results
in using the low 32 bits as ll_val shows.

There is an existing bug if 'L' or 'D' is used for the destination register
for 32 bit long longs in that the target value will be updated incorrectly
for the non-specified part unless explicitly set within the inline asm code.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160028 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 22:41:20 +00:00
Chad Rosier
542e35f62d Add newline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160006 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 17:57:00 +00:00
Chad Rosier
3d3c75c665 Add test case accidentally omitted from r160002.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160004 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 17:49:39 +00:00
Chad Rosier
3f0dbab963 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.  Basically, this is a reapplication of r158087 with a few fixes.

Specifically, (1) the stack pointer is restored from the base pointer before
popping callee-saved registers and (2) in obscure cases (see comments in patch)
we must cache the value of the original stack adjustment in the prologue and
apply it in the epilogue.

rdar://11496434


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160002 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 17:45:53 +00:00
Nadav Rotem
2dd83eb1ab Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159991 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 13:25:08 +00:00
Akira Hatanaka
182ef6fcaa Make register Mips::RA allocatable if not in mips16 mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159971 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 00:19:06 +00:00
Owen Anderson
d9bf71fdd2 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159957 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-09 20:31:12 +00:00
Manman Ren
6209364834 X86: implement functions to analyze & synthesize CMOV|SET|Jcc
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond

No functional change intended.
If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
opcode to get the condition code, then update the condition code, finally
synthesize the new opcode form the new condition code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159955 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-09 18:57:12 +00:00
Manman Ren
2d4215f759 X86: Fix optimizeCompare to correctly check safe condition.
It is safe if EFLAGS is killed or re-defined.
When we are done with the basic block, check whether EFLAGS is live-out.
Do not optimize away cmp if EFLAGS is live-out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159888 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-07 03:34:46 +00:00
Manman Ren
2af66dc51a X86: peephole optimization to remove cmp instruction
For each Cmp, we check whether there is an earlier Sub which make Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159838 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 17:36:20 +00:00
Chad Rosier
fd065bbed1 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159837 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 17:33:39 +00:00
Duncan Sands
4c3916f840 Attempt to fix windows buildbots. Patch by James Benton.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159826 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 14:43:16 +00:00
NAKAMURA Takumi
365f1b84d3 test/CodeGen/X86/sext-setcc-self.ll: Mark it as XFAIL: cygwin,mingw32,win32. Investigating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159820 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 12:12:39 +00:00
NAKAMURA Takumi
bd985efa99 Revert r159804, "[arm-fast-isel] Add support for vararg function calls."
It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159817 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 11:12:44 +00:00
Jush Lu
a8c4d739f2 [arm-fast-isel] Add support for vararg function calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159804 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 03:02:37 +00:00
Jack Carter
244a84ee57 Mips specific inline asm operand modifier D.
Print the second half of a double word operand.
   
   The include list was cleaned up a bit as well.
   
   Also the test case was modified to test for both
   big and little patterns.
   


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159787 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-05 23:58:21 +00:00
Akira Hatanaka
d45e37a0a5 test case for r159770.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159771 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-05 19:29:31 +00:00
Duncan Sands
e7de3b29f7 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen
b872078701 Ensure CopyToReg nodes are always glued to the call instruction.
The CopyToReg nodes that set up the argument registers before a call
must be glued to the call instruction. Otherwise, the scheduler may emit
the physreg copies long before the call, causing long live ranges for
the fixed registers.

Besides disabling good register allocation, that can also expose
problems when EmitInstrWithCustomInserter() splits a basic block during
the live range of a physreg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 19:28:31 +00:00
Rafael Espindola
25dd5fc1cd Add a testcase for pr13209. It is not a great test, but it still fails if
159509 and 159479 are reverted. It would be really nice to be able to run
just the coalescer :-(

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159715 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 16:06:00 +00:00
Jakob Stoklund Olesen
59bde4d8a1 Add early if-conversion support to X86.
Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.

Early if-conversion is still not enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 00:09:58 +00:00
NAKAMURA Takumi
84f2ae332f test/CodeGen/SPARC/private.ll: Fixup. Forgot to prune old RUN lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159643 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 04:29:20 +00:00
NAKAMURA Takumi
0176dfed27 test/CodeGen/SPARC/private.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159642 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 04:21:57 +00:00
NAKAMURA Takumi
ea957f0c56 test/CodeGen/X86/sincos.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159639 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:59:22 +00:00
NAKAMURA Takumi
a16d8c30cc test/CodeGen/X86/fabs.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:59:15 +00:00
NAKAMURA Takumi
40b7e7eb97 test/CodeGen/X86/2007-09-05-InvalidAsm.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159637 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:59:08 +00:00
NAKAMURA Takumi
0e0d62ebd9 test/CodeGen/X86/2004-03-30-Select-Max.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159636 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:58:59 +00:00
Jack Carter
10de025a67 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159625 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 23:35:23 +00:00
Eric Christopher
80c1b38eff Revert " mips32 long long register inline asm constraint support." as
it appears to be breaking the bots.

This reverts commit 1b055ce320.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159619 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 23:22:25 +00:00
Jack Carter
3aaefc12bb deleted test/CodeGen/Mips/inlineasm-cnstrnt-bad-r-1.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159617 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 23:21:22 +00:00
Jack Carter
1b055ce320 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159610 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 22:39:45 +00:00
Bob Wilson
30a507a1f5 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159570 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 19:48:45 +00:00
Chandler Carruth
1de43ede89 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 19:09:46 +00:00
Chandler Carruth
49589f0d0e Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 18:37:59 +00:00
Bob Wilson
ac03af4ea9 Do not attempt to use ROR for Thumb1.
Patch by Matt Fischer!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159538 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 17:22:47 +00:00
Chandler Carruth
b9c5bd00e9 Fix the TCL-style quoting in one random test that somehow slipped
through my perl nets.

With this, the test suite passes even if I force it to run with the
built-in shell test logic, except for a test which REQUIREs shell.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159529 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 13:29:47 +00:00
Chandler Carruth
4177e6fff5 Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159525 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 12:47:22 +00:00
Chandler Carruth
38adf0bdaa Rewrite three tests that had truly egregious abuses of 'grep' in them to
use FileCheck.

Aside from removing a dependence on TCL-style quoting, this also makes
the tests ... significantly more robust. =] It would be really, *really*
great of the maintainer(s) of the CellSPU backend went through and
systematically rewrite these tests to use FileCheck. There are a lot
more that have nearly this bad of abuses.

Another step along the path to a TclTest-free testsuite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159523 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 12:20:14 +00:00
Rafael Espindola
9c3d5a70f4 Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159509 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-01 17:08:01 +00:00
Elena Demikhovsky
8f40f7b867 Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-01 06:12:26 +00:00
Jakob Stoklund Olesen
8ccaad526a Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159461 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 21:00:03 +00:00
Rafael Espindola
94e3b388e5 In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 04:22:35 +00:00
Manman Ren
40307c7dbe X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159402 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 00:54:20 +00:00
Nuno Lopes
85b408991a add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159383 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-28 22:30:12 +00:00
Jack Carter
7c3cd4d24e The Mips specific inline asm operand modifier 'z' has the
following description in the gnu sources:

    Print $0 if operand is zero otherwise print the op normally.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-28 01:33:40 +00:00
Akira Hatanaka
e3e12450e6 Test case for r159240.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-27 00:40:34 +00:00
Rafael Espindola
275c85f1a7 Fix llc's -print-before=pass and -print-after=pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159227 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 21:33:36 +00:00
Manman Ren
1f7a1b68a0 X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159221 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 19:47:59 +00:00
Jack Carter
0518fca843 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.vi


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 13:49:27 +00:00
Elena Demikhovsky
1596373671 Shuffle optimization for AVX/AVX2.
The current patch optimizes frequently used shuffle patterns and gives these instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
       vextractf128    $1, %ymm1, %xmm1
       vextractf128    $1, %ymm0, %xmm0
       vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
       vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159188 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 08:04:10 +00:00
Andrew Trick
c9b1e25493 Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159183 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 04:11:38 +00:00
Eli Friedman
52d418df5d Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159176 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 23:42:33 +00:00
Manman Ren
540cda34b0 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 21:49:38 +00:00
Jakob Stoklund Olesen
a4e6397fd9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen
5984d2b31f Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159149 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen
82d58b147f %RCX is not a function live-out in eh.return functions.
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159116 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 15:53:01 +00:00
Pete Cooper
b49998d76c DAG legalisation can now handle illegal fma vector types by scalarisation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159092 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 00:05:44 +00:00
Hans Wennborg
ce718ff9f4 Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159077 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 11:37:03 +00:00
Rafael Espindola
ce0a5cda8a Handle aliases to tls variables in all architectures, not just x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159058 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 00:30:03 +00:00
Evan Cheng
fc47253294 (sub X, imm) gets canonicalized to (add X, -imm)
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1

rdar://11726136


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159057 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 00:29:06 +00:00
Hal Finkel
009f7afbeb Add support for the PPC isel instruction.
The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159045 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 23:10:08 +00:00
Chad Rosier
e5457d2116 FileCheckize tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159044 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 23:04:02 +00:00
Lang Hames
59d454959f Rename fp-op fusion option (yet again) for compatibility with GCC option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159042 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 22:31:00 +00:00
Evan Cheng
c90a1fcf9f EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 20:14:46 +00:00
NAKAMURA Takumi
84f64f317f test/CodeGen/Generic/asm-large-immediate.ll: Mark it as XFAIL: powerpc, possibly due to r158939.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158994 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 13:41:00 +00:00
Jakob Stoklund Olesen
e208c49172 Functions calling __builtin_eh_return must have a frame pointer.
The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame
pointer exists, but the frame pointer was forced by the presence of
llvm.eh.unwind.init which isn't guaranteed.

If llvm.eh.unwind.init is actually required in functions calling
eh.return (is it?), we should diagnose that instead of emitting bad
machine code.

This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 03:04:27 +00:00
Andrew Trick
ef2d9e59ab ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 02:50:31 +00:00
Lang Hames
e023141322 Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 01:09:09 +00:00
Jack Carter
4db98becf7 The inline asm operand modifier 'n' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Negate the immediate constant

Several Architectures such as x86 have local implementations
of operand modifier 'n' which go beyond the above description
slightly. This won't affect them.

Affected files:

    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'n' to the switch cases.

    test/CodeGen/Generic/asm-large-immediate.ll
        Generic compiled test (x86 for me)

    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158939 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 21:37:54 +00:00
Akira Hatanaka
54c5bc8799 1. fix null program output after some other changes
2. re-enable null.ll test
3. fix some minor style violations

Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 20:39:10 +00:00
Hal Finkel
2bbc9193b4 Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC.
Thanks to Tobias von Koch for pointing out this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158932 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 20:10:48 +00:00
Jack Carter
d5e11ad51a The inline asm operand modifier 'c' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several Architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding
local implementation one can make a call to the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change is needed when test/CodeGen/generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 17:14:46 +00:00
NAKAMURA Takumi
e42e9ce20f Revert r158209, "test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911."
It passes according to ppc changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158917 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 13:43:06 +00:00
Lang Hames
dc13d2ed2f Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158902 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 06:10:00 +00:00
Evan Cheng
8ef0968dc2 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158900 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen
c4118452bc Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158873 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen
7824152557 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 18:00:57 +00:00
Hal Finkel
0fcdd8b2cc Add support for generating reg+reg (indexed) pre-inc loads on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158823 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 15:43:03 +00:00