Commit Graph

6815 Commits

Author SHA1 Message Date
Andrew Trick
1c2d3c538c sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158380 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 02:39:03 +00:00
Chad Rosier
49d6fc02ef [arm-fast-isel] Add support for -arm-long-calls.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158368 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 19:25:13 +00:00
Jakob Stoklund Olesen
ad27086e66 Fix test that depends on register allocation.
The test is really checking the prolog/epilog load/store multiple
formation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158328 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 21:14:28 +00:00
Jakob Stoklund Olesen
7ed1ad54f2 Fix test case to work on ARM.
Patch by James Benton!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 16:01:14 +00:00
Bill Wendling
ad5c880892 Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 08:07:26 +00:00
Hal Finkel
71ffcfe9f8 Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10 19:32:29 +00:00
Hal Finkel
0a3e33b633 Improve ext/trunc patterns on PPC64.
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.

Thanks to Jakob and Owen for their suggestions in this matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 22:10:19 +00:00
Craig Topper
c29106b36f Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 16:46:13 +00:00
Hal Finkel
8bf75ed41c Enable tail merging on PPC.
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).

Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%

Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%

This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 03:14:50 +00:00
Jakob Stoklund Olesen
6660ed5f2f Don't run RAFast in the optimizing regalloc pipeline.
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.

Fast register allocation in the optimizing pipeline is better done by
RABasic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 23:15:12 +00:00
Hal Finkel
7255d2a808 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158223 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 19:19:53 +00:00
Manman Ren
6620ccf5d8 Test case for r158160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:42:37 +00:00
Chad Rosier
28dd960cd1 Fix a crash in APInt::lshr when shiftAmt > BitWidth.
Patch by James Benton <jbenton@vmware.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158213 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:04:52 +00:00
NAKAMURA Takumi
7c18922f85 test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 16:28:06 +00:00
Hal Finkel
09fdc7baae Disable the PPC CTR-Loops pass by default.
The pass itself works well, but the something in the Machine* infrastructure
does not understand terminators which define registers. Without the ability
to use the block-placement pass, etc. this causes performance regressions (and
so is turned off by default). Turning off the analysis turns off the problems
with the Machine* infrastructure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158206 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:25 +00:00
Hal Finkel
daa03ec604 Fix a bug in the new PPC CTR-Loops pass.
The code which tests for an induction operation cannot assume that any
ADDI instruction will have a register operand because the operand could
also be a frame index; for example:
    %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158205 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:23 +00:00
Hal Finkel
99f823f943 Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code.
This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon
pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are
no longer otherwise used. Also, invalid preheader DebugLoc is not used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158204 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:21 +00:00
Manman Ren
9236362a64 X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.

rdar://10695237


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158175 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 22:39:10 +00:00
Rafael Espindola
c07f5bbd3b Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158158 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 18:39:19 +00:00
Manman Ren
87253c2ebd X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158126 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 00:42:47 +00:00
Manman Ren
2afde7782d Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158122 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06 23:53:03 +00:00
Chad Rosier
a97b180fc4 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158087 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06 17:37:40 +00:00
Joel Jones
e061053051 Revert commit r157966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157972 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-05 00:47:21 +00:00
Joel Jones
dd52bf2ed8 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157966 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 23:38:57 +00:00
Akira Hatanaka
f0378f4381 Add a test case for mips64 unaligned load/store instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157939 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 17:57:06 +00:00
Akira Hatanaka
8c345b4567 Rename test/CodeGen/Mips/load-shift-left-right.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157938 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 17:50:36 +00:00
Roman Divacky
fd42ed676e Implement local-exec TLS on PowerPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 17:36:38 +00:00
Nadav Rotem
fcb2c3cf5e Remove the "-promote-elements" flag. This flag is now enabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 11:27:21 +00:00
Hal Finkel
77838f9ca9 Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157911 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 02:21:00 +00:00
Craig Topper
a15f9d5311 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157903 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 18:58:46 +00:00
Craig Topper
529ce07c5f Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 07:26:46 +00:00
Manman Ren
c73ea9102b Revert r157831
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157896 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 03:14:24 +00:00
Craig Topper
57ae246a6a Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157895 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 01:40:43 +00:00
Manman Ren
be4258adc5 ARM: add testing case for struct byval
rdar://9877866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157876 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 05:37:44 +00:00
Akira Hatanaka
3309dbab30 Add another test case which tests Mips' unaligned load/store instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157874 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 01:13:10 +00:00
Akira Hatanaka
bdd26783b8 Fix test cases in test/CodeGen/Mips.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157868 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 00:05:45 +00:00
Manman Ren
73c2f7f5ed X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 19:49:33 +00:00
Chris Lattner
2b76473929 testcase for PR13006, thanks to Duncan for filing it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 18:19:46 +00:00
Hans Wennborg
f0234fcbc9 Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 16:27:21 +00:00
Craig Topper
3a8172ad8d Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157804 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 06:07:48 +00:00
Chris Lattner
f59e4e3452 enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157800 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:29:15 +00:00
Chris Lattner
5b0d946537 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157798 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:16:33 +00:00
Chris Lattner
09c14c0836 add some simple 64-bit tail call tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157797 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:03:31 +00:00
Chris Lattner
e8ea60b8ba merge some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157795 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:00:54 +00:00
Chris Lattner
e109648880 rename test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157794 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 04:58:50 +00:00
Owen Anderson
a835f00702 Make this testcase independent of register allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 18:07:02 +00:00
Manman Ren
91c5346d91 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 17:20:29 +00:00
Elena Demikhovsky
177cf1e1a3 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157737 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 09:20:20 +00:00
Craig Topper
0559a2f8ae Add intrinsic for pclmulqdq instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157731 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 04:37:40 +00:00
Jakob Stoklund Olesen
9cda1be0aa Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157715 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 21:46:58 +00:00