Commit Graph

6850 Commits

Author SHA1 Message Date
Hal Finkel
2bbc9193b4 Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC.
Thanks to Tobias von Koch for pointing out this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158932 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 20:10:48 +00:00
Jack Carter
d5e11ad51a The inline asm operand modifier 'c' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several Architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding
local implementation one can make a call to the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change is needed when test/CodeGen/generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 17:14:46 +00:00
NAKAMURA Takumi
e42e9ce20f Revert r158209, "test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911."
It passes according to ppc changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158917 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 13:43:06 +00:00
Lang Hames
dc13d2ed2f Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158902 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 06:10:00 +00:00
Evan Cheng
8ef0968dc2 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158900 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen
c4118452bc Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158873 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen
7824152557 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 18:00:57 +00:00
Hal Finkel
0fcdd8b2cc Add support for generating reg+reg (indexed) pre-inc loads on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158823 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 15:43:03 +00:00
Craig Topper
703c38bf58 Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158792 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 05:39:26 +00:00
Lang Hames
d693cafcfb Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen
9efc6d3ac3 Add a triple.
The test was failing on Linux because of asm syntax differences.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 21:46:25 +00:00
Jakob Stoklund Olesen
7164288c3e Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 21:14:34 +00:00
Hal Finkel
ac81cc3282 Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 02:34:32 +00:00
Rafael Espindola
565bdbf598 really add a triple :-(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 02:17:35 +00:00
Rafael Espindola
e08c1347d9 Add a triple to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 01:42:34 +00:00
Rafael Espindola
d6b43a317e Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 00:48:28 +00:00
Manman Ren
eda9fdf979 ARM: use NOEN loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158684 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 22:23:48 +00:00
Joel Jones
96ef284da4 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.  The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.

<rdar://problem/11481151>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158659 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 14:51:32 +00:00
Chandler Carruth
457dfbac8a Add a regression test for the bug exposed by r158087, which has been
temporarily reverted.

This test is annoyingly overspecified, but I don't know of another way
to thoroughly test the saving and restoring of the registers. While this
will have to be adjusted even with the issue fixed in order to re-apply
r158087, those adjustments should very clearly indicate that it is still
correct (%esp getting restored prior to pops), whereas without it, this
case can easily slip under the radar.

Still, any suggestions for improvements are very welcome.

All credit to Matt Beaumont-Gay for reducing this out of an insane
Address Sanitizer crash to a reasonably small seg-faulting C program
when built with -mstackrealign. I just reduced it to IR, which was much
simpler. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 09:15:04 +00:00
Chandler Carruth
43369249e7 Temporarily revert r158087.
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.

Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158654 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 07:03:12 +00:00
Hal Finkel
2741d2cfdf Cleanup trip-count finding for PPC CTR loops (and some bug fixes).
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.

In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158607 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-16 20:34:07 +00:00
Manman Ren
307473dec0 ARM: optimization for sub+abs.
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi

For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.

rdar: 11633193


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158551 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 21:32:12 +00:00
Jakob Stoklund Olesen
d628a587b6 Preserve <undef> flags in ARMExpandPseudo.
This probably mostly shows up in bugpoint-generated code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158527 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 17:46:54 +00:00
Akira Hatanaka
1418045472 1. introduce MipsPat in place of Pat in order to exclude those from
being used by Mips16 or Micro Mips
2. clean up a few lines too long encountered

Patch by Reed Kotler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158470 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 21:03:23 +00:00
Akira Hatanaka
6b0cd9b9c6 Make machine verifier check the first instruction of the last bundle instead of
the last instruction of a basic block.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158468 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 20:51:13 +00:00
Manman Ren
39f5eb1563 Revert: test/CodeGen/ARM/iabs.ll in r158441
Sorry that I accidently checked in this file with my previous commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158442 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 06:04:02 +00:00
Manman Ren
7a0575b9a8 InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158441 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 05:57:42 +00:00
Akira Hatanaka
ce5c6fb55f Test case for MIPS long branch pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158438 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 02:12:21 +00:00
Akira Hatanaka
163d706c46 Fix test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158435 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 01:21:00 +00:00
Akira Hatanaka
8782707f50 Implement a DAGCombine in MipsISelLowering.cpp which transforms the following
pattern:

(add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt))

"tjt" is a TargetJumpTable node. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158419 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 20:33:18 +00:00
Akira Hatanaka
e193b32583 Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158414 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 19:33:32 +00:00
Akira Hatanaka
777a120285 Implement fastcc calling convention for MIPS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 18:06:00 +00:00
Richard Osborne
aa08c8b2ba Fix pattern for MKMSK instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 17:59:12 +00:00
Craig Topper
cc95b57d42 Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 07:18:53 +00:00
Akira Hatanaka
942918d13e disable use of directive .set nomicromips
until this directive is pushed in gas to open source fsf

Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158381 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 02:41:14 +00:00
Andrew Trick
1c2d3c538c sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158380 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 02:39:03 +00:00
Chad Rosier
49d6fc02ef [arm-fast-isel] Add support for -arm-long-calls.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158368 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 19:25:13 +00:00
Jakob Stoklund Olesen
ad27086e66 Fix test that depends on register allocation.
The test is really checking the prolog/epilog load/store multiple
formation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158328 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 21:14:28 +00:00
Jakob Stoklund Olesen
7ed1ad54f2 Fix test case to work on ARM.
Patch by James Benton!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 16:01:14 +00:00
Bill Wendling
ad5c880892 Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 08:07:26 +00:00
Hal Finkel
71ffcfe9f8 Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10 19:32:29 +00:00
Hal Finkel
0a3e33b633 Improve ext/trunc patterns on PPC64.
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.

Thanks to Jakob and Owen for their suggestions in this matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 22:10:19 +00:00
Craig Topper
c29106b36f Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 16:46:13 +00:00
Hal Finkel
8bf75ed41c Enable tail merging on PPC.
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).

Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%

Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%

This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 03:14:50 +00:00
Jakob Stoklund Olesen
6660ed5f2f Don't run RAFast in the optimizing regalloc pipeline.
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.

Fast register allocation in the optimizing pipeline is better done by
RABasic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 23:15:12 +00:00
Hal Finkel
7255d2a808 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158223 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 19:19:53 +00:00
Manman Ren
6620ccf5d8 Test case for r158160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:42:37 +00:00
Chad Rosier
28dd960cd1 Fix a crash in APInt::lshr when shiftAmt > BitWidth.
Patch by James Benton <jbenton@vmware.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158213 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:04:52 +00:00
NAKAMURA Takumi
7c18922f85 test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 16:28:06 +00:00
Hal Finkel
09fdc7baae Disable the PPC CTR-Loops pass by default.
The pass itself works well, but the something in the Machine* infrastructure
does not understand terminators which define registers. Without the ability
to use the block-placement pass, etc. this causes performance regressions (and
so is turned off by default). Turning off the analysis turns off the problems
with the Machine* infrastructure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158206 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:25 +00:00