Commit Graph

1507 Commits

Author SHA1 Message Date
Arnold Schwaighofer
67514e9066 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:37:49 +00:00
Nadav Rotem
9f40cb32ac Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 12:10:19 +00:00
Nadav Rotem
f55ef64544 Generate better select code by allowing the target to use scalar select, and not sign-extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163086 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 08:20:07 +00:00
Owen Anderson
58d5729540 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163051 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 06:04:27 +00:00
Jakob Stoklund Olesen
05e80f2714 Fix a couple of typos in EmitAtomic.
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.

rdar://problem/12203728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162968 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 02:08:34 +00:00
Nadav Rotem
e757f00446 Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162926 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 19:17:29 +00:00
Tim Northover
c4a32e6596 Add support for moving pure S-register to NEON pipeline if desired
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 10:17:45 +00:00
Jush Lu
c4dc2490c4 [arm-fast-isel] Add support for ARM PIC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162823 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-29 02:41:21 +00:00
Bill Wendling
47aa9a2bb5 Make sure we add the predicate after all of the registers are added.
<rdar://problem/12183003>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-27 22:12:44 +00:00
Stepan Dyatkovskiy
fdeb9fe5e0 Rejected 169195. As Duncan commented, bitcasting to proper type is wrong approach. We need to insert some valid TRANCATE node here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162354 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-22 09:33:55 +00:00
Tim Northover
d43d7fec10 Add correct set of regression tests for r162094 commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162276 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-21 12:43:03 +00:00
Jakob Stoklund Olesen
5379904807 Add a missing def flag.
*** Bad machine code: Explicit definition marked as use ***
- function:    test_cos
- basic block: BB#0 L.entry (0x7ff2a2024fd0)
- instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def>
- operand 0:   %D11

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162247 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen
e7fdef420d Don't add CFG edges for redundant conditional branches.
IR that hasn't been through SimplifyCFG can look like this:

  br i1 %b, label %r, label %r

Make sure we don't create duplicate Machine CFG edges in this case.

Fix the machine code verifier to accept conditional branches with a
single CFG edge.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-20 21:39:52 +00:00
Stepan Dyatkovskiy
bee05dc840 Forget to add testcase for r162195. Sorry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-20 08:03:18 +00:00
Jakob Stoklund Olesen
864c8702ba Also combine zext/sext into selects for ARM.
This turns common i1 patterns into predicated instructions:

  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)

For a function like:

  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }

We now produce:

  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1

Instead of:

  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162177 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen
dcd2342d32 Also pass logical ops to combineSelectAndUse.
Add these transformations to the existing add/sub ones:

  (and (select cc, -1, c), x) -> (select cc, x, (and, x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or, x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor, x, c))

The selects can then be transformed to a single predicated instruction
by peephole.

This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162176 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-18 21:25:16 +00:00
Jakob Stoklund Olesen
a7fb3f6804 Avoid folding ADD instructions with FI operands.
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162130 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 20:55:34 +00:00
Tim Northover
3c8ad92455 Implement NEON domain switching for scalar <-> S-register vmovs on ARM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162094 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 11:32:52 +00:00
Jakob Stoklund Olesen
083b48af14 Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162061 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 23:21:55 +00:00
Jush Lu
2ff4e9d5af [arm-fast-isel] Add support for fastcc.
Without fastcc support, the caller just falls through to CallingConv::C
for fastcc, but callee still uses fastcc, this inconsistency of calling
convention is a problem, and fastcc support can fix it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162013 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 05:15:53 +00:00
Jakob Stoklund Olesen
2860b7ea3a Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161994 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 22:16:39 +00:00
Bill Wendling
1e0b864d23 Rework test so that it reproduces the error without the horrible flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 21:10:18 +00:00
Evan Cheng
a99c508c8d Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161962 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 17:44:53 +00:00
Anton Korobeynikov
652199961a The names of VFP variants of half-to-float conversion instructions were
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161907 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 23:36:01 +00:00
Nadav Rotem
3e883734fa During the CodeGenPrepare we often lower intrinsics (such as objsize)
and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.

rdar://11973998



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161852 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 05:19:07 +00:00
NAKAMURA Takumi
1251263144 llvm/test/CodeGen/ARM/floorf.ll: Add explicit -mtriple=arm-unknown-unknown, or it fails on msvc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161825 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 00:56:06 +00:00
Owen Anderson
7c626d3097 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161807 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 23:32:49 +00:00
Nadav Rotem
df8320313b Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161775 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 18:52:44 +00:00
Eric Christopher
001d219b97 Add support for the %H output modifier.
Patch by Weiming Zhao.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161768 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 18:18:52 +00:00
Tim Northover
88fc697b7c Add test for previous commit correcting NEON load patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161750 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 10:38:45 +00:00
Arnold Schwaighofer
a7016d6fc1 Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161736 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-12 05:11:56 +00:00
Arnold Schwaighofer
bcc4c1d2d1 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161581 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 15:25:52 +00:00
Nadav Rotem
0b66bd9b07 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161565 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-09 01:56:44 +00:00
Bob Wilson
5f91a99427 Add test triples to fix win32 failures. Revert workaround from r161292.
I don't have a win32 system to test, so hopefully I got them all fixed here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161519 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-08 20:31:37 +00:00
Anton Korobeynikov
b58d7d0312 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-04 13:16:12 +00:00
Bob Wilson
53624a2df5 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161282 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 23:29:17 +00:00
Jush Lu
2946549a28 [arm-fast-isel] Add support for shl, lshr, and ashr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 02:37:48 +00:00
Jakob Stoklund Olesen
34af6f597b Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 02:47:24 +00:00
Jush Lu
ee649839a2 [arm-fast-isel] Add support for vararg function calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160500 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 09:49:00 +00:00
Chandler Carruth
32d75bec4b Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160443 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 18:58:22 +00:00
Joel Jones
7c82e6a32a More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 00:02:16 +00:00
Joel Jones
06a6a300c5 This is one of the first steps at moving to replace target-dependent
intrinsics with target-indepdent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-13 23:25:25 +00:00
Manman Ren
45ed19499b ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 22:51:44 +00:00
Owen Anderson
d9bf71fdd2 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159957 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-09 20:31:12 +00:00
NAKAMURA Takumi
bd985efa99 Revert r159804, "[arm-fast-isel] Add support for vararg function calls."
It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159817 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 11:12:44 +00:00
Jush Lu
a8c4d739f2 [arm-fast-isel] Add support for vararg function calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159804 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 03:02:37 +00:00
Chandler Carruth
1de43ede89 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 19:09:46 +00:00
Chandler Carruth
49589f0d0e Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 18:37:59 +00:00
Chandler Carruth
4177e6fff5 Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159525 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 12:47:22 +00:00
Rafael Espindola
9c3d5a70f4 Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159509 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-01 17:08:01 +00:00
Manman Ren
540cda34b0 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 21:49:38 +00:00
Pete Cooper
b49998d76c DAG legalisation can now handle illegal fma vector types by scalarisation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159092 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 00:05:44 +00:00
Hans Wennborg
ce718ff9f4 Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159077 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 11:37:03 +00:00
Evan Cheng
fc47253294 (sub X, imm) gets canonicalized to (add X, -imm)
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1

rdar://11726136


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159057 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 00:29:06 +00:00
Lang Hames
59d454959f Rename fp-op fusion option (yet again) for compatibility with GCC option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159042 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 22:31:00 +00:00
Andrew Trick
ef2d9e59ab ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 02:50:31 +00:00
Lang Hames
e023141322 Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 01:09:09 +00:00
Lang Hames
dc13d2ed2f Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158902 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 06:10:00 +00:00
Evan Cheng
8ef0968dc2 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158900 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 05:56:05 +00:00
Lang Hames
d693cafcfb Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 22:51:23 +00:00
Manman Ren
eda9fdf979 ARM: use NOEN loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158684 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 22:23:48 +00:00
Joel Jones
96ef284da4 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.  The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.

<rdar://problem/11481151>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158659 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 14:51:32 +00:00
Manman Ren
307473dec0 ARM: optimization for sub+abs.
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi

For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.

rdar: 11633193


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158551 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 21:32:12 +00:00
Jakob Stoklund Olesen
d628a587b6 Preserve <undef> flags in ARMExpandPseudo.
This probably mostly shows up in bugpoint-generated code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158527 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 17:46:54 +00:00
Manman Ren
39f5eb1563 Revert: test/CodeGen/ARM/iabs.ll in r158441
Sorry that I accidently checked in this file with my previous commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158442 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 06:04:02 +00:00
Manman Ren
7a0575b9a8 InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158441 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 05:57:42 +00:00
Andrew Trick
1c2d3c538c sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158380 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 02:39:03 +00:00
Chad Rosier
49d6fc02ef [arm-fast-isel] Add support for -arm-long-calls.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158368 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 19:25:13 +00:00
Jakob Stoklund Olesen
ad27086e66 Fix test that depends on register allocation.
The test is really checking the prolog/epilog load/store multiple
formation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158328 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 21:14:28 +00:00
Bill Wendling
ad5c880892 Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 08:07:26 +00:00
Jakob Stoklund Olesen
6660ed5f2f Don't run RAFast in the optimizing regalloc pipeline.
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.

Fast register allocation in the optimizing pipeline is better done by
RABasic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 23:15:12 +00:00
Joel Jones
e061053051 Revert commit r157966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157972 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-05 00:47:21 +00:00
Joel Jones
dd52bf2ed8 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157966 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 23:38:57 +00:00
Nadav Rotem
fcb2c3cf5e Remove the "-promote-elements" flag. This flag is now enabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 11:27:21 +00:00
Manman Ren
be4258adc5 ARM: add testing case for struct byval
rdar://9877866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157876 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 05:37:44 +00:00
Owen Anderson
a835f00702 Make this testcase independent of register allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 18:07:02 +00:00
Owen Anderson
f917d20561 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157708 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 18:54:50 +00:00
Owen Anderson
85ef6f4c99 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157707 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 18:50:39 +00:00
Chad Rosier
ada759d5fa [arm-fast-isel] Add support for the llvm.frameaddress() intrinsic.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 17:23:22 +00:00
Evan Cheng
eb25bd2356 Teach taildup to update livein set. rdar://11538365
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157663 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 00:42:39 +00:00
Chris Lattner
c32cef6aa1 These tests used intrinsics with the wrong prototype. They weren't caught because
the old verifier just checked that something "was a pointer", but not that the pointee
was correct.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-27 19:35:41 +00:00
Chad Rosier
1c8fccbc12 [arm-fast-isel] Add support for non-global callee.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-23 18:38:57 +00:00
Nuno Lopes
23e75da7e0 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157255 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen
6e6269a976 Transfer memory operands to the right instruction.
They need to go on the PICLDR as the verifier points out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157151 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:42 +00:00
Jim Grosbach
3e96531186 Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157062 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 19:12:01 +00:00
Tim Northover
44600d7081 Remove incorrect pattern for ARM SMML instruction.
Patch by Meador Inge.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-17 13:12:13 +00:00
Jakob Stoklund Olesen
83b3a29334 Enable sub-sub-register copy coalescing.
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:

  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]

We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156878 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 23:31:35 +00:00
Chad Rosier
226ddf5278 [fast-isel] Add support for selecting @llvm.trap().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156646 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 21:33:49 +00:00
Chad Rosier
2b3b335f2d [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156632 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 19:40:25 +00:00
Chad Rosier
2a2e9d54e9 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156628 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 18:51:55 +00:00
Manman Ren
247c5ab07c ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156599 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 01:30:47 +00:00
Manman Ren
fe65d98dad Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156556 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 18:49:43 +00:00
Manman Ren
8ae4f062e4 ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156550 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 16:48:21 +00:00
Nuno Lopes
30759542aa change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 15:52:43 +00:00
Owen Anderson
713e953118 Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 20:51:25 +00:00
Owen Anderson
062c0a5b58 Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 22:17:40 +00:00
Owen Anderson
363e4b90c0 Teach DAG combine that multiplication by 1.0 can always be constant folded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 21:32:35 +00:00
Bob Wilson
ff73d8fef9 Don't introduce illegal types when creating vmull operations. <rdar://11324364>
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 16:53:34 +00:00
Lang Hames
7787800481 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155724 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 18:51:24 +00:00
Evan Cheng
afb3b5ebe6 Implement a bastardized ABI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155686 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 02:11:10 +00:00