Commit Graph

420 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
6d5b7cc235 Revert r146997, "Heed spill slot alignment on ARM."
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.

<rdar://problem/10625436>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147487 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-03 22:34:35 +00:00
Jakob Stoklund Olesen
f06f6f50e9 Experimental support for aligned NEON spills.
ARM targets with NEON units have access to aligned vector loads and
stores that are potentially faster than unaligned operations.

Add support for spilling the callee-saved NEON registers to an aligned
stack area using 16-byte aligned NEON loads and store.

This feature is off by default, controlled by an -align-neon-spills
command line option.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147211 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-23 00:36:18 +00:00
Jakob Stoklund Olesen
52346e964f Heed spill slot alignment on ARM.
Use the spill slot alignment as well as the local variable alignment to
determine when the stack needs to be realigned. This works now that the
ARM target can always realign the stack by using a base pointer.

Still respect the ARMBaseRegisterInfo::canRealignStack() function
vetoing a realigned stack.  Don't use aligned spill code in that case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146997 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-20 22:15:04 +00:00
Evan Cheng
afff941211 ARM target code clean up. Check for iOS, not Darwin where it makes sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146981 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-20 18:26:50 +00:00
Evan Cheng
b16db81719 Fix a CPSR liveness tracking bug introduced when I converted IT block to bundle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146805 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-17 01:25:34 +00:00
Jakob Stoklund Olesen
b076fb7762 Fix off-by-one error in bucket sort.
The bad sorting caused a misaligned basic block when building 176.vpr in
ARM mode.

<rdar://problem/10594653>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146767 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-16 23:00:05 +00:00
Evan Cheng
ddfd1377d2 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146542 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-14 02:11:42 +00:00
Chandler Carruth
ddbc274169 Manually upgrade the test suite to specify the flag to cttz and ctlz.
I followed three heuristics for deciding whether to set 'true' or
'false':

- Everything target independent got 'true' as that is the expected
  common output of the GCC builtins.
- If the target arch only has one way of implementing this operation,
  set the flag in the way that exercises the most of codegen. For most
  architectures this is also the likely path from a GCC builtin, with
  'true' being set. It will (eventually) require lowering away that
  difference, and then lowering to the architecture's operation.
- Otherwise, set the flag differently dependending on which target
  operation should be tested.

Let me know if anyone has any issue with this pattern or would like
specific tests of another form. This should allow the x86 codegen to
just iteratively improve as I teach the backend how to differentiate
between the two forms, and everything else should remain exactly the
same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146370 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-12 11:59:10 +00:00
Owen Anderson
4a4fdf3476 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146171 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-08 19:32:14 +00:00
Chris Lattner
d2bf432b2b Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145171 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-27 06:54:59 +00:00
Evan Cheng
eaa192af18 Add vmov.f32 to materialize f32 immediate splats which cannot be handled by
integer variants. rdar://10437054


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144608 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 02:12:34 +00:00
Jim Grosbach
ffc658b056 ARM VLDR/VSTR instructions don't need a size suffix.
Canonicallize on the non-suffixed form, but continue to accept assembly that
has any correctly sized type suffix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144583 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 23:03:21 +00:00
Jakob Stoklund Olesen
7f67091259 Linear scan is going away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144472 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:34 +00:00
Jakob Stoklund Olesen
097d277ef0 Switch a few tests off linearscan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144460 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 19:53:52 +00:00
Jim Grosbach
6f09fcf5da ARM Darwin default relocation model is PIC.
This matches clang, so default options in llc and friends are now closer to
clang's defaults.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140863 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-30 17:41:35 +00:00
Eli Friedman
139e6699c4 Last batch of test conversions to new atomic instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140585 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-27 00:17:29 +00:00
Eli Friedman
47d3ee559a Convert more tests to new atomic instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140567 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-26 21:36:10 +00:00
Andrew Trick
3c8015aa6c Generalize this test's CHECK statements to handle different indvars modes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139577 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 02:46:27 +00:00
Evan Cheng
342e3161d9 Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical
register dependency (rather than glue them together). This is general
goodness as it gives scheduler more freedom. However it is motivated by
a nasty bug in isel.

When a i64 sub is expanded to subc + sube.
  libcall #1
     \
      \        subc 
       \       /  \
        \     /    \
         \   /    libcall #2
          sube

If the libcalls are not serialized (i.e. both have chains which are dag
entry), legalizer can serialize them in arbitrary orders. If it's
unlucky, it can force libcall #2 before libcall #1 in the above case.

  subc
   |
  libcall #2
   |
  libcall #1
   |
  sube

However since subc and sube are "glued" together, this ends up being a
cycle when the scheduler combine subc and sube as a single scheduling
unit.

The right solution is to fix LegalizeType too chains the libcalls together.
However, LegalizeType is not processing nodes in order so that's harder than
it should be. For now, the move to physical register dependency will do.

rdar://10019576


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138791 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-30 01:34:54 +00:00
Jim Grosbach
7a32fa1c78 Update tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138116 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-19 22:19:48 +00:00
Jim Grosbach
93b3eff623 Thumb assembly parsing and encoding for LDM instruction.
Fix base register type and canonicallize to the "ldm" spelling rather than
"ldmia." Add diagnostics for incorrect writeback token and out-of-range
registers.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137986 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-18 21:50:53 +00:00
Eli Friedman
2cb1dfa446 Fix up the patterns for SXTB, SXTH, UXTB, and UXTH so that they are correctly active without HasT2ExtractPack. PR10611.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137061 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-08 19:49:37 +00:00
Jakub Staszak
990f78d53b Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136826 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-03 22:34:43 +00:00
Evan Cheng
439661395f Introduce MCCodeGenInfo, which keeps information that can affect codegen
(including compilation, assembly). Move relocation model Reloc::Model from
TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135468 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-19 06:37:02 +00:00
Evan Cheng
e721f5c8d3 Improve codegen for select's:
if (x != 0) x = 1
if (x == 1) x = 1

Previous codegen looks like this:
        mov     r1, r0
        cmp     r1, #1
        mov     r0, #0
        moveq   r0, #1

The naive lowering select between two different values. It should recognize the
test is equality test so it's more a conditional move rather than a select:
        cmp     r0, #1
        movne   r0, #0

rdar://9758317


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135017 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 00:42:17 +00:00
Jim Grosbach
92bf81ddd0 Improve test cases from r134746.
Use memory barriers to force if-conversion off for these tests instead of
the internal llc command line option ifcvt-limit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134986 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-12 16:06:01 +00:00
Jim Grosbach
25e6d48220 Make tBX_RET and tBX_RET_vararg predicable.
The normal tBX instruction is predicable, so there's no reason the
pseudos for using it as a return shouldn't be. Gives us some nice code-gen
improvements as can be seen by the test changes. In particular, several
tests now have to disable if-conversion because it works too well and defeats
the test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134746 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-08 21:50:04 +00:00
Jakob Stoklund Olesen
2aa6b4c142 Fix more register allocation sensitive tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134667 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-08 00:24:06 +00:00
Evan Cheng
39dfb0ff84 Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134590 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-07 03:55:05 +00:00
Chandler Carruth
78e4fcecef FileCheck-ize another test. Reduces the llc invocations from 8 to 1, and
makes one of the tests actually mean something (as the string 'add' will
always appear in the output of this file).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134358 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-02 21:34:52 +00:00
Jim Grosbach
a7603982db ARMv7M vs. ARMv7E-M support.
The DSP instructions in the Thumb2 instruction set are an optional extension
in the Cortex-M* archtitecture. When present, the implementation is considered
an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."

Add a subtarget feature hook for the v7e-m instructions and hook it up. The
cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is
a v7e-m implementation.

rdar://9572992



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134261 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-01 21:12:19 +00:00
Jim Grosbach
63b46faeb8 Thumb1 register to register MOV instruction is predicable.
Fix a FIXME and allow predication (in Thumb2) for the T1 register to
register MOV instructions. This allows some better codegen with
if-conversion (as seen in the test updates), plus it lays the groundwork
for pseudo-izing the tMOVCC instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134197 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-30 22:10:46 +00:00
Jim Grosbach
16f9924000 Pseudo-ize the t2LDMIA_RET instruction.
It's just a t2LDMIA_UPD instruction with extra codegen properties, so it
doesn't need the encoding information. As a side-benefit, we now correctly
recognize for instruction printing as a 'pop' instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134173 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-30 18:25:42 +00:00
Benjamin Kramer
8981bce73f Don't depend on the optimization reverted in r134067.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134068 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-29 14:07:18 +00:00
Chris Lattner
a53616d08b Remove support for parsing the "type i32" syntax for defining a numbered
top level type without a specified number.  This syntax isn't documented
and blocks forward progress.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133371 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-19 00:03:46 +00:00
Chris Lattner
b85e4eba85 rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is
for pre-2.9 bitcode files.  We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.

As usual, updating the testsuite is a PITA.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133337 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-18 06:05:24 +00:00
Jakob Stoklund Olesen
0a074ed3ef Switch ARM to using AltOrders instead of MethodBodies.
This slightly changes the GPR allocation order on Darwin where R9 is not
a callee-saved register:

Before: %R0 %R1 %R2 %R3 %R12 %R9 %LR %R4 %R5 %R6 %R8 %R10 %R11
After:  %R0 %R1 %R2 %R3 %R9 %R12 %LR %R4 %R5 %R6 %R8 %R10 %R11

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133326 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-18 01:14:46 +00:00
Chris Lattner
26b0000166 manually upgrade a bunch of tests to modern syntax, and remove some that
are either unreduced or only test old syntax.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133228 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-17 03:14:27 +00:00
Rafael Espindola
f5b5c5156c Implement Jakob's suggestion on how to detect fall thought without calling
AnalyzeBranch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132981 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-14 06:08:32 +00:00
Rafael Espindola
4509ec42b8 AnalyzeBranch doesn't change which successors a bb has, just the order
we try to branch to them.

Before we were creating successor lists with duplicated entries. Fixing that
found a bug in isBlockOnlyReachableByFallthrough that would causes it to
return the wrong answer for

-----------
...
jne foo
jmp bar

foo:
----------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132882 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-12 03:20:32 +00:00
Cameron Zwarich
aaa5f14d7c Fix an issue where the two-address conversion pass incorrectly rewrites untied
operands to an early clobber register. This fixes <rdar://problem/9566076>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132738 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-07 23:54:00 +00:00
Jakob Stoklund Olesen
5f2316a3b5 Switch AllocationOrder to using RegisterClassInfo instead of a BitVector
of reserved registers.

Use RegisterClassInfo in RABasic as well. This slightly changes som
allocation orders because RegisterClassInfo puts CSR aliases last.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132581 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-03 20:34:53 +00:00
Stuart Hastings
88882247d2 Since I can't reproduce the failures from 131261, re-trying with a
simplified version.  <rdar://problem/9298790>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131274 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-13 00:51:54 +00:00
Stuart Hastings
8ad145d729 Revert 131266 and 131261 due to buildbot complaints.
rdar://problem/9298790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131269 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-13 00:15:17 +00:00
Stuart Hastings
4c576ca9db Tweak 131261 (thumb2-cbnz.ll) to generate the intended cbnz.
rdar://problem/9298790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131266 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-13 00:10:03 +00:00
Stuart Hastings
5adc646380 Non-fast-isel followup to 129634; correctly handle branches controlled
by non-CMP expressions.  The executable test case (129821) would test
this as well, if we had an "-O0 -disable-arm-fast-isel" LLVM-GCC
tester.  Alas, the ARM assembly would be very difficult to check with
FileCheck.

The thumb2-cbnz.ll test is affected; it generates larger code (tst.w
vs. cmp #0), but I believe the new version is correct.
rdar://problem/9298790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131261 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-12 23:36:41 +00:00
Eli Friedman
5e926ac651 Re-revert r130877; it's apparently causing a regression on 197.parser,
possibly related to cbnz formation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130977 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-06 05:23:07 +00:00
Eli Friedman
baf717a08a Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases.
Original message:

Teach MachineCSE how to do simple cross-block CSE involving physregs.  This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130877 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 22:10:36 +00:00
Eli Friedman
24d4c9911e Back out r130862; it appears to be breaking bootstrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130867 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 20:48:42 +00:00
Eli Friedman
49cec1d818 Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130862 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 19:54:24 +00:00
Jakob Stoklund Olesen
96169b189c Fix more register and coalescing dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130859 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 19:02:11 +00:00
Jakob Stoklund Olesen
28e104bcb0 Explicitly request physreg coalesing for a bunch of Thumb2 unit tests.
These tests all follow the same pattern:

	mov	r2, r0
	movs	r0, #0
	$CMP	r2, r1
	it	eq
	moveq	r0, #1
	bx	lr

The first 'mov' can be eliminated by rematerializing 'movs r0, #0' below the
test instruction:

	$CMP	r0, r1
	mov.w	r0, #0
	it	eq
	moveq	r0, #1
	bx	lr

So far, only physreg coalescing can do that. The register allocators won't yet
split live ranges just to eliminate copies. They can learn, but this particular
problem is not likely to show up in real code. It only appears because r0 is
used for both the function argument and return value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130858 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-04 19:02:07 +00:00
Jakob Stoklund Olesen
d5b679c8ce Weekly fix of register allocation dependent unit tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130567 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-30 01:37:52 +00:00
Andrew Trick
d49ffe8284 Teach Thumb2 isel to fold and->rotr ==> ROR.
Generalization of Nate Begeman's patch!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130502 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-29 14:18:15 +00:00
Andrew Trick
e02a1501e7 Combine thumb2-ror tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-29 14:02:41 +00:00
Evan Cheng
554daa67bd Be careful about scheduling nodes above previous calls. It increase usages of
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessen register pressure.

Call operands also has added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.

rdar://9329627


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130245 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-26 21:31:35 +00:00
Benjamin Kramer
a42a757176 Make tests more useful.
lit needs a linter ...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130126 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-25 10:12:01 +00:00
Andrew Trick
83eb906906 Accidental function name mangling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130050 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 04:08:15 +00:00
Andrew Trick
1c3af779fc Thumb2 and ARM add/subtract with carry fixes.
Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the
assembly printer correctly prints the 's' suffix.

Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.

Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
Fixes ARM SBC lowering to check for live carry (potential bug).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 03:55:32 +00:00
Andrew Trick
5adfba283d whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130046 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-23 03:24:11 +00:00
Evan Cheng
db6cbe1ff1 In Thumb2 mode, lower frame indix references to:
add <rd>, sp, #<imm8>
ldr <rd>, [sp, #<imm8>]
When the offset from sp is multiple of 4 and in range of 0-1020.
This saves code size by utilizing 16-bit instructions.

rdar://9321541


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129971 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-22 01:42:52 +00:00
Andrew Trick
87896d9368 Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.

Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129421 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 00:38:32 +00:00
Chris Lattner
b99e000d79 fix two completely broken tests, which were matching due to PR9629.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129195 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-09 06:34:38 +00:00
Jakob Stoklund Olesen
c3178f85b5 Fix Thumb and Thumb2 tests to be register allocator independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128690 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 23:31:50 +00:00
Eric Christopher
29aeed1bf8 Fix the bfi handling for or (and a mask) (and b mask). We need the two
masks to match inversely for the code as is to work. For the example given
we actually want:

bfi r0, r2, #1, #1

not #0, however, given the way the pattern is written it's not possible
at the moment.

Fixes rdar://9177502


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-26 01:21:03 +00:00
Cameron Zwarich
899eaa3569 Roll r127459 back in:
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 21:52:04 +00:00
Daniel Dunbar
950d3db5f4 Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get
created from the", it broke some GCC test suite tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127477 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 19:30:30 +00:00
Cameron Zwarich
592ca3fda9 Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.

This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127459 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 04:54:27 +00:00
Bob Wilson
782b576749 Move a test that ended up in the wrong place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124933 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-05 04:15:50 +00:00
Evan Cheng
53519f015e Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123991 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21 18:55:51 +00:00
Andrew Trick
d1dace8aea Enable support for precise scheduling of the instruction selection
DAG. Disable using "-disable-sched-cycles".

For ARM, this enables a framework for modeling the cpu pipeline and
counting stalls. It also activates several heuristics to drive
scheduling based on the model. Scheduling is inherently imprecise at
this stage, and until spilling is improved it may defeat attempts to
schedule. However, this framework provides greater control over
tuning codegen.

Although the flag is not target-specific, it should have very little
affect on the default scheduler used by x86. The only two changes that
affect x86 are:
- scheduling a high-latency operation bumps the current cycle so independent
  operations can have their latency covered. i.e. two independent 4
  cycle operations can produce results in 4 cycles, not 8 cycles.
- Two operations with equal register pressure impact and no
  latency-based stalls on their uses will be prioritized by depth before height
  (height is irrelevant if no stalls occur in the schedule below this point).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123971 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21 06:19:05 +00:00
Bob Wilson
5e8b833707 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07 04:59:04 +00:00
Bob Wilson
4711d5cda3 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121730 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 23:02:37 +00:00
Evan Cheng
a9688c4b57 (or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121606 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-11 04:11:38 +00:00
Jim Grosbach
c6f9261711 ARM stm/ldm instructions require more than one register in the register list.
Otherwise, a plain str/ldr should be used instead. Make sure we account for
that in prologue/epilogue code generation.
rdar://8745460

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121391 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-09 18:31:13 +00:00
Bob Wilson
c24130bade The Thumb tADDrSPi instruction is not valid when the destination is SP.
Check for that and try narrowing it to tADDspi instead.  Radar 8724703.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120892 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-04 04:40:19 +00:00
Jim Grosbach
41ad0c4c73 When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the
32-bit wide version by adding the .w suffix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120838 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-03 20:33:01 +00:00
Owen Anderson
9d63d90de5 Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120589 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 19:18:46 +00:00
Evan Cheng
ab5c703fdb Fix epilogue codegen to avoid leaving the stack pointer in an invalid
state. Previously Thumb2 would restore sp from fp like this:
mov sp, r7
sub, sp, #4
If an interrupt is taken after the 'mov' but before the 'sub', callee-saved
registers might be clobbered by the interrupt handler. Instead, try
restoring directly from sp:
add sp, #4
Or, if necessary (with VLA, etc.) use a scratch register to compute sp and
then restore it:
sub.w r4, r7, #8
mov sp, r7
rdar://8465407


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119977 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-22 18:12:04 +00:00
Eric Christopher
8b3ca6216d Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 19:40:05 +00:00
Dale Johannesen
8abe08d7f9 These tests are looking for library function names that
appear to differ on Linux.  Try to make them pass on Linux.
Would be good for a Linux person to review this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119572 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 21:57:32 +00:00
Evan Cheng
c4af4638df Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 20:13:28 +00:00
Evan Cheng
8239daf7c8 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-03 00:45:17 +00:00
Jim Grosbach
ab3d00e535 Revert r114340 (improvements in Darwin function prologue/epilogue), as it broke
assumptions about stack layout. Specifically, LR must be saved next to FP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118026 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-02 17:35:25 +00:00
Bob Wilson
f74a429816 Overhaul memory barriers in the ARM backend. Radar 8601999.
There were a number of issues to fix up here:
* The "device" argument of the llvm.memory.barrier intrinsic should be
used to distinguish the "Full System" domain from the "Inner Shareable"
domain.  It has nothing to do with using DMB vs. DSB instructions.
* The compiler should never need to emit DSB instructions.  Remove the
ARMISD::SYNCBARRIER node and also remove the instruction patterns for DSB.
* Merge the separate DMB/DSB instructions for options only used for the
disassembler with the default DMB/DSB instructions.  Add the default
"full system" option ARM_MB::SY to the ARM_MB::MemBOpt enum.
* Add a separate ARMISD::MEMBARRIER_MCR node for subtargets that implement
a data memory barrier using the MCR instruction.
* Fix up encodings for these instructions (except MCR).
I also updated the tests and added a few new ones to check for DMB options
that were not currently being exercised.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117756 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-30 00:54:37 +00:00
Evan Cheng
089751535d Avoiding overly aggressive latency scheduling. If the two nodes share an
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117675 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-29 18:09:28 +00:00
Evan Cheng
134982daa9 More accurate estimate / tracking of register pressure.
- Initial register pressure in the loop should be all the live defs into the
  loop. Not just those from loop preheader which is often empty.
- When an instruction is hoisted, update register pressure from loop preheader
  to the original BB.
- Treat only use of a virtual register as kill since the code is still SSA.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116956 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-20 22:03:58 +00:00
Dale Johannesen
e4d31593c5 Fix crash introduced in 116852. 8573915.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116955 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-20 22:03:37 +00:00
Dale Johannesen
575cd148ce Enable using vdup for vector constants which are splat of
integers by default, and remove the controlling flag, now
that LICM will hoist such vdup's.  8003375.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116852 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 20:00:17 +00:00
Evan Cheng
2312842de0 Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116845 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 18:58:51 +00:00
Daniel Dunbar
9869413802 Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116816 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 17:14:24 +00:00
Evan Cheng
11e8b74a7a - Add a hook for target to determine whether an instruction def is
"long latency" enough to hoist even if it may increase spilling. Reloading
  a value from spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 00:55:07 +00:00
Bob Wilson
7d24705f65 Change register allocation order for ARM VFP and NEON registers to put the
callee-saved registers at the end of the lists.  Also prefer to avoid using
the low registers that are in register subclasses required by certain
instructions, so that those registers will more likely be available when needed.
This change makes a huge improvement in spilling in some cases.  Thanks to
Jakob for helping me realize the problem.

Most of this patch is fixing the testsuite.  There are quite a few places
where we're checking for specific registers.  I changed those to wildcards
in places where that doesn't weaken the tests.  The spill-q.ll and
thumb2-spill-q.ll tests stopped spilling with this change, so I added a bunch
of live values to force spills on those tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116055 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-08 06:15:13 +00:00
Owen Anderson
8614167572 Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that this makes
irrelevant, but add a new test for the new, improved functionality.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114494 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 22:51:46 +00:00
Jim Grosbach
1dc335a79f Simplify ARM callee-saved register handling by removing the distinction
between the high and low registers for prologue/epilogue code. This was
a Darwin-only thing that wasn't providing a realistic benefit anymore.
Combining the save areas simplifies the compiler code and results in better
ARM/Thumb2 codegen.

For example, previously we would generate code like:
        push    {r4, r5, r6, r7, lr}
        add     r7, sp, #12
        stmdb   sp!, {r8, r10, r11}
With this change, we combine the register saves and generate:
        push    {r4, r5, r6, r7, r8, r10, r11, lr}
        add     r7, sp, #12

rdar://8445635



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114340 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-20 19:32:20 +00:00
Jim Grosbach
e6be85e9ff Teach the (non-MC) instruction printer to use the cannonical names for push/pop,
and shift instructions on ARM. Update the tests to match.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114230 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-17 22:36:38 +00:00
Jim Grosbach
1aaf4cb393 Move thumb2 tests to the thumb2 directory
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114206 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-17 20:34:09 +00:00
Evan Cheng
3ef1c8759a Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 01:29:16 +00:00
Bob Wilson
0f1e9457a5 Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from
the VST pseudos.  The VLD/VST scheduling still needs work (see pr6722), but
at least we shouldn't confuse the loads with the stores.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113473 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-09 05:40:26 +00:00
Jim Grosbach
65482b1bb8 Re-apply r112883:
"For ARM stack frames that utilize variable sized objects and have either
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs."

r112986 fixed a latent bug exposed by the above.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112989 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-03 18:37:12 +00:00