1045 Commits

Author SHA1 Message Date
Eli Bendersky
0f156af831 Add a special ARM trap encoding for NaCl.
More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html

Patch by JF Bastien



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173943 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 16:30:19 +00:00
Tim Northover
0adfdedacb Fix 64-bit atomic operations in Thumb mode.
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173780 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 09:06:13 +00:00
Evan Cheng
8688a58c53 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173755 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 02:32:37 +00:00
Silviu Baranga
4a9256f265 Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173437 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 10:39:49 +00:00
Chandler Carruth
aeef83c6af Switch TargetTransformInfo from an immutable analysis pass that requires
a TargetMachine to construct (and thus isn't always available), to an
analysis group that supports layered implementations much like
AliasAnalysis does. This is a pretty massive change, with a few parts
that I was unable to easily separate (sorry), so I'll walk through it.

The first step of this conversion was to make TargetTransformInfo an
analysis group, and to sink the nonce implementations in
ScalarTargetTransformInfo and VectorTargetTranformInfo into
a NoTargetTransformInfo pass. This allows other passes to add a hard
requirement on TTI, and assume they will always get at least on
implementation.

The TargetTransformInfo analysis group leverages the delegation chaining
trick that AliasAnalysis uses, where the base class for the analysis
group delegates to the previous analysis *pass*, allowing all but tho
NoFoo analysis passes to only implement the parts of the interfaces they
support. It also introduces a new trick where each pass in the group
retains a pointer to the top-most pass that has been initialized. This
allows passes to implement one API in terms of another API and benefit
when some other pass above them in the stack has more precise results
for the second API.

The second step of this conversion is to create a pass that implements
the TargetTransformInfo analysis using the target-independent
abstractions in the code generator. This replaces the
ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
lib/Target with a single pass in lib/CodeGen called
BasicTargetTransformInfo. This class actually provides most of the TTI
functionality, basing it upon the TargetLowering abstraction and other
information in the target independent code generator.

The third step of the conversion adds support to all TargetMachines to
register custom analysis passes. This allows building those passes with
access to TargetLowering or other target-specific classes, and it also
allows each target to customize the set of analysis passes desired in
the pass manager. The baseline LLVMTargetMachine implements this
interface to add the BasicTTI pass to the pass manager, and all of the
tools that want to support target-aware TTI passes call this routine on
whatever target machine they end up with to add the appropriate passes.

The fourth step of the conversion created target-specific TTI analysis
passes for the X86 and ARM backends. These passes contain the custom
logic that was previously in their extensions of the
ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
I separated them into their own file, as now all of the interface bits
are private and they just expose a function to create the pass itself.
Then I extended these target machines to set up a custom set of analysis
passes, first adding BasicTTI as a fallback, and then adding their
customized TTI implementations.

The fourth step required logic that was shared between the target
independent layer and the specific targets to move to a different
interface, as they no longer derive from each other. As a consequence,
a helper functions were added to TargetLowering representing the common
logic needed both in the target implementation and the codegen
implementation of the TTI pass. While technically this is the only
change that could have been committed separately, it would have been
a nightmare to extract.

The final step of the conversion was just to delete all the old
boilerplate. This got rid of the ScalarTargetTransformInfo and
VectorTargetTransformInfo classes, all of the support in all of the
targets for producing instances of them, and all of the support in the
tools for manually constructing a pass based around them.

Now that TTI is a relatively normal analysis group, two things become
straightforward. First, we can sink it into lib/Analysis which is a more
natural layer for it to live. Second, clients of this interface can
depend on it *always* being available which will simplify their code and
behavior. These (and other) simplifications will follow in subsequent
commits, this one is clearly big enough.

Finally, I'm very aware that much of the comments and documentation
needs to be updated. As soon as I had this working, and plausibly well
commented, I wanted to get it committed and in front of the build bots.
I'll be doing a few passes over documentation later if it sticks.

Commits to update DragonEgg and Clang will be made presently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171681 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07 01:37:14 +00:00
Chandler Carruth
0b8c9a80f2 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171366 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 11:36:10 +00:00
Bill Wendling
831737d329 Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171253 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-30 10:32:01 +00:00
Bob Wilson
103b4a571e Revert "Adding support for llvm.arm.neon.vaddl[su].* and"
This reverts r170694.  The operations can be represented in IR without
adding any new intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170765 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 21:09:38 +00:00
Renato Golin
332bd79951 Adding support for llvm.arm.neon.vaddl[su].* and
llvm.arm.neon.vsub[su].* intrinsics.

Patch by Pete Couperus <pjcoup@gmail.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170694 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 13:52:11 +00:00
Jakob Stoklund Olesen
37a942cd52 Remove the explicit MachineInstrBuilder(MI) constructor.
Use the version that also takes an MF reference instead.

It would technically be possible to extract an MF reference from the MI
as MI->getParent()->getParent(), but that would not work for MIs that
are not inserted into any basic block.

Given the reasonably small number of places this constructor was used at
all, I preferred the compile time check to a run time assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170588 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 21:31:56 +00:00
Patrik Hagglund
0340557fb8 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170532 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 11:30:36 +00:00
Bill Wendling
034b94b170 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170502 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 07:18:57 +00:00
Patrik Hagglund
a61b17c18a Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.

This is the second attempt. In the first attempt (r169837), a few
getSimpleVT() were hoisted too far, detected by bootstrap failures.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170104 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-13 06:34:11 +00:00
Evan Cheng
946a3a9f22 Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I
mention the inline memcpy / memset expansion code is a mess?

This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 02:34:41 +00:00
Evan Cheng
7d34267df6 - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 01:32:07 +00:00
Evan Cheng
61f4dfe369 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 00:42:09 +00:00
Evan Cheng
e07f85eb76 Replace TargetLowering::isIntImmLegal() with
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 23:26:14 +00:00
Patrik Hagglund
34525f9ac0 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 11:14:33 +00:00
Patrik Hagglund
bade0345d1 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 09:57:18 +00:00
Patrik Hagglund
8163ca76f0 Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169837 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 09:10:33 +00:00
Evan Cheng
6a1b5cc7c6 Stylistic tweak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169811 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 02:31:57 +00:00
Evan Cheng
376642ed62 Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 23:21:26 +00:00
Evan Cheng
2766a47310 Replace r169459 with something safer. Rather than having computeMaskedBits to
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).

rdar://12771555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169536 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 19:13:27 +00:00
Evan Cheng
8a7186dbc2 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 01:28:01 +00:00
Matt Beaumont-Gay
105ab4fe4b Appease GCC's -Wparentheses.
(TIL that Clang's -Wparentheses ignores 'x || y && "foo"' on purpose. Neat.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169337 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 23:54:02 +00:00
Evan Cheng
c8e7045c8a ARM custom lower ctpop for vector types. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 22:41:50 +00:00
Chandler Carruth
d04a8d4b33 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-03 16:50:05 +00:00
Sebastian Pop
cb4953089b Codegen failure for vmull with small vectors
Codegen was failing with an assertion because of unexpected vector
operands when legalizing the selection DAG for a MUL instruction.

The asserting code was legalizing multiplies for vectors of size 128
bits. It uses a custom lowering to try and detect cases where it can
use a VMULL instruction instead of a VMOVL + VMUL.  The code was
looking for input operands to the MUL that had been sign or zero
extended. If it found the extended operands it would drop the
sign/zero extension and use the original vector size as input to a
VMULL instruction.

The code assumed that the original input vector was 64 bits so that
after dropping the extension it would fit directly into a D register
and could be used as an operand of a VMULL instruction. The input
code that trigger the failure used a vector of <4 x i8> that was
sign extended to <4 x i32>. It was not safe to drop the sign
extension in this case because the original vector is only 32 bits
wide. The fix is to insert a sign extension for the vector to reach
the required 64 bit size. In this particular example, the vector would
need to be sign extented to a <4 x i16>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169024 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-30 19:08:04 +00:00
Silviu Baranga
35b3df6e31 Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168886 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-29 14:41:25 +00:00
Benjamin Kramer
350c00843b ARM: Implement CanLowerReturn so large vectors get expanded into sret.
Fixes 14337.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168809 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 20:55:10 +00:00
Eli Friedman
43147afd71 Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168240 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 01:52:46 +00:00
Weiming Zhao
e56764bad1 Remove hard coded registers in ARM ldrexd and strexd instructions
This patch replaces the hard coded GPR pair [R0, R1] of
Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with
even/odd GPRPair reg class.
Similar to the lowering of atomic_64 operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:55:34 +00:00
Anton Korobeynikov
b1a392e7c5 Make sure FABS on v2f32 and v4f32 is legal on ARM NEON
This fixes PR14359


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-16 21:15:20 +00:00
Eli Friedman
846ce8ea67 Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing
case to vector legalization so this actually works.

Patch by Pete Couperus.  Fixes PR12540.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168107 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 22:44:27 +00:00
Craig Topper
116bd168e1 Revert changing FNEG of v4f32 to Expand. It's legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168030 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 08:09:46 +00:00
Craig Topper
b916904e68 Make FNEG and FABS of v4f32 Expand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 08:06:12 +00:00
Craig Topper
490104720d Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168025 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-15 06:51:10 +00:00
Evan Cheng
b341fac05a Disable the Thumb no-return call optimization:
mov lr, pc
b.w _foo

The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.

rdar://12663632


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-10 02:09:05 +00:00
Chad Rosier
b3235b128f Revert r167620; this can be implemented using an existing CL option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167622 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-09 18:25:27 +00:00
Chad Rosier
d054eda441 Add support for -mstrict-align compiler option for ARM targets.
rdar://12340498


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167620 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-09 17:29:38 +00:00
Chad Rosier
e7bd51980a Mark the Int_eh_sjlj_dispatchsetup pseudo instruction as clobbering all
registers.  Previously, the register we being marked as implicitly defined, but
not killed.  In some cases this would cause the register scavenger to spill a
dead register.

Also, use an empty register mask to simplify the logic and to reduce the memory
footprint.
rdar://12592448

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167499 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-06 23:05:24 +00:00
Quentin Colombet
43934aee71 Vext Lowering was missing opportunities
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 21:32:17 +00:00
Quentin Colombet
9a419f656e Change ForceSizeOpt attribute into MinSize attribute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 16:32:52 +00:00
Quentin Colombet
80acd97266 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-27 01:10:17 +00:00
Stepan Dyatkovskiy
0d3c8d5d16 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy
b52ba9f8a8 Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 07:16:47 +00:00
Silviu Baranga
bb1078ea13 Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 09:41:32 +00:00
Manman Ren
e6c3cc8dc5 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 23:39:43 +00:00
Jim Grosbach
4346fa9437 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-12 22:59:21 +00:00
Stepan Dyatkovskiy
2c2cb3c09f Fix for LDRB instruction:
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.

7 ops is needed, but SDNode with only 6 is created.

In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.

The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165617 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 11:43:40 +00:00