Commit Graph

6663 Commits

Author SHA1 Message Date
Eric Christopher
2c55362848 No really, don't store anything to this since it's unconditionally
set below.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180015 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 14:11:25 +00:00
Eric Christopher
b929b13c77 Remove variable store that is never read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180014 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 13:51:44 +00:00
Stepan Dyatkovskiy
78e3c90419 Fix for 5.5 Parameter Passing --> Stage C:
-- C.4 and C.5 statements, when NSAA is not equal to SP.
 -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
    variadic procedure.

Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are
some exceptions in AAPCS.
1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated
   CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs.
2. Check that for VA functions all params uses GPRs and then stack.
   No exceptions, no CPRCs here.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180011 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22 13:06:52 +00:00
Jim Grosbach
0cb1019e9c Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 23:47:41 +00:00
Tim Northover
4cc1407b84 ARM: Use ldrd/strd to spill 64-bit pairs when available.
This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 11:57:07 +00:00
Tim Northover
335dd0d1a6 ARM: don't add FrameIndex offset for LDMIA (has no immediate)
Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 19:31:00 +00:00
Tim Northover
8b71994fde Remove unused ShouldFoldAtomicFences flag.
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179940 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 12:32:43 +00:00
Tim Northover
6265d5c91a Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179939 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 12:32:17 +00:00
Stephen Lin
456ca048af Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179925 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 05:14:40 +00:00
Stephen Lin
69394f2997 Test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179913 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20 00:47:48 +00:00
Eli Bendersky
75299e3a95 Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'm
trying to move as much FastISel logic as possible out of the main path in
SelectionDAGISel - intermixing them just adds confusion.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179902 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 22:29:18 +00:00
Michael Liao
2a8bea7a8e ArrayRefize getMachineNode(). No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179901 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 22:22:57 +00:00
Tim Northover
d3af696c08 ARM: Permit "sp" in ARM variant of STREXD instructions
Patch from Mihail Popa

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179854 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 15:44:32 +00:00
Tim Northover
4521019c6f ARM: permit "sp" in ARM variants of MOVW/MOVT instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179847 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 09:58:09 +00:00
Chad Rosier
88eb89b89f [asm parser] Add support for predicating MnemonicAlias based on the assembler
variant/dialect.  Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179804 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 22:35:36 +00:00
Hao Liu
d050e96133 Fix for PR14824, An ARM Load/Store Optimization bug
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179751 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 09:11:08 +00:00
Peter Collingbourne
df39be6cb4 Add support for subsections to the ELF assembler. Fixes PR8717.
Differential Revision: http://llvm-reviews.chandlerc.com/D598

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179725 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 21:18:16 +00:00
Quentin Colombet
7c4cf030a8 Fix treatment of ARM unallocated hint instructions.
The reference manual defines only 5 permitted values for the immediate field of the "hint" instruction:
1. nop (imm == 0)
2. yield (imm == 1)
3. wfe (imm == 2)
4. wfi (imm == 3)
5. sev (imm == 4)

Therefore, restrict the permitted values for the "hint" instruction to 0 through 4.

Patch by Mihail Popa <Mihail.Popa@arm.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179707 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 18:46:12 +00:00
Logan Chien
a363b117f4 Fix build failure introduced in 179591 when assertions are disabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179593 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 14:02:30 +00:00
Logan Chien
532854d7ab Implement ARM unwind opcode assembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179591 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 12:02:21 +00:00
Jim Grosbach
d0132ba722 ARM: Add VACLT and VACLE assembly aliases.
These are aliases for VACGT and VACGE, respectively, with the source
operands reversed.

rdar://13638090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179575 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 22:42:50 +00:00
Quentin Colombet
d64ee4455a ARM: Correct printing of pre-indexed operands.
According to the ARM reference manual, constant offsets are mandatory for pre-indexed addressing modes.
The MC disassembler was not obeying this when the offset is 0.
It was producing instructions like: str r0, [r1]!.
Correct syntax is: str r0, [r1, #0]!.

This change modifies the dumping of operands so that the offset is always printed, regardless of its value, when pre-indexed addressing mode is used.

Patch by Mihail Popa <Mihail.Popa@arm.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179398 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12 18:47:25 +00:00
Tim Northover
8c9e52a9fc ARM: Make "SMC" instructions conditional on new TrustZone architecture feature.
These instructions aren't universally available, but depend on a specific
extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new
feature is appropriate.

This also enables the feature by default on A-class cores which usually have
these extensions, to avoid breaking existing code and act as a sensible
default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179171 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 12:08:35 +00:00
Benjamin Kramer
7fba6cd3d0 ARM: Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179001 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-08 08:07:35 +00:00
Renato Golin
84581daf20 Reverting 178851 as it broke buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178883 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 16:39:53 +00:00
Stepan Dyatkovskiy
992347f271 Buildbot fix for r178851: mistake was in wrong TargetRegisterInfo::getRegClass usage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178854 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 07:34:08 +00:00
Stepan Dyatkovskiy
89becbb974 Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position".
Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass.
It is kind of deep improvement of fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4
But it also tracks conflicts between different register classes (e.g. D2 and S5).
For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion: 
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178851 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 05:52:14 +00:00
Arnold Schwaighofer
fc61e635fd ARM scheduler model: Add scheduler info to more instructions and resource
descriptions for compares

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178844 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 05:01:06 +00:00
Arnold Schwaighofer
08da486557 ARM scheduler model: Swift has varying latencies, uops for simple ALU ops
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178842 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 04:42:00 +00:00
Jakob Stoklund Olesen
ee27cac9fa Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178773 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 18:25:36 +00:00
Arnold Schwaighofer
55097ff567 ARM Scheduler Model: Add resources instructions, map resources in subtargets
Reapply r177968:
After commit 178074 we can now have undefined scheduler variants.

Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

Incooperate Andrew's feedback.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178460 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 13:07:05 +00:00
Benjamin Kramer
74a4533a42 Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178349 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-29 17:14:24 +00:00
Gordon Keiser
ce88835110 Fix issue with disassembler decoding CBZ/CBNZ immediates as negatives when the upper bit is set.
They should always be zero-extended, not sign extended.  Added test case.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178275 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28 19:22:28 +00:00
Gordon Keiser
93b10789c6 Testing commit access to llvm. Remove two lines of whitespace from the Thumb README.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178256 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28 18:26:15 +00:00
Silviu Baranga
a210db781f Enabling the generation of dependency breakers for partial updates on Cortex-A15. Also fixing a small bug in getting the update clearence for VLD1LNd32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178134 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-27 12:38:44 +00:00
Arnold Schwaighofer
afaeb8152c Revert ARM Scheduler Model: Add resources instructions, map resources
This reverts commit r177968. It is causing failures in a local build bot.

"fatal error: error in backend: Expected a variant SchedClass"

Original commit message:
Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178028 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-26 15:14:04 +00:00
Joe Abbey
b78821d380 Patch by Gordon Keiser!
If PC or SP is the destination, the disassembler erroneously failed with the
invalid encoding, despite the manual saying that both are fine.

This patch addresses failure to decode encoding T4 of LDR (A8.8.62) which is a
postindexed load, where the offset 0xc is applied to SP after the load occurs.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178017 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-26 13:58:53 +00:00
Arnold Schwaighofer
a5dbe29ff5 ARM Scheduler Model: Add resources instructions, map resources in subtargets
Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177968 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-26 02:01:42 +00:00
Arnold Schwaighofer
1b618f8848 ARM Scheduler Model: Partial implementation of the new machine scheduler model
This is very much work in progress. Please send me a note if you start to depend
on the added abstract read/write resources. They are subject to change until
further notice.

The old itinerary is still the default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177967 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-26 02:01:39 +00:00
Chad Rosier
c82fba27fe [arm load/store optimizer] When trying to merge a base update load/store, make
sure the base register and would-be writeback register don't conflict for
stores.  This was already being done for loads.

Unfortunately, it is rather difficult to create a test case for this issue.  It
was exposed in 450.soplex at LTO and requires unlucky register allocation.
<rdar://13394908>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177874 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-25 16:29:20 +00:00
Hal Finkel
dc3beb9017 Allow the register scavenger to spill multiple registers
This patch lets the register scavenger make use of multiple spill slots in
order to guarantee that it will be able to provide multiple registers
simultaneously.

To support this, the RS's API has changed slightly: setScavengingFrameIndex /
getScavengingFrameIndex have been replaced by addScavengingFrameIndex /
isScavengingFrameIndex / getScavengingFrameIndices.

In forthcoming commits, the PowerPC backend will use this capability in order
to implement the spilling of condition registers, and some special-purpose
registers, without relying on r0 being reserved. In some cases, spilling these
registers requires two GPRs: one for addressing and one to hold the value being
transferred.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177774 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-22 23:32:27 +00:00
Renato Golin
3382a84074 Avoid NEON SP-FP unless unsafe-math or Darwin
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision
floating point operations with NEON unless unsafe-math is turned on. The
equivalent VFP instructions are IEEE 754 compliant, but in some cores they're
much slower, so some archs/OSs might still request it to be on by default,
such as Swift and Darwin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177651 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21 18:47:47 +00:00
Chad Rosier
580f9c85fd Fix pr13145 - Naming a function like a register name confuses the asm parser.
Patch by Stepan Dyatkovskiy <stpworld@narod.ru>
rdar://13457826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177463 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 23:44:03 +00:00
Renato Golin
5ad5f5931e Improve long vector sext/zext lowering on ARM
The ARM backend currently has poor codegen for long sext/zext
operations, such as v8i8 -> v8i32. This patch addresses this
by performing a custom expansion in ARMISelLowering. It also
adds/changes the cost of such lowering in ARMTTI.

This partially addresses PR14867.

Patch by Pete Couperus

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177380 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-19 08:15:38 +00:00
Arnold Schwaighofer
bf37bf9e21 ARM cost model: Make some vector integer to float casts cheaper
The default logic marks them as too expensive.

For example, before this patch we estimated:
  cost of 16 for instruction:   %r = uitofp <4 x i16> %v0 to <4 x float>

While this translates to:
  vmovl.u16 q8, d16
  vcvt.f32.u32  q8, q8

All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13445992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177334 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-18 22:47:09 +00:00
Arnold Schwaighofer
01f2571014 ARM cost model: Correct cost for some cheap float to integer conversions
Fix cost of some "cheap" cast instructions. Before this patch we used to
estimate for example:
  cost of 16 for instruction:   %r = fptoui <4 x float> %v0 to <4 x i16>

While we would emit:
  vcvt.s32.f32  q8, q8
  vmovn.i32 d16, q8
  vuzp.8  d16, d17

All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13434072

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177333 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-18 22:47:06 +00:00
Arnold Schwaighofer
5193e4ebe2 ARM cost model: Fix costs for some vector selects
I was too pessimistic in r177105. Vector selects that fit into a legal register
type lower just fine. I was mislead by the code fragment that I was using. The
stores/loads that I saw in those cases came from lowering the conditional off
an address.

Changing the code fragment to:

%T0_3 = type <8 x i18>
%T1_3 = type <8 x i1>

define void @func_blend3(%T0_3* %loadaddr, %T0_3* %loadaddr2,
                         %T1_3* %blend, %T0_3* %storeaddr) {
  %v0 = load %T0_3* %loadaddr
  %v1 = load %T0_3* %loadaddr2
==> FROM:
  ;%c = load %T1_3* %blend
==> TO:
  %c = icmp slt %T0_3 %v0, %v1
==> USE:
  %r = select %T1_3 %c, %T0_3 %v0, %T0_3 %v1

  store %T0_3 %r, %T0_3* %storeaddr
  ret void
}

revealed this mistake.

radar://13403975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177170 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-15 18:31:01 +00:00
Silviu Baranga
bcbf3fddef Adding an A15 specific optimization pass for interactions between S/D/Q registers. The pass handles all the required transformations pre-regalloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177169 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-15 18:28:25 +00:00
Benjamin Kramer
133c0d36e1 ARM: Fix an old refacto.
Fixes PR15520.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177167 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-15 17:27:39 +00:00
Arnold Schwaighofer
c0d8dc0eb6 ARM cost model: Fix cost of fptrunc and fpext instructions
A vector fptrunc and fpext simply gets split into scalar instructions.

radar://13192358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177159 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-15 15:10:47 +00:00