6591 Commits

Author SHA1 Message Date
Evan Cheng
14b4c03580 Add more fused mul+add/sub patterns. rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154484 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 06:59:47 +00:00
Nadav Rotem
e611378a6e Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 06:40:27 +00:00
Evan Cheng
92c904539a Match (fneg (fma)) to vfnma. rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154469 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 01:21:25 +00:00
Evan Cheng
a0908d0a44 Merge fma.ll into fusedMAC.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154466 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 01:03:11 +00:00
Jakob Stoklund Olesen
89cdaf46ec Fix test to be register assignment invariant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154453 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 00:00:24 +00:00
Owen Anderson
06886aaaeb Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point.
Zap a testcase that this allows us to completely fold away.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154447 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 22:46:53 +00:00
Evan Cheng
3aef2ff514 Handle llvm.fma.* intrinsics. rdar://10914096
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154439 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 21:40:28 +00:00
Duncan Sands
507bb7a42f Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154431 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 20:35:27 +00:00
Eric Christopher
a139051654 Temporarily revert this patch to see if it brings the buildbots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 19:33:16 +00:00
Eric Christopher
18112d83e7 To ensure that we have more accurate line information for a block
don't elide the branch instruction if it's the only one in the block,
otherwise it's ok.

PR9796 and rdar://11215207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154417 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 18:18:10 +00:00
Nadav Rotem
50e64cfe6e Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 14:33:13 +00:00
Anton Korobeynikov
999821cddf Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (worked around).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154394 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 13:22:49 +00:00
Evan Cheng
fa12d0df5a Add proper checks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154379 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 03:15:42 +00:00
Evan Cheng
bf010eb911 Fix a long standing tail call optimization bug. When a libcall is emitted,
the legalizer always uses the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154370 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 01:51:00 +00:00
Rafael Espindola
fdb230a154 Don't try to zExt just to check if an integer constant is zero; it might
not fit in an i64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154364 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 00:16:22 +00:00
Lang Hames
23f369d1fe Test case for PR12495.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154359 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 23:58:59 +00:00
Akira Hatanaka
787c3fd385 Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154341 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 20:32:12 +00:00
Chad Rosier
7f35455708 When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather than a
series of scalar stores.

For func_4_8 the generated code

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr

becomes

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr

I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.

This

	ldrh	r0, [r0, #4]
	strh	r0, [r1]

becomes

	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]

PR11158
rdar://10703339


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154340 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 20:32:02 +00:00
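The in-register rearrangement described in 7f35455708 can be illustrated outside the compiler. This is a minimal Python sketch, not the DAG combine itself, and the function names are hypothetical. It assumes little-endian lane layout and models the byte unzip (the role `vuzp.8` plays in the listing above) as keeping the even-indexed bytes of the packed lanes.

```python
import struct


def truncate_lanes_scalar(lanes):
    """One narrow store per lane: keep the low 8 bits of each 16-bit lane."""
    return bytes(lane & 0xFF for lane in lanes)


def truncate_lanes_unzip(lanes):
    """Pack the 16-bit lanes little-endian, then keep the even-indexed bytes.

    The even-indexed bytes are exactly the low bytes of each lane, so the
    result can be written out with a single wider store.
    """
    raw = struct.pack("<%dH" % len(lanes), *lanes)
    return raw[0::2]


lanes = [0x1234, 0xABCD, 0x00FF, 0x8001]
# Both strategies must write the same bytes to memory.
assert truncate_lanes_scalar(lanes) == truncate_lanes_unzip(lanes)
```

Both functions produce the same four bytes; the second needs only one store of the packed result, which is the saving the commit message shows for func_4_8.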
Rafael Espindola
decbc43f72 Pattern match a setcc of boolean value with 0 as a truncate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154322 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 16:06:03 +00:00
Nadav Rotem
e80aa7c783 Lower some x86 shuffle sequences to the vblend family of instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154313 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 08:33:21 +00:00
Nadav Rotem
154819dd6f Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154310 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 07:45:58 +00:00
Chandler Carruth
ab5a55e118 Cleanup and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 02:13:06 +00:00
Chandler Carruth
6916a2375a Fold 15 tiny test cases into a single file that implements the
comprehensive testing of TLS codegen for x86. Convert all of the ones
that were still using grep to use FileCheck. Remove some redundancies
between them.

Perhaps most interestingly expand the test cases so that they actually
fully list the instruction snippet being tested. TLS operations are
*very* narrowly defined, and so these seem reasonably stable. More
importantly, the existing test cases already were crazy fine grained,
expecting specific registers to be allocated. This just clarifies that
no *other* instructions are expected, and fills in some crucial gaps
that weren't being tested at all.

This will make any subsequent changes to TLS much more clear during
review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154303 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 01:43:17 +00:00
Duncan Sands
3ef3fcfc04 Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 18:08:12 +00:00
Chandler Carruth
253933ee9e Teach LLVM about a PIE option which, when enabled on top of PIC, makes
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.

I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, it's just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]

I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 17:51:45 +00:00
Nadav Rotem
9d68b06bc5 AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154284 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08 12:54:54 +00:00
Nadav Rotem
d16c8d0d33 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
   shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154266 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 21:19:08 +00:00
Duncan Sands
961d666be4 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154265 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 20:04:00 +00:00
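The "exact reciprocal" condition in 961d666be4 can be made concrete. This is a minimal Python sketch under stated assumptions, not LLVM's implementation: `exact_reciprocal` is a hypothetical helper, it works on IEEE double precision only, and it treats the reciprocal as exact only when the constant is a normal power of two whose reciprocal is also normal. The denormal restriction follows the note in 507bb7a42f.

```python
import math
import sys


def exact_reciprocal(c):
    """Return 1/c if it is exactly representable as a normal double, else None."""
    if c == 0.0 or math.isinf(c) or math.isnan(c):
        return None
    # Reject denormal divisors outright.
    if abs(c) < sys.float_info.min:
        return None
    # frexp returns a mantissa in [0.5, 1); exactly 0.5 means |c| is a power of two.
    mantissa, _ = math.frexp(abs(c))
    if mantissa != 0.5:
        return None
    r = 1.0 / c
    # Do not multiply by a denormal (or overflowed) reciprocal.
    if math.isinf(r) or abs(r) < sys.float_info.min:
        return None
    return r


def divide_by_constant(x, c):
    """Use x * (1/c) when that is exact, otherwise fall back to a real division."""
    r = exact_reciprocal(c)
    return x * r if r is not None else x / c
```

For example, dividing by 4.0 becomes a multiplication by 0.25, while dividing by 3.0 stays a division because 1/3 has no exact binary representation.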
Sean Hunt
0fdfaafb70 Make the test for r154235 more platform-independent with a shorter
string.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154243 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 01:33:14 +00:00
Sean Hunt
3420e7f360 Output UTF-8-encoded characters as identifier characters into assembly
by default.

This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching the default seems
reasonable.

I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154235 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 00:37:53 +00:00
Akira Hatanaka
3e59b5edd6 Add lines in global-address.ll to test N32 and N64 code generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154202 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 20:23:36 +00:00
Jakob Stoklund Olesen
70fbea7c75 Allow negative immediates in ARM and Thumb2 compares.
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154183 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 17:45:04 +00:00
Craig Topper
f85cb768fe Test case for PR12413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154172 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 14:38:25 +00:00
Craig Topper
9a2b6e1d7b Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06 07:45:23 +00:00
Akira Hatanaka
ba9536a3c6 Reapply test case in 154038, this time with triple to prevent the backend
from emitting gp_rel relocation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154122 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:44:35 +00:00
Jakob Stoklund Olesen
740cd657f3 Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IV update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154119 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 20:30:20 +00:00
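The rewrite that 740cd657f3 suppresses is arithmetically valid; the objection is only to what it does to register usage. This is a minimal Python sketch of the identity, assuming an equality comparison and wrapping 8-bit arithmetic (the commit message states neither the predicate nor a bit width, so both are assumptions, and the function names are hypothetical).

```python
BITS = 8
MASK = (1 << BITS) - 1


def latch_compare(iv, stride):
    """icmp eq (add iv, stride), stride, with wrapping BITS-wide addition."""
    return ((iv + stride) & MASK) == stride


def simplified_compare(iv):
    """cmp eq iv, 0: the form SimplifySetCC() used to produce."""
    return iv == 0


def forms_agree():
    """Exhaustively compare the two forms over every 8-bit iv and stride."""
    return all(
        latch_compare(iv, stride) == simplified_compare(iv)
        for iv in range(1 << BITS)
        for stride in range(1 << BITS)
    )
```

Because the two forms always agree, choosing between them is purely a code-quality decision: the first form compares the incremented value, so only one register needs to stay live across the latch.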
James Molloy
17dcaf5ef9 An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases.
Noticed while investigating PR12274!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 10:01:12 +00:00
Jakob Stoklund Olesen
9243c4f7c5 Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154079 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05 03:10:56 +00:00
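The three folds listed in 9243c4f7c5 can be checked with plain integer arithmetic. This is a minimal Python sketch that ignores wraparound and uses hypothetical names throughout; `hypothetical_is_legal` is a made-up asymmetric predicate standing in for a target hook and is not ARM's actual rule.

```python
def immediate_for_compare(case, offset):
    """Return the immediate that ends up in the ICmp for the given fold.

    'base+offset'   : BaseReg + Offset == 0      <=>  BaseReg == -Offset
    '-scale+offset' : -1*ScaleReg + Offset == 0  <=>  ScaleReg == Offset
    """
    if case == "base+offset":
        return -offset
    if case == "-scale+offset":
        return offset
    raise ValueError("this fold has no immediate operand")


def hypothetical_is_legal(imm):
    """Made-up asymmetric rule: only small non-negative immediates are legal."""
    return 0 <= imm <= 255


def folds_hold(limit=20):
    """Check the three equivalences over a small range of integers."""
    values = range(-limit, limit + 1)
    for a in values:
        for b in values:
            if ((a + b) == 0) != (a == -b):   # ICmpZero BaseReg + Offset
                return False
            if ((-a + b) == 0) != (a == b):   # ICmpZero -1*ScaleReg + Offset
                return False
            if ((a - b) == 0) != (a == b):    # ICmpZero BaseReg + -1*ScaleReg
                return False
    return True
```

With an asymmetric predicate the sign matters: for an offset of 8, the first fold asks about -8 and the second about 8, and under the made-up rule only the second is legal.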
Akira Hatanaka
56ce6b3520 Reapply 154038 without the failing test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154062 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 22:16:36 +00:00
Owen Anderson
657a4e774c Revert r154038. It was causing make check failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154054 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 21:18:58 +00:00
Akira Hatanaka
e825fb3888 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154038 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 19:02:38 +00:00
Akira Hatanaka
86a2733055 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154034 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen
c5041cac7d Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154033 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:23:42 +00:00
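Commuting a conditional move as described in c5041cac7d amounts to swapping the two sources and using the opposite condition code. This is a minimal Python sketch of that equivalence; it models the instruction as a plain select over the N, Z, C and V flags, with hypothetical function names, and does not touch any LLVM API.

```python
from itertools import product

# ARM condition codes paired with their opposites.
OPPOSITE = {
    "EQ": "NE", "NE": "EQ", "HS": "LO", "LO": "HS",
    "MI": "PL", "PL": "MI", "VS": "VC", "VC": "VS",
    "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}


def holds(cc, n, z, c, v):
    """Evaluate an ARM condition code against the four status flags."""
    table = {
        "EQ": z, "NE": not z, "HS": c, "LO": not c,
        "MI": n, "PL": not n, "VS": v, "VC": not v,
        "HI": c and not z, "LS": (not c) or z,
        "GE": n == v, "LT": n != v,
        "GT": (not z) and n == v, "LE": z or n != v,
    }
    return bool(table[cc])


def movcc(cc, true_val, false_val, flags):
    """Select true_val when the condition holds, otherwise false_val."""
    return true_val if holds(cc, *flags) else false_val


def commute_is_safe():
    """Swapping operands and inverting the condition never changes the result."""
    return all(
        movcc(cc, "a", "b", flags) == movcc(OPPOSITE[cc], "b", "a", flags)
        for cc in OPPOSITE
        for flags in product([False, True], repeat=4)
    )
```

Since either operand order is available, the register allocator is free to pick whichever one avoids an extra copy.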
Akira Hatanaka
03d830e4f9 Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154031 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:22:53 +00:00
Pete Cooper
2ce63c7352 Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELECT as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153976 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 22:57:55 +00:00
Nadav Rotem
43b32e0cff Add an additional testcase which checks ops with multiple users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153939 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-03 07:39:36 +00:00
Jakob Stoklund Olesen
e3b23cde80 Allocate virtual registers in ascending order.
This is just the fallback tie-breaker ordering, the main allocation
order is still descending size.

Patch by Shamil Kurmangaleev!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153904 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 22:30:39 +00:00
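The ordering described in e3b23cde80 is a two-level sort: larger sizes first, then lower virtual register numbers. This is a minimal Python sketch of that ordering with made-up data; the allocator's actual data structures are not shown in the commit message, so the `(vreg_number, size)` tuples and the function name are assumptions for illustration only.

```python
def allocation_order(vregs):
    """Order (vreg_number, size) pairs by descending size, then ascending number."""
    ordered = sorted(vregs, key=lambda v: (-v[1], v[0]))
    return [number for number, _ in ordered]


# Registers 1 and 3 have the same size, so the tie-breaker puts 1 before 3.
order = allocation_order([(3, 10), (1, 10), (2, 40), (0, 5)])
assert order == [2, 1, 3, 0]
```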
Lang Hames
be9fe49b17 During two-address lowering, rescheduling an instruction does not untie
operands. Make TryInstructionTransform return false to reflect this.
Fixes PR11861.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153892 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 19:58:43 +00:00
Rafael Espindola
ce167840b2 No need to run llvm-as.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153890 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 19:44:20 +00:00
Nadav Rotem
44b5e6de8c Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.






git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153864 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-02 07:11:12 +00:00