Commit Graph

313 Commits

Cameron Zwarich
1bcdb6ffad Fix a comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127728 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-16 08:13:42 +00:00
Cameron Zwarich
85b0f468cf Only convert allocas to scalars if it is profitable. The profitability metric I
chose is having a non-memcpy/memset use and being larger than any native integer
type. Originally I chose having an access of a size smaller than the total size
of the alloca, but this caused some minor issues on the spirit benchmark where
SRoA runs again after some inlining.

This fixes <rdar://problem/8613163>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127718 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-16 00:13:44 +00:00
Cameron Zwarich
deac268f89 Better use of initializer lists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127716 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-16 00:13:37 +00:00
Cameron Zwarich
d4c9c3e6b9 Add a clarifying comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127715 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-16 00:13:35 +00:00
Cameron Zwarich
032c10fee2 Fix a crasher introduced by r127317 that is seen on the bots when using an
alloca as both integer and floating-point vectors of the same size. Bugpoint is
not cooperating with me, but I'll try to find a manual testcase tomorrow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-09 07:34:11 +00:00
Cameron Zwarich
b2fd770136 Add support to scalar replacement for partial vector accesses of an alloca, e.g.
a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the
use of vector intrinsics, especially in NEON when programmers know the layout of
the register file. This enables codegen to eliminate a lot of the subregister
traffic it would otherwise generate.

This commit only enables this for a small number of floating-point cases, but a
lot more integer cases. I assume this is okay for all ports, but I did not do
extensive testing of the quality of code involving i512 vectors and the like. If
there is a use case where this generates worse code than before, let me know and
we can scale it back.

This fixes <rdar://problem/9036264>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127317 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-09 05:43:05 +00:00
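
A hand-written illustration (not a testcase from the commit; names are invented) of the kind of partial vector access r127317 teaches SRoA to handle: an alloca used as a whole <4 x float> and, through a bitcast, as two <2 x float> halves. With this change the halves can be rewritten as insert/extractelement operations on a single <4 x float> value, so the alloca never hits memory:

define <4 x float> @pack(<2 x float> %lo, <2 x float> %hi) {
  %tmp = alloca <4 x float>
  %p = bitcast <4 x float>* %tmp to <2 x float>*
  store <2 x float> %lo, <2 x float>* %p
  %q = getelementptr <2 x float>* %p, i32 1
  store <2 x float> %hi, <2 x float>* %q
  %v = load <4 x float>* %tmp
  ret <4 x float> %v
}
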
Cameron Zwarich
c9ecd14cee Move vector type merging to a separate function in preparation for it getting
more complicated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127316 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-09 05:43:01 +00:00
Chris Lattner
2ca5c8644e convert ConstantVector::get to use ArrayRef.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125537 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-15 00:14:00 +00:00
Chris Lattner
7583190422 revert my ConstantVector patch; it seems to have made the llvm-gcc
builders unhappy.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125504 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-14 18:15:46 +00:00
Chris Lattner
283c8caccd Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125487 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-14 07:55:32 +00:00
Dan Gohman
bd1801b555 Give GetUnderlyingObject a TargetData, to keep it in sync
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.

Also, update several of GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124134 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-24 18:53:32 +00:00
Chris Lattner
e3357863aa enhance SRoA to promote allocas that are used by PHI nodes. This often
occurs because instcombine sinks loads and inserts phis.  This kicks in 
on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in
spec2006 in some app that uses std::deque.

This resolves the last of rdar://7339113.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124090 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-24 01:07:11 +00:00
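
A minimal sketch (hand-written, not from the commit) of the phi pattern r124090 promotes: instcombine has sunk two loads into a common successor, leaving a load through a phi of two alloca pointers. SRoA can now promote both allocas and replace the load with a phi of the stored values:

entry:
  %a = alloca i64
  %b = alloca i64
  store i64 %x, i64* %a
  store i64 %y, i64* %b
  br i1 %cond, label %bb1, label %bb2

bb1:
  br label %bb3

bb2:
  br label %bb3

bb3:
  %ptr = phi i64* [ %a, %bb1 ], [ %b, %bb2 ]
  %val = load i64* %ptr
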
Chris Lattner
c87c50a39c Enhance SRoA to promote allocas that are used by selects in some
common cases.  This triggers a surprising number of times in SPEC2K6
because min/max idioms end up doing this.  For example, code from the
STL ends up looking like this to SRoA:

  %202 = load i64* %__old_size, align 8, !tbaa !3
  %203 = load i64* %__old_size, align 8, !tbaa !3
  %204 = load i64* %__n, align 8, !tbaa !3
  %205 = icmp ult i64 %203, %204
  %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size
  %206 = load i64* %storemerge.i, align 8, !tbaa !3

We can now promote both the __n and the __old_size allocas.

This addresses another chunk of rdar://7339113, poor codegen on
stringswitch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124088 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-23 22:04:55 +00:00
Chris Lattner
145c532e68 Enhance SRoA to be more aggressive about scalarization of aggregate allocas
that have PHI or select uses of their element pointers.  This can often happen
when instcombine sinks two loads into a successor, inserting a phi or select.

With this patch, we can scalarize the alloca, but the pinned elements are not
yet promoted.  This is still a win for large aggregates where only one element
is used.  This fixes rdar://8904039 and part of rdar://7339113 (poor codegen
on stringswitch).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124070 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-23 08:27:54 +00:00
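
A rough sketch, with assumed names, of the "pinned element" case r124070 scalarizes: only the first element's pointer escapes into a phi, so SRoA can still split the aggregate into per-element allocas and drop the unused large element, even though the pinned i32 element is not yet promoted to a register:

  %A = alloca { i32, [512 x i8] }
  %B = alloca i32
  %elt0 = getelementptr { i32, [512 x i8] }* %A, i32 0, i32 0
  br i1 %cond, label %bb1, label %bb2

bb1:
  br label %bb3

bb2:
  br label %bb3

bb3:
  %p = phi i32* [ %elt0, %bb1 ], [ %B, %bb2 ]
  %v = load i32* %p
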
Chris Lattner
6c95d24927 have AllocaInfo store the alloca being inspected, simplifying callers.
No functionality change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124067 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-23 07:29:29 +00:00
Chris Lattner
d01a0da090 Rearrange some code a bit. Change MarkUnsafe to
handle the "Transformation preventing inst" printing, 
so that -scalarrepl -debug will always print the rejected
instruction.  No functionality change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124066 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-23 07:05:44 +00:00
Chris Lattner
85a7c69085 remove an old hack that avoided creating MMX datatypes. The
X86 backend has been fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124064 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-23 06:40:33 +00:00
Cameron Zwarich
b1686c32fc Remove outdated references to dominance frontiers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123724 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-18 03:53:26 +00:00
Cameron Zwarich
419e8a6299 Roll r123609 back in with two changes that fix test failures with expensive
checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-17 17:38:41 +00:00
Cameron Zwarich
b1086a9c6d Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123618 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-17 07:26:51 +00:00
Cameron Zwarich
ebed6de7b1 Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup of around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123609 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-17 01:08:59 +00:00
Chris Lattner
396a0567cf tidy up a comment, as suggested by duncan
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123590 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 17:46:19 +00:00
Chris Lattner
7e9b427c87 if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123571 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 06:18:28 +00:00
Chris Lattner
7072853279 Use an irbuilder to get some trivial constant folding when doing a store
of a constant.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123570 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 05:58:24 +00:00
Chris Lattner
192228edb1 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123568 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 05:28:59 +00:00
Chris Lattner
deaf55f698 Generalize LoadAndStorePromoter a bit and switch LICM
to use it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123501 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-15 00:12:35 +00:00
Chris Lattner
d0f56132cf switch SRoA to use LoadAndStorePromoter instead of its own copy of the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123457 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 19:50:47 +00:00
Chris Lattner
b352d6eb49 split SROA into two passes: one that uses DomFrontiers (-scalarrepl)
and one that uses SSAUpdater (-scalarrepl-ssa)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123436 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 08:13:00 +00:00
Chris Lattner
e0a1a5ba91 Implement full support for promoting allocas to registers using SSAUpdater
instead of DomTree/DomFrontier.  This may be interesting for reducing compile 
time.  This is currently disabled, but seems to work just fine.

When this is enabled, we eliminate two runs of dominator frontier, one in the
"early per-function" optimizations and one in the "interlaced with inliner"
function passes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123434 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 07:50:47 +00:00
Bob Wilson
6974302e3f Fix whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123396 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 20:59:44 +00:00
Bob Wilson
f0908aeade Check for empty structs, and for consistency, zero-element arrays.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123383 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 18:26:59 +00:00
Bob Wilson
704d1347c5 Extend SROA to handle arrays accessed as homogeneous structs and vice versa.
This is a minor extension of SROA to handle a special case that is
important for some ARM NEON operations.  Some of the NEON intrinsics
return multiple values, which are handled as struct types containing
multiple elements of the same vector type.  The corresponding return
types declared in the arm_neon.h header have equivalent arrays.  We
need SROA to recognize that it can split up those arrays and structs
into separate vectors, even though they are not always accessed with
the same type.  SROA already handles loads and stores of an entire
alloca by using insertvalue/extractvalue to access the individual
pieces, and that code works the same regardless of whether the type
is a struct or an array.  So, all that needs to be done is to check
for compatible arrays and homogeneous structs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123381 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 17:45:11 +00:00
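
A hypothetical example (not from the commit) of the array/struct mismatch r123381 tolerates: the alloca is an array of vectors, but an intrinsic-style result is stored into it as a homogeneous struct of the same vector type. SROA now treats the two layouts as compatible and splits the alloca into two separate <4 x float> values:

  %tmp = alloca [2 x <4 x float>]
  %p = bitcast [2 x <4 x float>]* %tmp to { <4 x float>, <4 x float> }*
  store { <4 x float>, <4 x float> } %pair, { <4 x float>, <4 x float> }* %p
  %eltp = getelementptr [2 x <4 x float>]* %tmp, i32 0, i32 1
  %elt = load <4 x float>* %eltp
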
Bob Wilson
694a10e7d8 Make SROA more aggressive with allocas containing padding.
SROA only splits up structs and arrays one level at a time, so padding can
only cause trouble if it is located in between the struct or array elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123380 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 17:45:08 +00:00
Chris Lattner
9fc5cdf77c split dom frontier handling stuff out to its own DominanceFrontier header,
so that Dominators.h is *just* domtree.  Also prune #includes a bit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122714 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 22:09:33 +00:00
Chris Lattner
61db1f56d0 start using irbuilder to make mem intrinsics in a few passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122572 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-26 22:57:41 +00:00
Mon P Wang
e90a6333c3 Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122462 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-23 01:41:32 +00:00
Dan Gohman
5034dd318a Move Value::getUnderlyingObject to be a standalone
function so that it can live in Analysis instead of
VMCore.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121885 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15 20:02:24 +00:00
Nick Lewycky
081f80078d Treat a call of a function pointer like a load of the pointer when considering
whether the pointer can be replaced with the global variable it is a copy of.
Fixes PR8680.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120126 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-24 22:04:20 +00:00
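
A guessed reconstruction of the PR8680-style pattern (names invented, declarations omitted; @handlers is assumed to be a constant global array of function pointers): the alloca is a copy of that constant global, and the loaded pointer is only called, never used to write into the alloca, so the copy can be replaced by a read of the global itself:

define void @dispatch() {
  %tbl = alloca [1 x void ()*]
  %dst = bitcast [1 x void ()*]* %tbl to i8*
  %src = bitcast [1 x void ()*]* @handlers to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 8, i32 8, i1 false)
  %slot = getelementptr [1 x void ()*]* %tbl, i32 0, i32 0
  %fp = load void ()** %slot
  call void %fp()
  ret void
}
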
Benjamin Kramer
f601d6df6f Simplify code. No change in functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119908 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-20 18:43:35 +00:00
Chris Lattner
2e29ebd9e8 finish a thought.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119690 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 07:32:33 +00:00
Chris Lattner
6248065194 allow eliminating an alloca that is just copied from a constant global
if it is passed as a byval argument.  The byval argument will just be a
read, so it is safe to read from the original global instead.  This allows
us to promote away the %agg.tmp alloca in PR8582


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119686 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 06:41:51 +00:00
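
A hedged sketch of the %agg.tmp pattern mentioned for PR8582 (types and sizes are made up, declarations omitted): the alloca is filled by a memcpy from a constant global and then only handed to a call as a byval argument; since byval already gives the callee its own copy, the global can be passed directly:

%Agg = type { i32, [60 x i8] }
@Init = constant %Agg zeroinitializer

define void @caller() {
  %agg.tmp = alloca %Agg
  %dst = bitcast %Agg* %agg.tmp to i8*
  %src = bitcast %Agg* @Init to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 64, i32 4, i1 false)
  call void @use(%Agg* byval %agg.tmp)
  ret void
}
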
Chris Lattner
a9be1df6d7 enhance the "alloca is just a memcpy from constant global" optimization
to ignore calls that obviously can't modify the alloca
because they are readonly/readnone.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119683 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 06:26:49 +00:00
Chris Lattner
2e61849f45 fix a small oversight in the "eliminate memcpy from constant global"
optimization.  If the alloca that is "memcpy'd from constant" also has
a memcpy from *it*, ignore it: it is a load.  We now optimize the testcase to:

define void @test2() {
  %B = alloca %T
  %a = bitcast %T* @G to i8*
  %b = bitcast %T* %B to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false)
  call void @bar(i8* %b)
  ret void
}

previously we would generate:

define void @test() {
  %B = alloca %T
  %b = bitcast %T* %B to i8*
  %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0
  %tmp3 = load i8* %G.0, align 4
  %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1
  %G.15 = bitcast [123 x i8]* %G.1 to i8*
  %1 = bitcast [123 x i8]* %G.1 to i984*
  %srcval = load i984* %1, align 1
  %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0
  store i8 %tmp3, i8* %B.0, align 4
  %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1
  %B.12 = bitcast [123 x i8]* %B.1 to i8*
  %2 = bitcast [123 x i8]* %B.1 to i984*
  store i984 %srcval, i984* %2, align 1
  call void @bar(i8* %b)
  ret void
}



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119682 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 06:20:47 +00:00
Owen Anderson
081c34b725 Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which
must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
the pass's dependencies.

Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.

I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
registration/creation, please send the testcase to me directly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116820 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 17:21:58 +00:00
Benjamin Kramer
af81235ef9 Eliminate some calls to Value::getNameStr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116670 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-16 11:28:23 +00:00
Owen Anderson
2ab36d3502 Begin adding static dependence information to passes, which will allow us to
perform initialization without static constructors AND without explicit initialization
by the client.  For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve.  I hope to be able to relax
the latter requirement in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116334 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-12 19:48:12 +00:00
Owen Anderson
ce665bd2e2 Now with fewer extraneous semicolons!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115996 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-07 22:25:06 +00:00
Dale Johannesen
0488fb649a Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result cast back to x86_mmx, provided the result feeds into a
previously existing x86_mmx operation.

The point of all this is to prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-30 23:57:10 +00:00
Chris Lattner
72eaa0e5eb deepen my MMX/SRoA hack to avoid hurting non-x86 codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112763 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-01 23:09:27 +00:00
Chris Lattner
91abace4ef add a gross hack to work around a problem that Argiris reported
on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>,
which then cause random problems because the X86 backend is
producing mmx stuff without inserting proper emms calls.

In the short term, force off MMX datatypes.  In the long term,
the X86 backend should not select generic vector types to MMX
registers.  This is being worked on, but won't be done in time
for 2.8.  rdar://8380055


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112696 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-01 05:14:33 +00:00