Optimize InstCombine stack memory consumption

This patch reduces the stack memory consumption of the InstCombine
function "isOnlyCopiedFromConstantGlobal() ", that in certain conditions
could overflow the stack because of excessive recursiveness.

For example, in a case like this:

%0 = alloca [50025 x i32], align 4
%1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0
store i32 0,                         i32* %1
%2 = getelementptr inbounds          i32* %1, i64 1
store i32 1,                         i32* %2
%3 = getelementptr inbounds          i32* %2, i64 1
store i32 2,                         i32* %3
%4 = getelementptr inbounds          i32* %3, i64 1
store i32 3,                         i32* %4
%5 = getelementptr inbounds          i32* %4, i64 1
store i32 4,                         i32* %5
%6 = getelementptr inbounds          i32* %5, i64 1
store i32 5,                         i32* %6
...

This piece of code crashes llvm when trying to apply instcombine on
desktop. On embedded devices this could happen with a much lower limit
of recursiveness.  Some instructions (getelementptr and bitcasts) make
the function recursively call itself on their uses, which is what makes
the example above consume so much stack (it becomes a recursive
depth-first tree visit with a very big depth).

The patch changes the algorithm to be semantically equivalent, but
iterative instead of recursive and the visiting order to be from a
depth-first visit to a breadth-first visit (visit all the instructions
of the current level before the ones of the next one).

Now if a lot of memory is required a heap allocation is done instead of
the the stack allocation, avoiding the possible crash.

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4355

Patch by Marcello Maggioni!  We don't generally commit large stress test
that look for out of memory conditions, so I didn't request that one be
added to the patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212133 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Reid Kleckner
2014-07-01 21:36:20 +00:00
parent a96332b9c7
commit 3ca3826528

View File

@@ -50,13 +50,17 @@ static bool pointsToConstantGlobal(Value *V) {
/// can optimize this.
static bool
isOnlyCopiedFromConstantGlobal(Value *V, MemTransferInst *&TheCopy,
SmallVectorImpl<Instruction *> &ToDelete,
bool IsOffset = false) {
SmallVectorImpl<Instruction *> &ToDelete) {
// We track lifetime intrinsics as we encounter them. If we decide to go
// ahead and replace the value with the global, this lets the caller quickly
// eliminate the markers.
for (Use &U : V->uses()) {
SmallVector<std::pair<Value *, bool>, 35> ValuesToInspect;
ValuesToInspect.push_back(std::make_pair(V, false));
while (!ValuesToInspect.empty()) {
auto ValuePair = ValuesToInspect.pop_back_val();
const bool IsOffset = ValuePair.second;
for (auto &U : ValuePair.first->uses()) {
Instruction *I = cast<Instruction>(U.getUser());
if (LoadInst *LI = dyn_cast<LoadInst>(I)) {
@@ -67,16 +71,14 @@ isOnlyCopiedFromConstantGlobal(Value *V, MemTransferInst *&TheCopy,
if (isa<BitCastInst>(I) || isa<AddrSpaceCastInst>(I)) {
// If uses of the bitcast are ok, we are ok.
if (!isOnlyCopiedFromConstantGlobal(I, TheCopy, ToDelete, IsOffset))
return false;
ValuesToInspect.push_back(std::make_pair(I, IsOffset));
continue;
}
if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(I)) {
// If the GEP has all zero indices, it doesn't offset the pointer. If it
// doesn't, it does.
if (!isOnlyCopiedFromConstantGlobal(
GEP, TheCopy, ToDelete, IsOffset || !GEP->hasAllZeroIndices()))
return false;
ValuesToInspect.push_back(
std::make_pair(I, IsOffset || !GEP->hasAllZeroIndices()));
continue;
}
@@ -144,6 +146,7 @@ isOnlyCopiedFromConstantGlobal(Value *V, MemTransferInst *&TheCopy,
// Otherwise, the transform is safe. Remember the copy instruction.
TheCopy = MI;
}
}
return true;
}