the definition of the nsw and nuw flags to make use of it.
nsw was introduced to help optimizers answer yes to the following:
// Can we change i from i32 to i64 to eliminate the cast inside the loop?
for (int i = 0; i < n; ++i) A[i] *= 0.1;
// Can we assume that this loop will eventually terminate?
for (int i = 0; i <= n; ++i) A[i] *= 0.1;
In its current form, it isn't truly sufficient for either.
In the first case, if the increment overflows, the result still has some
valid i32 value; sign-extending it produces a value whose high 33 bits are
identical copies of the sign bit, followed by 31 independently undef low
bits. If i is promoted to i64, it won't take on those same values when it
reaches that point. (The compiler could recover here by reasoning about how
i is used by the load, but that's a lot more complicated and isn't always
possible.)
In the second case, if n is INT_MAX then there is no i32 value of i which
will be greater than n, so having the increment return undef on overflow
doesn't help.
Trap values are a formalization of some existing concepts that we have
about LLVM IR, and give the optimizers a better basis for answering yes
to both questions above.
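For reference, a sketch of the transformation the first question is asking
about, assuming A is a double array indexed with a 64-bit address
computation on a 64-bit target (names follow the examples above):

void scale(double *A, int n) {
  // Original form: the i32 induction variable must be sign-extended to
  // i64 on every iteration to form the address of A[i].
  for (int i = 0; i < n; ++i) A[i] *= 0.1;
}

void scale_widened(double *A, int n) {
  // Widened form: i is 64 bits, so no per-iteration extension is needed.
  // The rewrite is only sound if the i32 increment cannot wrap, which is
  // what nsw is intended to guarantee.
  for (long long i = 0; i < n; ++i) A[i] *= 0.1;
}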
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102140 91177308-0d34-0410-b5e6-96231b3b80d8
arguments are handled with a new InlineFunctionInfo class. This
makes it easier to extend InlineFunction to return more info in the
future.
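A rough sketch of the intended usage, assuming the LLVM API of this era
(CallSite, TargetData) and an InlineFunctionInfo constructor taking the
call graph and target data; see llvm/Transforms/Utils/Cloning.h for the
actual interface:

#include "llvm/Analysis/CallGraph.h"
#include "llvm/Support/CallSite.h"
#include "llvm/Target/TargetData.h"
#include "llvm/Transforms/Utils/Cloning.h"
using namespace llvm;

// Hypothetical helper: inline the function called at CS. The extra
// arguments InlineFunction used to take are now carried (and any extra
// results can later be returned) in the InlineFunctionInfo object.
bool inlineCallSite(CallSite CS, CallGraph *CG, const TargetData *TD) {
  InlineFunctionInfo IFI(CG, TD);  // assumed constructor shape
  return InlineFunction(CS, IFI);
}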
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102137 91177308-0d34-0410-b5e6-96231b3b80d8
define void @f3(void (i8*)* %__f) ssp {
entry:
  call void %__f(i8* undef)
  unreachable
}

define void @f4(i8* %this) ssp align 2 {
entry:
  call void @f3(void (i8*)* @f2) ssp
  ret void
}
The inliner is turning the indirect call to %__f into a direct
call to @f2. Make the call graph more precise when this happens.
The inliner doesn't revisit call sites introduced by inlining,
so there isn't an easy way to test for this, but a more precise
callgraph is a good thing.
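The kind of bookkeeping this implies, sketched against the CallGraphNode
API of the time; the helper and its flow are illustrative, not the actual
patch:

#include "llvm/Analysis/CallGraph.h"
#include "llvm/Support/CallSite.h"
using namespace llvm;

// Hypothetical helper: after inlining turns the indirect call site CS in
// Caller into a direct call to Callee, replace the caller's imprecise
// call-graph edge with one that points at Callee's node.
void recordDevirtualizedCall(CallGraph &CG, Function *Caller,
                             CallSite CS, Function *Callee) {
  CallGraphNode *CallerNode = CG[Caller];
  CallerNode->removeCallEdgeFor(CS);              // drop the old edge
  CallerNode->addCalledFunction(CS, CG[Callee]);  // add the precise one
}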
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102131 91177308-0d34-0410-b5e6-96231b3b80d8
Fix RefreshCallGraph to use CGN->replaceCallEdge instead of hand-rolling
its own loop. replaceCallEdge properly maintains the
reference counts of the nodes, fixing a crash exposed by the
iterative callgraph stuff.
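For illustration, the shape of the call; the surrounding names are
assumptions, not the actual RefreshCallGraph code:

#include "llvm/Analysis/CallGraph.h"
#include "llvm/Support/CallSite.h"
using namespace llvm;

// Hypothetical: the call described by the edge for OldCS has been rewritten
// as NewCS and now targets NewCallee. replaceCallEdge retargets the edge in
// place and keeps the callee node's reference count consistent, which a
// hand-rolled erase-and-reinsert loop can get wrong.
void retargetEdge(CallGraph &CG, CallGraphNode *CGN,
                  CallSite OldCS, CallSite NewCS, Function *NewCallee) {
  CGN->replaceCallEdge(OldCS, NewCS, CG[NewCallee]);
}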
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102120 91177308-0d34-0410-b5e6-96231b3b80d8
FunctionLoweringInfo, as it isn't SelectionDAG-specific. This isn't
completely natural, as PHI node state is not per-function but rather
per-basic-block; however, there's currently no other convenient
per-basic-block state to group it with.
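Illustrative only: roughly the kind of per-function container this refers
to. The member name and element type here are assumptions, not the actual
FunctionLoweringInfo declaration:

#include <utility>
#include <vector>

namespace llvm { class MachineInstr; }

// Sketch of PHI-update state living in a per-function object even though
// it is logically per-basic-block: for each machine PHI in a successor
// block that needs an incoming value from the block currently being
// lowered, record (PHI instruction, virtual register), then clear the
// list between blocks.
struct PHIUpdateStateSketch {
  std::vector<std::pair<llvm::MachineInstr *, unsigned>> PHINodesToUpdate;
  void clearPerBlockState() { PHINodesToUpdate.clear(); }
};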
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102109 91177308-0d34-0410-b5e6-96231b3b80d8
This actually makes everything slower, but the plan is to have isel add <kill>
flags the way it is already adding <dead> flags. Then LiveVariables can be
removed again.
When ignoring the time spent in LiveVariables, -regalloc=fast is now twice as
fast as -regalloc=local.
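A sketch of what adding <kill> flags means at the MachineInstr level; how
isel would identify the last use is not shown, and the helper is an
assumption, not the actual change:

#include "llvm/CodeGen/MachineInstr.h"
using namespace llvm;

// Mark operand OpNo of MI, assumed to be the final use of its virtual
// register in the block, as a kill. With kills recorded by isel, a
// register allocator can free registers without consulting LiveVariables.
void markLastUseAsKill(MachineInstr &MI, unsigned OpNo) {
  MachineOperand &MO = MI.getOperand(OpNo);
  if (MO.isReg() && MO.isUse())
    MO.setIsKill(true);
}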
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102034 91177308-0d34-0410-b5e6-96231b3b80d8
So far this is just a clone of -regalloc=local that has been lobotomized to run
25% faster. It drops the least-recently-used calculations, and is just plain
stupid when it runs out of registers.
The plan is to make this go even faster for -O0 by taking advantage of the short
live intervals in unoptimized code. It should not be necessary to calculate
liveness when most virtual registers are killed 2-3 instructions after they are
born.
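To illustrate the strategy rather than the actual RegAllocFast code, here
is a toy per-block allocator in the same spirit: hand out physical
registers greedily, spill an arbitrary assignment when the pool is empty,
and never compute liveness. All names and the register pool are made up:

#include <map>
#include <set>
#include <vector>

// Toy model: virtual registers and physical registers are plain ints.
struct ToyFastAllocator {
  std::set<int> FreePhys{1, 2, 3, 4};   // tiny physical register pool
  std::map<int, int> VirtToPhys;        // current assignments
  std::vector<int> SpillSlots;          // vregs pushed to the stack

  // Return a physical register for VReg, evicting an arbitrary assignment
  // when the pool is exhausted -- no least-recently-used bookkeeping.
  int assign(int VReg) {
    auto I = VirtToPhys.find(VReg);
    if (I != VirtToPhys.end())
      return I->second;
    if (FreePhys.empty()) {
      auto Victim = VirtToPhys.begin();
      SpillSlots.push_back(Victim->first);
      FreePhys.insert(Victim->second);
      VirtToPhys.erase(Victim);
    }
    int Phys = *FreePhys.begin();
    FreePhys.erase(FreePhys.begin());
    VirtToPhys[VReg] = Phys;
    return Phys;
  }

  // A local allocator forgets everything at block boundaries, which stays
  // cheap because unoptimized code rarely keeps values live across blocks.
  void endBlock() {
    VirtToPhys.clear();
    FreePhys = {1, 2, 3, 4};
  }
};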
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102006 91177308-0d34-0410-b5e6-96231b3b80d8