This patch migrates the strlen optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167103 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strncpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167102 91177308-0d34-0410-b5e6-96231b3b80d8
This is important for loops in the LAPACK test-suite.
These loops start at 1 because they are auto-converted from fortran.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167084 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the stpcpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier. Note that the
__stpcpy_chk simplifications were migrated in a previous commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167083 91177308-0d34-0410-b5e6-96231b3b80d8
r166198 migrated the strcpy optimization to instcombine. The strcpy
simplifier that was migrated from Transforms/Scalar/SimplifyLibCalls.cpp
was also doing some __strcpy_chk simplifications. Those fortified
simplifications were migrated as well, but introduced a bug in the
__stpcpy_chk simplifier in the process. This happened because the
__strcpy_chk and __stpcpy_chk simplifiers were both mapped to StrCpyChkOpt
which was updated with simplifications that worked for __strcpy_chk, but
not __stpcpy_chk.
This patch fixes the problem by adding proper test coverage and creating a
new simplifier for __stpcpy_chk (instead of sharing one with __strcpy_chk).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167082 91177308-0d34-0410-b5e6-96231b3b80d8
integers in that the code to handle split alloca-wide integer loads or
stores doesn't come first. It should, for the same reasons as with
integers, and the PR attests to that. Also had to fix a busted assert in
that this test case also covers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167051 91177308-0d34-0410-b5e6-96231b3b80d8
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.
This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167011 91177308-0d34-0410-b5e6-96231b3b80d8
getCastInstrCost had an assert prohibiting scalar to vector casts. Such casts,
however, are allowed. This should make the vectorizer buildbot happier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166998 91177308-0d34-0410-b5e6-96231b3b80d8
This turns loops like
for (unsigned i = 0; i != n; ++i)
p[i] = p[i+1];
into memmove, which has a highly optimized implementation in most libcs.
This was really easy with the new DependenceAnalysis :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166875 91177308-0d34-0410-b5e6-96231b3b80d8
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481.
Compile time performance seems to be slightly worse, but this is mostly due
to an extra LCSSA run scheduled by the PassManager and should be fixed there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166874 91177308-0d34-0410-b5e6-96231b3b80d8
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.
Port the LoopVectorizer to the new API.
The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166836 91177308-0d34-0410-b5e6-96231b3b80d8
list of externals. This makes sense since a shared library with no symbols
can still be useful if it has static constructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166795 91177308-0d34-0410-b5e6-96231b3b80d8
The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable
to analyzable but the LCSSA bug is very nasty. It only comes into play with a
specific order of the LoopPassManager worklist and can cause actual
miscompilations, when a SCEV refers to a value that has been replaced with PHI
node. SCEVExpander may then insert code into the wrong place, either violating
domination or randomly miscompiling stuff.
Comes with an extensive test case reduced from the test-suite with
bugpoint+SCEVValidator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166787 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:
1. Target information is used to determine if it is profitable to fuse two
instructions. This means that the cost of the vector operation must not
be more expensive than the cost of the two original operations. Pairs that
are not profitable are no longer considered (because current cost information
is incomplete, for intrinsics for example, equal-cost pairs are still
considered).
2. The 'cost savings' computed for the profitability check are also used to
rank the DAGs that represent the potential vectorization plans. Specifically,
for nodes of non-trivial depth, the cost savings is used as the node
weight.
The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166716 91177308-0d34-0410-b5e6-96231b3b80d8
The isValueEqualityComparison() guard at the top of SimplifySwitch()
only applies to some of the possible transformations.
The newer transformations work just fine on large switches, and the
check on predecessor count is nonsensical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166710 91177308-0d34-0410-b5e6-96231b3b80d8
smaller integer loads and stores.
The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.
Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists soley
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine however, so
this shouldn't actually result in missed opportunities.
If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.
Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166662 91177308-0d34-0410-b5e6-96231b3b80d8
When the trip count is -1, getSmallConstantTripMultiple could return zero,
and this would cause runtime loop unrolling to assert. Instead of returning
zero, one is now returned (consistent with the existing overflow cases).
Fixes PR14167.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166612 91177308-0d34-0410-b5e6-96231b3b80d8
loads. It's not really profitable and may result in GVN going into an infinite
loop when it hits constructs like this:
%x = gep %some.type %x, ...
Found via an LTO build of LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166490 91177308-0d34-0410-b5e6-96231b3b80d8
%V = mul i64 %N, 4
%t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
%t1 = getelementptr i32* %arr, i32 %N
%t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
int foo(int *A, int N) {
return A[N];
}
because gcc turns this into byte pointer arithmetic before it hits the plugin:
D.1590_2 = (long unsigned int) N_1(D);
D.1591_3 = D.1590_2 * 4;
D.1592_5 = A_4(D) + D.1591_3;
D.1589_6 = *D.1592_5;
return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.
An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166474 91177308-0d34-0410-b5e6-96231b3b80d8
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166423 91177308-0d34-0410-b5e6-96231b3b80d8