string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211749 91177308-0d34-0410-b5e6-96231b3b80d8
[LLVM part]
These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix. Metadata name
changes:
llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*
This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata.
Patch by Mark Heffernan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211710 91177308-0d34-0410-b5e6-96231b3b80d8
The induction variables start value needs to be defined before we branch
(overflow check) to the scalar preheader where we used it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211460 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211339 91177308-0d34-0410-b5e6-96231b3b80d8
If we have common uses on separate paths in the tree; process the one with greater common depth first.
This makes sure that we do not assume we need to extract a load when it is actually going to be part of a vectorized tree.
Review: http://reviews.llvm.org/D3800
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210310 91177308-0d34-0410-b5e6-96231b3b80d8
The loop vectorizer instantiates be-taken-count + 1 as the loop iteration count.
If this expression overflows the generated code was invalid.
In case of overflow the code now jumps to the scalar loop.
Fixes PR17288.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209854 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.
-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.
-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.
The patch also:
1- Adds support in the inliner for the two new remarks and a
test case.
2- Moves emitOptimizationRemark* functions to the llvm namespace.
3- Adds an LLVMContext argument instead of making them member functions
of LLVMContext.
Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3682
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209442 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out that there is a very cheap way of testing whether a block is dead,
just look it up in the DomTree. We have to do this anyways so just ignore
unreachable blocks before sorting by domination. This restores a proper
ordering for std::stable_sort when dead code is present.
Covered by existing tests & buildbots running in STL debug mode (MSVC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208492 91177308-0d34-0410-b5e6-96231b3b80d8
There is no total ordering if the CFG is disconnected. We don't care if we
catch all CSE opportunities in dead code either so just exclude ignore them in
the assert.
PR19646
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208461 91177308-0d34-0410-b5e6-96231b3b80d8
When can't assume a vectorized tree is rooted in an instruction. The IRBuilder
could have constant folded it. When we rebuild the build_vector (the series of
InsertElement instructions) use the last original InsertElement instruction. The
vectorized tree root is guaranteed to be before it.
Also, we can't assume that the n-th InsertElement inserts the n-th element into
a vector.
This reverts r207746 which reverted the revert of the revert of r205018 or so.
Fixes the test case in PR19621.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207939 91177308-0d34-0410-b5e6-96231b3b80d8
There is no point in creating it if we're not going to vectorize
anything. Creating the map is expensive as it creates large values.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207916 91177308-0d34-0410-b5e6-96231b3b80d8
There are public functions that mutate various members as well as
another private member already, so make all the members private to
avoid the discontinuity and add accessors for the values. Should
be no functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207868 91177308-0d34-0410-b5e6-96231b3b80d8
=[
Turns out that this was the root cause of PR19621. We found a crasher
only recently (likely due to improvements elsewhere in the SLP
vectorizer) but the reduced test case failed all the way back to here.
I've confirmed that reverting this patch both fixes the reduced test
case in PR19621 and the actual source file that led to it, so it seems
to really be rooted here. I've replied to the commit thread with
discussion of my (feeble) attempts to debug this. Didn't make it very
far, so reverting now that we have a good test case so that things can
get back to healthy while the debugging carries on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207746 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the vectorization remarks to also inform when
vectorization is possible but not beneficial.
Added tests to exercise some loop remarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207574 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This calls emitOptimizationRemark from the loop unroller and vectorizer
at the point where they make a positive transformation. For the
vectorizer, it reports vectorization and interleave factors. For the
loop unroller, it reports all the different supported types of
unrolling.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3456
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207528 91177308-0d34-0410-b5e6-96231b3b80d8
definition below all of the header #include lines, lib/Transforms/...
edition.
This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.
Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206844 91177308-0d34-0410-b5e6-96231b3b80d8
The vectorizer only knows how to vectorize intrinics by widening all operands by
the same factor.
Patch by Tyler Nowicki!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205855 91177308-0d34-0410-b5e6-96231b3b80d8
Some Intrinsics are overloaded to the extent that return type equality (all
that's been checked up to now) does not guarantee that the arguments are the
same. In these cases SLP vectorizer should not recurse into the operands, which
can be achieved by comparing them as "Function *" rather than simply the ID.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205424 91177308-0d34-0410-b5e6-96231b3b80d8
For the purpose of calculating the cost of the loop at various vectorization
factors, we need to count dependencies of consecutive pointers as uniforms
(which means that the VF = 1 cost is used for all overall VF values).
For example, the TSVC benchmark function s173 has:
...
%3 = add nsw i64 %indvars.iv, 16000
%arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3
...
and we must realize that the add will be a scalar in order to correctly deduce
it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all
dependencies of a consecutive pointer must be a scalar (uniform), and so we
simply need to add all consecutive pointers to the worklist that currently
detects collects uniforms.
Fixes PR19296.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205387 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r205018.
Conflicts:
lib/Transforms/Vectorize/SLPVectorizer.cpp
test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
This is breaking libclc build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205260 91177308-0d34-0410-b5e6-96231b3b80d8
Extract element instructions that will be removed when vectorzing lower the
cost.
Patch by Arch D. Robison!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205020 91177308-0d34-0410-b5e6-96231b3b80d8
Extracts coming from phis were being hoisted, while all others were
sunk to their uses. This was inconsistent and didn't seem to serve a
purpose. Changing all extracts to be sunk to uses is a prerequisite
for adding block frequency to the SLP vectorizer's cost model.
I benchmarked the change in isolation (without block frequency). I
only saw noise on x86 and some potentially significant improvements on
ARM. No major regressions is good enough for me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204699 91177308-0d34-0410-b5e6-96231b3b80d8
noise.
Original commit log:
Replace some dead code with an assert. When I first ported this pass
from a loop pass to a function pass I did so in the naive, recursive
way. It doesn't actually work, we need a worklist instead. When
I switched to the worklist I didn't delete the naive recursion. That
recursion was also buggy because it was dead and never really exercised.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204187 91177308-0d34-0410-b5e6-96231b3b80d8
pass from a loop pass to a function pass I did so in the naive,
recursive way. It doesn't actually work, we need a worklist instead.
When I switched to the worklist I didn't delete the naive recursion.
That recursion was also buggy because it was dead and never really
exercised.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204184 91177308-0d34-0410-b5e6-96231b3b80d8