We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229406 91177308-0d34-0410-b5e6-96231b3b80d8
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229405 91177308-0d34-0410-b5e6-96231b3b80d8
We didn't properly handle the out-of-bounds case for
ConstantAggregateZero and UndefValue. This would manifest as a crash
when the constant folder was asked to fold a load of a constant global
whose struct type has no operands.
This fixes PR22595.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229352 91177308-0d34-0410-b5e6-96231b3b80d8
The "dereferenceable" attribute cannot be added via .addAttribute(),
since it also expects a size in bytes. AttrBuilder#addAttribute or
AttributeSet#addAttribute is wrapped by classes Function, InvokeInst,
and CallInst. Add corresponding wrappers to
AttrBuilder#addDereferenceableAttr.
Having done this, propagate the dereferenceable attribute via
gc.relocate, adding a test to exercise it. Note that -datalayout is
required during execution over and above -instcombine, because
InstCombine only optionally requires DataLayoutPass.
Differential Revision: http://reviews.llvm.org/D7510
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229265 91177308-0d34-0410-b5e6-96231b3b80d8
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead. This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?). We already apply a similar transform in DAGCombine, this just extends that to the IR level.
This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types.
Differential Revision: http://reviews.llvm.org/D7255
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229189 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes a problem I accidentally introduced in an instruction combine
on select instructions added at r227197. That revision taught the instruction
combiner how to fold a cttz/ctlz followed by a icmp plus select into a single
cttz/ctlz with flag 'is_zero_undef' cleared.
However, the new rule added at r227197 would have produced wrong results in the
case where a cttz/ctlz with flag 'is_zero_undef' cleared was follwed by a
zero-extend or truncate. In that case, the folded instruction would have
been inserted in a wrong location thus leaving the CFG in an inconsistent
state.
This patch fixes the problem and add two reproducible test cases to
existing test 'InstCombine/select-cmp-cttz-ctlz.ll'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229124 91177308-0d34-0410-b5e6-96231b3b80d8
SimplifyCFG now knows how to speculate calls to intrinsic cttz/ctlz that are
'cheap' for the target. Therefore, some of the logic in CodeGenPrepare
that was originally added at revision 224899 can now be removed.
This patch is basically a no functional change. It removes the duplicated
logic in CodeGenPrepare and converts all the existing target specific tests
for cttz/ctlz into SimplifyCFG tests.
Differential Revision: http://reviews.llvm.org/D7608
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229105 91177308-0d34-0410-b5e6-96231b3b80d8
The issues with the new unroll analyzer are more fundamental than code
cleanup, algorithm, or data structure changes. I've sent an email to the
original commit thread with details and a proposal for how to redesign
things. I'm disabling this for now so that we don't spend time
debugging issues with it in its current state.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229064 91177308-0d34-0410-b5e6-96231b3b80d8
- First, there's a crash when we try to combine that pointers into `icmp`
directly by creating a `bitcast`, which is invalid if that two pointers are
from different address spaces.
- It's not always appropriate to cast one pointer to another if they are from
different address spaces as that is not no-op cast. Instead, we only combine
`icmp` from `ptrtoint` if that two pointers are of the same address space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229063 91177308-0d34-0410-b5e6-96231b3b80d8
propagating of metadata.
We were propagating !nonnull metadata even when the newly formed load is
no longer of a pointer type. This is clearly broken and results in LLVM
failing the verifier and aborting. This patch just restricts the
propagation of !nonnull metadata to when we actually have a pointer
type.
This bug report and the initial version of this patch was provided by
Charles Davis! Many thanks for finding this!
We still need to add logic to round-trip the metadata correctly if we
combine from pointer types to integer types and then back by using range
metadata for the integer type loads. But this is the minimal and safe
version of the patch, which is important so we can backport it into 3.6.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229029 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instances of the AssumptionCache are per function, so we can't re-use
the same AssumptionCache instance when recursing in the CallAnalyzer to
analyze a different function. Instead we have to pass the
AssumptionCacheTracker to the CallAnalyzer so it can get the right
AssumptionCache on demand.
Reviewers: hfinkel
Subscribers: llvm-commits, hans
Differential Revision: http://reviews.llvm.org/D7533
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228957 91177308-0d34-0410-b5e6-96231b3b80d8
We can't solve the full subgraph isomorphism problem. But we can
allow obvious cases, where for example two instructions of different
types are out of order. Due to them having different types/opcodes,
there is no ambiguity.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228931 91177308-0d34-0410-b5e6-96231b3b80d8
Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl
how to query TLI in order to get a more accurate cost for truncates and
zero-extends.
Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase
would have conservatively returned a 'default' TCC_Basic for all zero-extends,
and TCC_Free for truncates on native types.
This patch improves the heuristic so that we query TLI (if available) to get
more accurate answers. If TLI is available, then methods 'isZExtFree' and
'isTruncateFree' can be used to check if a zext/trunc is free for the target.
Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll.
With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz
immediately followed by a free zext/trunc.
Differential Revision: http://reviews.llvm.org/D7585
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228923 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.
The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.
Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.
I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.
I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228899 91177308-0d34-0410-b5e6-96231b3b80d8
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228885 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.
Reviewers: mcrosier
Reviewed By: mcrosier
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7286
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228872 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is a follow-up of r228826 (see code-review: D7506).
Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we
have to fix the cost heuristic for intrinsic calls to cttz/ctlz.
This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl
queries TLI to check if a call to cttz/ctlz is cheap for the target.
Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86,
SimplifyCFG only speculates a call to cttz/ctlz if it is cheap.
Differential Revision: http://reviews.llvm.org/D7554
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228829 91177308-0d34-0410-b5e6-96231b3b80d8
analysis.
We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.
Test changes:
* combine-comparisons-by-cse.ll: Removed unneeded branch check.
* 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
* coalesce-subregs.ll: Superfluous block check.
* 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
* PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
* select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228826 91177308-0d34-0410-b5e6-96231b3b80d8
A DAGRootSet models an induction variable being used in a rerollable
loop. For example:
x[i*3+0] = y1
x[i*3+1] = y2
x[i*3+2] = y3
Base instruction -> i*3
+---+----+
/ | \
ST[y1] +1 +2 <-- Roots
| |
ST[y2] ST[y3]
There may be multiple DAGRootSets, for example:
x[i*2+0] = ... (1)
x[i*2+1] = ... (1)
x[i*2+4] = ... (2)
x[i*2+5] = ... (2)
x[(i+1234)*2+5678] = ... (3)
x[(i+1234)*2+5679] = ... (3)
This concept is similar to the "Scale" member used previously, but allows
multiple independent sets of roots based off the same induction variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228821 91177308-0d34-0410-b5e6-96231b3b80d8
If the landingpad of the invoke is using a personality function that
catches asynch exceptions, then it can catch a trap.
Also add some landingpads to invalid LLVM IR test cases that lack them.
Over-the-shoulder reviewed by David Majnemer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228782 91177308-0d34-0410-b5e6-96231b3b80d8
Unless we meet an insertvalue on a path from some value to a return, that value
will be live if *any* of the return's components are live, so all of those
components must be added to the MaybeLiveUses.
Previously we were deleting arguments if sub-value 0 turned out to be dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228731 91177308-0d34-0410-b5e6-96231b3b80d8
This commit isn't using the correct context, and is transfoming calls
that are operands to loads rather than calls that are operands to an
icmp feeding into an assume. I've replied on the original review thread
with a very reduced test case and some thoughts on how to rework this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228677 91177308-0d34-0410-b5e6-96231b3b80d8
These tests the two optimizations for backedge insertion currently implemented and the split backedge flag which is currently off by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228617 91177308-0d34-0410-b5e6-96231b3b80d8
This is just adding really simple tests which should have been part of the original submission. When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission. I fixed two such bugs in this submission.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228610 91177308-0d34-0410-b5e6-96231b3b80d8
Some parts of DeadArgElim were only considering the individual fields
of StructTypes separately, but others (where insertvalue &
extractvalue instructions occur) also looked into ArrayTypes.
This one is an actual bug; the mismatch can lead to an argument being
considered used by a return sub-value that isn't being tracked (and
hence is dead by default). It then gets incorrectly eliminated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228559 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, a non-extractvalue use of an aggregate return value meant
the entire return was considered live (the algorithm gave up
entirely). This was correct, but conservative. It's better to actually
look at that Use, making the analysis results apply to all sub-values
under consideration.
E.g.
%val = call { i32, i32 } @whatever()
[...]
ret { i32, i32 } %val
The return is using the entire aggregate (sub-values 0 and 1). We can
still simplify @whatever if we can prove that this return is itself
unused.
Also unifies the logic slightly between aggregate and non-aggregate
cases..
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228558 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The alias.scope metadata represents sets of things an instruction might
alias with. When generically combining the metadata from two
instructions the result must be the union of the original sets, because
the new instruction might alias with anything any of the original
instructions aliased with.
Reviewers: hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7490
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228525 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out there is a simpler way of checking that all bytes in a word are equal
than binary decomposition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228503 91177308-0d34-0410-b5e6-96231b3b80d8
Normalize
select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b)
select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b)
This normal form may enable further combines on the And/Or and shortens
paths for the values. Many targets prefer the other but can go back
easily in CodeGen.
Differential Revision: http://reviews.llvm.org/D7399
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228409 91177308-0d34-0410-b5e6-96231b3b80d8