Fixes the self-host fail. Note that this commit activates dominator
analysis in the combiner by default (like the original commit did).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222590 91177308-0d34-0410-b5e6-96231b3b80d8
The alloca's type is irrelevant, only those types which are used in a
load or store of the exact size of the slice should be considered.
This manifested as an assertion failure when we compared the various
types: we had a size mismatch.
This fixes PR21480.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222499 91177308-0d34-0410-b5e6-96231b3b80d8
Currently LoopUnroll generates a prologue loop before the main loop
body to execute first N%UnrollFactor iterations. Also, this loop is
used if trip-count can overflow - it's determined by a runtime check.
However, we've been mistakenly optimizing this loop to a linear code for
UnrollFactor = 2, not taking into account that it also serves as a safe
version of the loop if its trip-count overflows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222451 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r222142. This is causing/exposing an execution-time regression
in spec2006/gcc and coremark on AArch64/A57/Ofast.
Conflicts:
test/Transforms/Reassociate/optional-flags.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222398 91177308-0d34-0410-b5e6-96231b3b80d8
When the BasicBlock containing the return instrution has a PHI with 2
incoming values, FoldReturnIntoUncondBranch will remove the no longer
used incoming value and remove the no longer needed phi as well. This
leaves us with a BB that no longer has a PHI, but the subsequent call
to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not
remove the return instruction (which still uses the result of the call
instruction). This prevents EliminateRecursiveTailCall to remove
the value, as it is still being used in a basicblock which has no
predecessors.
The basicblock can not be erased on the spot, because its iterator is
still being used in runTRE.
This issue was exposed when removing the threshold on size for lifetime
marker insertion for named temporaries in clang. The testcase is a much
reduced version of peelOffOuterExpr(const Expr*, const ExplodedNode *)
from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222354 91177308-0d34-0410-b5e6-96231b3b80d8
AliasSetTracker::addUnknown may create an AliasSet devoid of pointers
just to contain an instruction if no suitable AliasSet already exists.
It will then AliasSet::addUnknownInst and we will be done.
However, it's possible for addUnknown to choose an existing AliasSet to
addUnknownInst.
If this were to occur, we are in a bit of a pickle: removing pointers
from the AliasSet can cause the entire AliasSet to become destroyed,
taking our unknown instructions out with them.
Instead, keep track whether or not our AliasSet has any unknown
instructions.
This fixes PR21582.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222338 91177308-0d34-0410-b5e6-96231b3b80d8
We would attempt to replace an frem's operand with the same operand.
This would cause InstCombine to think real work was done, causing
InstCombine to enter an infinite loop.
This fixes the second part of PR21576.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222265 91177308-0d34-0410-b5e6-96231b3b80d8
flags are cleared. The reassociation pass was just reordering the leaf nodes
in the previous test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222250 91177308-0d34-0410-b5e6-96231b3b80d8
EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!).
Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D6301
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222241 91177308-0d34-0410-b5e6-96231b3b80d8
It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true.
While this sort of reasoning should normally live in InstSimplify,
the machinery that derives this result is not trivial to split out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222230 91177308-0d34-0410-b5e6-96231b3b80d8
I added a pessimization in r217102 to prevent miscompiles when the
incremented induction variable was used in a comparison; it would be
poison.
Try to use the incremented induction variable more often when we can be
sure that the increment won't end in poison.
Differential Revision: http://reviews.llvm.org/D6222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222213 91177308-0d34-0410-b5e6-96231b3b80d8
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.
The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.
To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.
This fixes rdar://problem/18984639.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222168 91177308-0d34-0410-b5e6-96231b3b80d8
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:
int f1(int x) {
switch (x) {
case 0: return 10;
case 1: return 11;
case 2: return 12;
case 3: return 13;
}
return 0;
}
generates:
define i32 @f1(i32 %x) #0 {
entry:
%0 = icmp ult i32 %x, 4
br i1 %0, label %switch.lookup, label %return
switch.lookup:
%switch.offset = add i32 %x, 10
ret i32 %switch.offset
return:
ret i32 0
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222121 91177308-0d34-0410-b5e6-96231b3b80d8
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.
Original message:
Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222117 91177308-0d34-0410-b5e6-96231b3b80d8
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222061 91177308-0d34-0410-b5e6-96231b3b80d8
We would attempt to replace a fptrunc of an frem with an identical
fptrunc. This would cause the new fptrunc to be added to the worklist.
Of course, this results in an infinite loop because we will keep
visiting the newly created fptruncs.
This fixes PR21576.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222040 91177308-0d34-0410-b5e6-96231b3b80d8
doing Load PRE"
This commit updates the failing test in
Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll
The failing test is sensitive to the order in which we process loads. This
version turns on the RPO traversal instead of the while DT traversal in GVN.
The new test code is functionally same just the order of loads that are
eliminated is swapped.
This new version also fixes an issue where GVN splits a critical edge and
potentially invalidate the RPO/DT iterator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222039 91177308-0d34-0410-b5e6-96231b3b80d8
Prior to this commit fmul and fadd binary operators were being canonicalized for
both scalar and vector versions. We now canonicalize add, mul, and, or, and xor
vector instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222006 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r221924. It appears the commit was a bit premature and is causing
bot failures that need further investigation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221939 91177308-0d34-0410-b5e6-96231b3b80d8
If x is known to have the range [a, b), in a loop predicated by (icmp
ne x, a) its range can be sharpened to [a + 1, b). Get
ScalarEvolution and hence IndVars to exploit this fact.
This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.
This change was originally landed in r219834 but had a bug and broke
ASan. It was reverted in r219878, and is now being re-landed after
fixing the original bug.
phabricator: http://reviews.llvm.org/D5639
reviewed by: atrick
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221839 91177308-0d34-0410-b5e6-96231b3b80d8
Make the handling of calls to intrinsics in CGSCC consistent:
they are not treated like regular function calls because they
are never lowered to function calls.
Without this patch, we can get dangling pointer asserts from
the subsequent loop that processes callsites because it already
ignores intrinsics.
See http://llvm.org/bugs/show_bug.cgi?id=21403 for more details / discussion.
Differential Revision: http://reviews.llvm.org/D6124
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221802 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Reapply r221772. The old patch breaks the bot because the @indvar_32_bit test
was run whether NVPTX was enabled or not.
IndVarSimplify should not widen an indvar if arithmetics on the wider
indvar are more expensive than those on the narrower indvar. For
instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is
twice as expensive as that on i32, because the hardware needs to
simulate a 64-bit integer using two 32-bit integers.
Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo.
Fixes PR21148.
Test Plan:
Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics
on the wider type are more expensive. This test is run only when NVPTX is
enabled.
Reviewers: jholewinski, eliben, meheff, atrick
Reviewed By: atrick
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D6196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221799 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
IndVarSimplify should not widen an indvar if arithmetics on the wider
indvar are more expensive than those on the narrower indvar. For
instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is
twice as expensive as that on i32, because the hardware needs to
simulate a 64-bit integer using two 32-bit integers.
Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo.
Fixes PR21148.
Test Plan:
Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics
on the wider type are more expensive.
Reviewers: jholewinski, eliben, meheff, atrick
Reviewed By: atrick
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D6196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221772 91177308-0d34-0410-b5e6-96231b3b80d8
This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for
PowerPC, which provide programmer access to the lxvd2x, lxvw4x,
stxvd2x, and stxvw4x instructions.
New LLVM intrinsics are provided to represent these four instructions
in IntrinsicsPowerPC.td. These are patterned after the similar
intrinsics for lvx and stvx (Altivec). In PPCInstrVSX.td, these
intrinsics are tied to the code gen patterns, with additional patterns
to allow plain vanilla loads and stores to still generate these
instructions.
At -O1 and higher the intrinsics are immediately converted to loads
and stores in InstCombineCalls.cpp. This will open up more
optimization opportunities while still allowing the correct
instructions to be generated. (Similar code exists for aligned
Altivec loads and stores.)
The new intrinsics are added to the code that checks for consecutive
loads and stores in PPCISelLowering.cpp, as well as to
PPCTargetLowering::getTgtMemIntrinsic().
There's a new test to verify the correct instructions are generated.
The loads and stores tend to be reordered, so the test just counts
their number. It runs at -O2, as it's not very effective to test this
at -O0, when many unnecessary loads and stores are generated.
I ended up having to modify vsx-fma-m.ll. It turns out this test case
is slightly unreliable, but I don't know a good way to prevent
problems with it. The xvmaddmdp instructions read and write the same
register, which is one of the multiplicands. Commutativity allows
either to be chosen. If the FMAs are reordered differently than
expected by the test, the register assignment can be different as a
result. Hopefully this doesn't change often.
There is a companion patch for Clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221767 91177308-0d34-0410-b5e6-96231b3b80d8
We currently have two ways of informing the optimizer that the result of a load is never null: metadata and assume. This change converts the second in to the former. This avoids a need to implement optimizations using both forms.
We should probably extend this basic idea to metadata of other forms; in particular, range metadata. We view is that assumes should be considered a "last resort" for when there isn't a more canonical way to represent something.
Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D5951
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221737 91177308-0d34-0410-b5e6-96231b3b80d8
This is a reapplication of r221171, but we only perform the transformation
on expressions which include a multiplication. We do not transform rem/div
operations as this doesn't appear to be safe in all cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221721 91177308-0d34-0410-b5e6-96231b3b80d8
Switch statements may have more than one incoming edge into the same BB if they
all have the same value. When the switch statement is converted these incoming
edges are now coming from multiple BBs. Updating all incoming values to be from
a single BB is incorrect and would generate invalid LLVM IR.
The fix is to only update the first occurrence of an incoming value. Switch
lowering will perform subsequent calls to this helper function for each incoming
edge with a new basic block - updating all edges in the process.
This fixes rdar://problem/18916275.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221627 91177308-0d34-0410-b5e6-96231b3b80d8
We would attempt to fold away a call instruction which had been marked
overdefined. However, it's not valid to transition to constant from
overdefined.
This fixes PR21512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221513 91177308-0d34-0410-b5e6-96231b3b80d8
A pointer's pointee might not be sized: the pointee could be a function.
Report this as IK_NoInduction when calculating isInductionVariable.
This fixes PR21508.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221501 91177308-0d34-0410-b5e6-96231b3b80d8
Comparing the result of a cmpxchg instruction can be replaced with an
extractvalue of the cmpxchg success indicator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221498 91177308-0d34-0410-b5e6-96231b3b80d8
instructions. Inlining might cause such cases and it's not valid to
reassociate floating-point instructions without the unsafe algebra flag.
Patch by Mehdi Amini <mehdi_amini@apple.com>!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221462 91177308-0d34-0410-b5e6-96231b3b80d8
When generating gcov compatible profiling, we sometimes skip emitting
data for functions for one reason or another. However, this was
emitting different function IDs in the .gcno and .gcda files, because
the .gcno case was using the loop index before skipping functions and
the .gcda the array index after. This resulted in completely invalid
gcov data.
This fixes the problem by making the .gcno loop track the ID
separately from the loop index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221441 91177308-0d34-0410-b5e6-96231b3b80d8
Exact shifts may not shift out any non-zero bits. Use computeKnownBits
to determine when this occurs and just return the left hand side.
This fixes PR21477.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221325 91177308-0d34-0410-b5e6-96231b3b80d8
Divides and remainder operations do not behave like other operations
when they are given poison: they turn into undefined behavior.
It's really hard to know if the operands going into a div are or are not
poison. Because of this, we should only choose to speculate if there
are constant operands which we can easily reason about.
This fixes PR21412.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221318 91177308-0d34-0410-b5e6-96231b3b80d8
LoadCombine can be smarter about aborting when a writing instruction is
encountered, instead of aborting upon encountering any writing instruction, use
an AliasSetTracker, and only abort when encountering some write that might
alias with the loads that could potentially be combined.
This was originally motivated by comments made (and a test case provided) by
David Majnemer in response to PR21448. It turned out that LoadCombine was not
responsible for that PR, but LoadCombine should also be improved so that
unrelated stores (and @llvm.assume) don't interrupt load combining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221203 91177308-0d34-0410-b5e6-96231b3b80d8
FoldOpIntoPhi could create an infinite loop if the PHI could potentially
reach a BB it was considering inserting instructions into. The
instructions it would insert would eventually lead to other combines
firing which would, again, lead to FoldOpIntoPhi firing.
The solution is to handicap FoldOpIntoPhi so that it doesn't attempt to
insert instructions that the PHI might reach.
This fixes PR21377.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221187 91177308-0d34-0410-b5e6-96231b3b80d8
EarlyCSE uses a simple generation scheme for handling memory-based
dependencies, and calls to @llvm.assume (which are marked as writing to memory
to ensure the preservation of control dependencies) disturb that scheme
unnecessarily. Skipping calls to @llvm.assume is legal, and the alternative
(adding AA calls in EarlyCSE) is likely undesirable (we have GVN for that).
Fixes PR21448.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221175 91177308-0d34-0410-b5e6-96231b3b80d8
call DAGCombiner. But we ran into a case (on Windows) where the
calling convention causes argument lowering to bail out of fast-isel,
and we end up in CodeGenAndEmitDAG() which does run DAGCombiner.
So, we need to make DAGCombiner check for 'optnone' after all.
Commit includes the test that found this, plus another one that got
missed in the original optnone work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221168 91177308-0d34-0410-b5e6-96231b3b80d8
m_ZExt might bind against a ConstantExpr instead of an Instruction.
Assuming this, using cast<Instruction>, results in InstCombine crashing.
Instead, introduce ZExtOperator to bridge both Instruction and
ConstantExpr ZExts.
This fixes PR21445.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221069 91177308-0d34-0410-b5e6-96231b3b80d8
This can happen pretty often in code that looks like:
int foo = bar - 1;
if (foo < 0)
do stuff
In this case, bar < 1 is an equivalent condition.
This transform requires that the add instruction be annotated with nsw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221045 91177308-0d34-0410-b5e6-96231b3b80d8
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation bewteen
the two, since we can statically determine which is greater.
This allows us to unroll loops such as:
void testcase3(int v) {
for (int i=v; i<=v+1; ++i)
f(i);
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220960 91177308-0d34-0410-b5e6-96231b3b80d8
If we load from a location with range metadata, we can use information about the ranges of the loaded value for optimization purposes. This helps to remove redundant checks and canonicalize checks for other optimization passes. This particular patch checks whether a value is known to be non-zero from the range metadata.
Currently, these tests are against InstCombine. In theory, all of these should be InstSimplify since we're not inserting any new instructions. Moving the code may follow in a separate change.
Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D5947
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220925 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch finishes up support for handling sampling profiles in both
text and binary formats. The new binary format uses uleb128 encoding to
represent numeric values. This makes profiles files about 25% smaller.
The profile writer class can write profiles in the existing text and the
new binary format. In subsequent patches, I will add the capability to
read (and perhaps write) profiles in the gcov format used by GCC.
Additionally, I will be adding support in llvm-profdata to manipulate
sampling profiles.
There was a bit of refactoring needed to separate some code that was in
the reader files, but is actually common to both the reader and writer.
The new test checks that reading the same profile encoded as text or
raw, produces the same results.
Reviewers: bogner, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6000
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220915 91177308-0d34-0410-b5e6-96231b3b80d8
Remove pointless checks for storage of uninteresting values. Ensure that we
perform basic alias analysis to make the test more correct. Finally, apply a
stylistic change to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220839 91177308-0d34-0410-b5e6-96231b3b80d8
This restores the commit from SVN r219899 with an additional change to ensure
that the CodeGen is correct for the case that was identified as being incorrect
(originally PR7272).
In the case that during inlining we need to synthesize a value on the stack
(i.e. for passing a value byval), then any function involving that alloca must
be stripped of its tailness as the restriction that it does not access the
parent's stack no longer holds. Unfortunately, a single alloca can cause a
rippling effect through out the inlining as the value may be aliased or may be
mutated through an escaped external call. As such, we simply track if an alloca
has been introduced in the frame during inlining, and strip any tail calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220811 91177308-0d34-0410-b5e6-96231b3b80d8
An icmp may have pointer arguments, it isn't limited to integers or
vectors of integers.
This fixes PR21388.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220664 91177308-0d34-0410-b5e6-96231b3b80d8
The dividend in "signed % unsigned" is treated as unsigned instead of signed,
causing unexpected behavior such as -64 % (uint64_t)24 == 0.
Added a regression test in split-gep.ll
Patched by Hao Liu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220618 91177308-0d34-0410-b5e6-96231b3b80d8
The two operands of the new OR expression should be NextInChain and TheOther
instead of the two original operands.
Added a regression test in split-gep.ll.
Hao Liu reported this bug, and provded the test case and an initial patch.
Thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220615 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes a chunk of special case logic for folding
(float)sqrt((double)x) -> sqrtf(x)
in InstCombineCasts and handles it in the mainstream path of SimplifyLibCalls.
No functional change intended, but I loosened the restriction on the existing
sqrt testcases to allow for this optimization even without unsafe-fp-math because
that's the existing behavior.
I also added a missing test case for not shrinking the llvm.sqrt.f64 intrinsic
in case the result is used as a double.
Differential Revision: http://reviews.llvm.org/D5919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220514 91177308-0d34-0410-b5e6-96231b3b80d8
Jenkins likes to use directories with names involving the '@'
character, which breaks the sed expression in this test. Switch to use
'|' on the assumption that it's less likely to show up in a path.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220401 91177308-0d34-0410-b5e6-96231b3b80d8
When a call to a double-precision libm function has fast-math semantics
(via function attribute for now because there is no IR-level FMF on calls),
we can avoid fpext/fptrunc operations and use the float version of the call
if the input and output are both float.
We already do this optimization using a command-line option; this patch just
adds the ability for fast-math to use the existing functionality.
I moved the cl::opt from InstructionCombining into SimplifyLibCalls because
it's only ever used internally to that class.
Modified the existing test cases to use the unsafe-fp-math attribute rather
than repeating all tests.
This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850
Differential Revision: http://reviews.llvm.org/D5893
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220390 91177308-0d34-0410-b5e6-96231b3b80d8
When the profile for a function cannot be applied, we use to emit an
error. This seems extreme. The compiler can continue, it's just that the
optimization opportunities won't include profile information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220386 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When using a profile, we used to require the use -gmlt so that we could
get access to the line locations. This is used to match line numbers in
the input profile to the line numbers in the function's IR.
But this is actually not necessary. The driver can provide source
location tracking without the emission of debug information. In these
cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the
actual line location annotations are still present.
This patch adds a new way of looking for the start of the current
function. Instead of looking through the compile units in llvm.dbg.cu,
we can walk up the scope for the first instruction in the function with
a debug loc. If that describes the function, we use it. Otherwise, we
keep looking until we find one.
If no such instruction is found, we then give up and produce an error.
Reviewers: echristo, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5887
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220382 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantFolding crashes when trying to InstSimplify the following load:
@a = private unnamed_addr constant %mst {
i8* inttoptr (i64 -1 to i8*),
i8* inttoptr (i64 -1 to i8*)
}, align 8
%x = load <2 x i8*>* bitcast (%mst* @a to <2 x i8*>*), align 8
This patch fix this by adding support to this type of folding:
%x = load <2 x i8*>* bitcast (%mst* @a to <2 x i8*>*), align 8
==> gets folded to:
%x = <2 x i8*> <i8* inttoptr (i64 -1 to i8*), i8* inttoptr (i64 -1 to i8*)>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220380 91177308-0d34-0410-b5e6-96231b3b80d8
These are named following the IEEE-754 names for these
functions, rather than the libm fmin / fmax to avoid
possible ambiguities. Some languages may implement something
resembling fmin / fmax which return NaN if either operand is
to propagate errors. These implement the IEEE-754 semantics
of returning the other operand if either is a NaN representing
missing data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220341 91177308-0d34-0410-b5e6-96231b3b80d8
This function was complicated by the fact that it tried to perform
canonicalizations that were already preformed by InstSimplify. Remove
this extra code and move the tests over to InstSimplify. Add asserts to
make sure our preconditions hold before we make any assumptions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220314 91177308-0d34-0410-b5e6-96231b3b80d8
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.
To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.
These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.
I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220277 91177308-0d34-0410-b5e6-96231b3b80d8
The newly introduced 'nonnull' metadata is analogous to existing 'nonnull' attributes, but applies to load instructions rather than call arguments or returns. Long term, it would be nice to combine these into a single construct. The value of the load is allowed to vary between successive loads, but null is not a valid value to be loaded by any load marked nonnull.
Reviewed by: Hal Finkel
Differential Revision: http://reviews.llvm.org/D5220
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220240 91177308-0d34-0410-b5e6-96231b3b80d8
The original code had an implicit assumption that if the test for
allocas or globals was reached, the two pointers were not equal. With my
changes to make the pointer analysis more powerful here, I also had to
guard against circumstances where the results weren't useful. That in
turn violated the assumption and gave rise to a circumstance in which we
could have a store with both the queried pointer and stored pointer
rooted at *the same* alloca. Clearly, we cannot ignore such a store.
There are other things we might do in this code to better handle the
case of both pointers ending up at the same alloca or global, but it
seems best to at least make the test explicit in what it intends to
check.
I've added tests for both the alloca and global case here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220190 91177308-0d34-0410-b5e6-96231b3b80d8
r220178. First, the creation routine doesn't insert prior to the
terminator of the basic block provided, but really at the end of the
basic block. Instead, get the terminator and insert before that. The
next issue was that we need to ensure multiple PHI node entries for
a single predecessor re-use the same cast instruction rather than
creating new ones.
All of the logic here was without tests previously. I've reduced and
added a test case from the test suite that crashed without both of these
fixes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220186 91177308-0d34-0410-b5e6-96231b3b80d8
logic to look through pointer casts, making them trivially stronger in
the face of loads and stores with intervening pointer casts.
I've included a few test cases that demonstrate the kind of folding
instcombine can do without pointer casts and then variations which
obfuscate the logic through bitcasts. Without this patch, the variations
all fail to optimize fully.
This is more important now than it has been in the past as I've started
moving the load canonicialization to more closely follow the value type
requirements rather than the pointer type requirements and thus this
needs to be prepared for more pointer casts. When I made the same change
to stores several test cases regressed without logic along these lines
so I wanted to systematically improve matters first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220178 91177308-0d34-0410-b5e6-96231b3b80d8
of InstCombine rather than just the bits enabled when datalayout is
optional.
The primary fixes here are because now things are little endian.
In good news, silliness like this seems like it will be going away as
we've got pretty stong consensus on dropping optional datalayout
entirely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220176 91177308-0d34-0410-b5e6-96231b3b80d8
loads.
This handles many more cases than just the AA metadata, some of them
suggested by Hal in his review of the AA metadata handling patch. I've
tried to test this behavior where tractable to do so.
I'll point out that I have specifically *not* included a test for
debuginfo because it was going to require 2 or 3 times as much work to
craft some input which would survive the "helpful" stripping of debug
info metadata that doesn't match the desired schema. This is another
good example of why the current state of write-ability for our debug
info metadata is unacceptable. I spent over 30 minutes trying to conjure
some test case that would survive, even copying from other debug info
tests, but it always failed to survive with no explanation of why or how
I might fix it. =[
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220165 91177308-0d34-0410-b5e6-96231b3b80d8
up to where it actually works as intended. The problem is that
a GlobalAlias isa GlobalValue and so the prior block handled all of the
cases.
This allows us to constant fold based on the actual constant expression
in the global alias. As an example, see the last function in the newly
added test case which explicitly aligns an unaligned pointer using
constant expression math. Without this change, we fail to see that and
fold an alignment test to zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220164 91177308-0d34-0410-b5e6-96231b3b80d8
The following implements the optimization for sequences of the form:
icmp eq/ne (shl Const2, A), Const1
Such sequences can be transformed to:
icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2))
This handles only the equality operators for now. Other operators need
to be handled.
Patch by Ankur Garg!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220162 91177308-0d34-0410-b5e6-96231b3b80d8
by my refactoring of this code.
The method isSafeToLoadUnconditionally assumes that the load will
proceed with the preferred type alignment. Given that, it has to ensure
that the alloca or global is at least that aligned. It has always done
this historically when a datalayout is present, but has never checked it
when the datalayout is absent. When I refactored the code in r220156,
I exposed this path when datalayout was present and that turned the
latent bug into a patent bug.
This fixes the issue by just removing the special case which allows
folding things without datalayout. This isn't worth the complexity of
trying to tease apart when it is or isn't safe without actually knowing
the preferred alignment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220161 91177308-0d34-0410-b5e6-96231b3b80d8
...)) and (load (cast ...)): canonicalize toward the former.
Historically, we've tried to load using the type of the *pointer*, and
tried to match that type as closely as possible removing as many pointer
casts as we could and trading them for bitcasts of the loaded value.
This is deeply and fundamentally wrong.
Repeat after me: memory does not have a type! This was a hard lesson for
me to learn working on SROA.
There is only one thing that should actually drive the type used for
a pointer, and that is the type which we need to use to load from that
pointer. Matching up pointer types to the loaded value types is very
useful because it minimizes the physical size of the IR required for
no-op casts. Similarly, the only thing that should drive the type used
for a loaded value is *how that value is used*! Again, this minimizes
casts. And in fact, the *only* thing motivating types in any part of
LLVM's IR are the types used by the operations in the IR. We should
match them as closely as possible.
I've ended up removing some tests here as they were testing bugs or
behavior that is no longer present. Mostly though, this is just cleanup
to let the tests continue to function as intended.
The only fallout I've found so far from this change was SROA and I have
fixed it to not be impeded by the different type of load. If you find
more places where this change causes optimizations not to fire, those
too are likely bugs where we are assuming that the type of pointers is
"significant" for optimization purposes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220138 91177308-0d34-0410-b5e6-96231b3b80d8
This test is pretty awesome. It is claiming to test devirtualization.
However, the code in question is not in fact devirtualized by LLVM. If
you take the original C++ test case and run it through Clang at -O3 we
fail to devirtualize it completely. It also isn't a sufficiently focused
test case.
The *reason* we fail to devirtualize it isn't because of any missing
instcombine though. Instead, it is because we fail to emit an available
externally vtable and thus the vtable is just an external and completely
opaque. If I cause the vtable to be emitted, we successfully
devirtualize things.
Anyways, I'm just removing it because it is providing negative value at
this point: it isn't representative of the output of Clang really, LLVM
isn't doing the transform it claims to be testing, LLVM's failure to do
the transform isn't actually an LLVM bug at all and we shouldn't be
testing for it here, and finally the test is written in such a way that
it will trivially pass even when the point of the test is failing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220137 91177308-0d34-0410-b5e6-96231b3b80d8
cases where the alloca type, the load types, and the store types used
all disagree.
Previously, the only way that vector-based promotion occured was if the
alloca type was a vector type. This was one of the *very* few remaining
uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns
out it was a bad idea.
The alloca type can change very easily based on the mixture of types
loaded and stored to that alloca. We shouldn't be relying on it as
a signal for very much. Instead, the source of truth should be loads and
stores. We should canonicalize the loads and stores as much as possible
and then rely on them exclusively in SROA.
When looking and loads and stores, we may find many different candidate
vector types. This change will let SROA try all of them to find a vector
type which is a viable way to promote the entire alloca to a vector
register.
With this change, it becomes possible to do better canonicalization and
optimization of loads and stores without breaking SROA in random ways,
and that should allow fixing a core source of performance loss in hot
numerical loops such as those in Eigen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220116 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r219899.
This also updates byval-tail-call.ll to make it clear what was breaking.
Adding r219899 again will cause the load/store to disappear.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220093 91177308-0d34-0410-b5e6-96231b3b80d8
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.
Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220035 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently, call slot optimization requires that if the destination is an
argument, the argument has the sret attribute. This is to ensure that
the memory access won't trap. In addition to sret, we can also allow the
optimization to happen for arguments that have the new dereferenceable
attribute, which gives the same guarantee.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5832
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219950 91177308-0d34-0410-b5e6-96231b3b80d8
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().
In the simplest case, this:
y = sqrt(x * x);
becomes this:
y = fabs(x);
This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.
Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.
Differential Revision: http://reviews.llvm.org/D5787
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8
The code committed in r219832 asserted when it attempted to shrink a switch
statement whose type was larger than 64-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219902 91177308-0d34-0410-b5e6-96231b3b80d8
Make tail recursion elimination a bit more aggressive. This allows us to get
tail recursion on functions that are just branches to a different function. The
fact that the function takes a byval argument does not restrict it from being
optimised into just a tail call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219899 91177308-0d34-0410-b5e6-96231b3b80d8
For pointer-typed function arguments, enhanced alignment can be asserted using
the 'align' attribute. When inlining, if this enhanced alignment information is
not otherwise available, preserve it using @llvm.assume-based alignment
assumptions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219876 91177308-0d34-0410-b5e6-96231b3b80d8
If x is known to have the range [a, b) in a loop predicated by (icmp
ne x, a), its range can be sharpened to [a + 1, b). Get
ScalarEvolution and hence IndVars to exploit this fact.
This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.
phabricator: http://reviews.llvm.org/D5639
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219834 91177308-0d34-0410-b5e6-96231b3b80d8
Truncate the operands of a switch instruction to a narrower type if the upper
bits are known to be all ones or zeros.
rdar://problem/17720004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219832 91177308-0d34-0410-b5e6-96231b3b80d8
The SLP vectorizer should not vectorize ephemeral values. These are used to
express information to the optimizer, and vectorizing them does not lead to
faster code (because the ephemeral values are dropped prior to code generation,
vectorized or not), and obscures the information the instructions are
attempting to communicate (the logic that interprets the arguments to
@llvm.assume generically does not understand vectorized conditions).
Also, uses by ephemeral values are free (because they, and the necessary
extractelement instructions, will be dropped prior to code generation).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219816 91177308-0d34-0410-b5e6-96231b3b80d8
A few minor changes to prevent @llvm.assume from interfering with loop
vectorization. First, treat @llvm.assume like the lifetime intrinsics, which
are scalarized (but don't otherwise interfere with the legality checking).
Second, ignore the cost of ephemeral instructions in the loop (these will go
away anyway during CodeGen).
Alignment assumptions and other uses of @llvm.assume can often end up inside of
loops that should be vectorized (this is not uncommon for assumptions generated
by __attribute__((align_value(n))), for example).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219741 91177308-0d34-0410-b5e6-96231b3b80d8
Eliminate library calls and intrinsic calls to fabs when the input
is a squared value.
Note that no unsafe-math / fast-math assumptions are needed for
this optimization.
Differential Revision: http://reviews.llvm.org/D5777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219717 91177308-0d34-0410-b5e6-96231b3b80d8
We assumed that A must be greater than B because the right hand side of
a remainder operator must be nonzero.
However, it is possible for A to be less than B if Pow2 is a power of
two greater than 1.
Take for example:
i32 %A = 0
i32 %B = 31
i32 Pow2 = 2147483648
((Pow2 << 0) >>u 31) is non-zero but A is less than B.
This fixes PR21274.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219713 91177308-0d34-0410-b5e6-96231b3b80d8
Reapply r216913, a fix for PR20832 by Andrea Di Biagio. The commit was reverted
because of buildbot failures, and credit goes to Ulrich Weigand for isolating
the underlying issue (which can be confirmed by Valgrind, which does helpfully
light up like the fourth of July). Uli explained the problem with the original
patch as:
It seems the problem is calling multiplySignificand with an addend of category
fcZero; that is not expected by this routine. Note that for fcZero, the
significand parts are simply uninitialized, but the code in (or rather, called
from) multiplySignificand will unconditionally access them -- in effect using
uninitialized contents.
This version avoids using a category == fcZero addend within
multiplySignificand, which avoids this problem (the Valgrind output is also now
clean).
Original commit message:
[APFloat] Fixed a bug in method 'fusedMultiplyAdd'.
When folding a fused multiply-add builtin call, make sure that we propagate the
correct result in the case where the addend is zero, and the two other operands
are finite non-zero.
Example:
define double @test() {
%1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0)
ret double %1
}
Before this patch, the instruction simplifier wrongly folded the builtin call
in function @test to constant 'double 7.0'.
With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and
propagates the expected result (i.e. 56.0).
Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra
test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence
of NaN/Inf operands.
This fixes PR20832.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219708 91177308-0d34-0410-b5e6-96231b3b80d8
When LazyValueInfo uses @llvm.assume intrinsics to provide edge-value
constraints, we should check for intrinsics that dominate the edge's branch,
not just any potential context instructions. An assumption that dominates the
edge's branch represents a truth on that edge. This is specifically useful, for
example, if multiple predecessors assume a pointer to be nonnull, allowing us
to simplify a later null comparison.
The test case, and an initial patch, were provided by Philip Reames. Thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219688 91177308-0d34-0410-b5e6-96231b3b80d8
This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block
and a test to check that this condition is handled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219656 91177308-0d34-0410-b5e6-96231b3b80d8
We assumed that negation operations of the form (0 - %Z) resulted in a
negative number. This isn't true if %Z was originally negative.
Substituting the negative number into the remainder operation may result
in undefined behavior because the dividend might be INT_MIN.
This fixes PR21256.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219639 91177308-0d34-0410-b5e6-96231b3b80d8
We have a transform that changes:
(x lshr C1) udiv C2
into:
x udiv (C2 << C1)
However, it is unsafe to do so if C2 << C1 discards any of C2's bits.
This fixes PR21255.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219634 91177308-0d34-0410-b5e6-96231b3b80d8
Consider the case where X is 2. (2 <<s 31)/s-2147483648 is zero but we
would fold to X. Note that this is valid when we are in the unsigned
domain because we require NUW: 2 <<u 31 results in poison.
This fixes PR21245.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219568 91177308-0d34-0410-b5e6-96231b3b80d8
consider:
C1 = INT_MIN
C2 = -1
C1 * C2 overflows without a doubt but consider the following:
%x = i32 INT_MIN
This means that (%X /s C1) is 1 and (%X /s C1) /s C2 is -1.
N. B. Move the unsigned version of this transform to InstSimplify, it
doesn't create any new instructions.
This fixes PR21243.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219567 91177308-0d34-0410-b5e6-96231b3b80d8
consider:
mul i32 nsw %x, -2147483648
this instruction will not result in poison if %x is 1
however, if we transform this into:
shl i32 nsw %x, 31
then we will be generating poison because we just shifted into the sign
bit.
This fixes PR21242.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219566 91177308-0d34-0410-b5e6-96231b3b80d8
The LLVM Lang Ref states for signed/unsigned int to float conversions:
"If the value cannot fit in the floating point value, the results are undefined."
And for FP to signed/unsigned int:
"If the value cannot fit in ty2, the results are undefined."
This matches the C definitions.
The existing behavior pins to infinity or a max int value, but that may just
lead to more confusion as seen in:
http://llvm.org/bugs/show_bug.cgi?id=21130
Returning undef will hopefully lead to a less silent failure.
Differential Revision: http://reviews.llvm.org/D5603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219542 91177308-0d34-0410-b5e6-96231b3b80d8
It also makes it more aggressive in querying range information by
adding a call to isKnownPredicateWithRanges to
isLoopBackedgeGuardedByCond and isLoopEntryGuardedByCond.
phabricator: http://reviews.llvm.org/D5638
Reviewed by: atrick, hfinkel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219532 91177308-0d34-0410-b5e6-96231b3b80d8
ScalarEvolution in the presence of multiple exits. Previously all
loops exits had to have identical counts for a loop trip count to be
considered computable. This pessimization was implemented by calling
getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock)
inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME
in the comments of that function). The pessimization was added to fix
a corner case involving undefined behavior (pr/16130). This patch more
precisely handles the undefined behavior case allowing the pessimization
to be removed.
ControlsExit replaces IsSubExpr to more precisely track the case where
undefined behavior is expected to occur. Because undefined behavior is
tracked more precisely we can remove MustExit from ExitLimit. MustExit
was used to track the case where the limit was computed potentially
assuming undefined behavior even if undefined behavior didn't necessarily
occur.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219517 91177308-0d34-0410-b5e6-96231b3b80d8
instead
We used to transform this:
define void @test6(i1 %cond, i8* %ptr) {
entry:
br i1 %cond, label %bb1, label %bb2
bb1:
br label %bb2
bb2:
%ptr.2 = phi i8* [ %ptr, %entry ], [ null, %bb1 ]
store i8 2, i8* %ptr.2, align 8
ret void
}
into this:
define void @test6(i1 %cond, i8* %ptr) {
%ptr.2 = select i1 %cond, i8* null, i8* %ptr
store i8 2, i8* %ptr.2, align 8
ret void
}
because the simplifycfg transformation into selects would happen to happen
before the simplifycfg transformation that removes unreachable control flow
(We have 'unreachable control flow' due to the store to null which is undefined
behavior).
The existing transformation that removes unreachable control flow in simplifycfg
is:
/// If BB has an incoming value that will always trigger undefined behavior
/// (eg. null pointer dereference), remove the branch leading here.
static bool removeUndefIntroducingPredecessor(BasicBlock *BB)
Now we generate:
define void @test6(i1 %cond, i8* %ptr) {
store i8 2, i8* %ptr.2, align 8
ret void
}
I did not see any impact on the test-suite + externals.
rdar://18596215
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219462 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes a bug in method InstCombiner::FoldCmpCstShrCst where we
wrongly computed the distance between the highest bits set of two negative
values.
This fixes PR21222.
Differential Revision: http://reviews.llvm.org/D5700
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219406 91177308-0d34-0410-b5e6-96231b3b80d8
A function with discardable linkage cannot be discarded if its a member
of a COMDAT group without considering all the other COMDAT members as
well. This sort of thing is already handled by GlobalOpt/GlobalDCE.
This fixes PR21206.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219335 91177308-0d34-0410-b5e6-96231b3b80d8
A linkonce_odr member of a COMDAT shouldn't be dropped if we need to
keep the entire COMDAT group.
This fixes PR21191.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219283 91177308-0d34-0410-b5e6-96231b3b80d8
The icmp-select-icmp optimization targets select-icmp.eq
only. This is now ensured by testing the branch predicate
explictly. This commit also includes the test case for pr21199.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219282 91177308-0d34-0410-b5e6-96231b3b80d8
`LoopUnrollPass` says that it preserves `LoopInfo` -- make it so. In
particular, tell `LoopInfo` about copies of inner loops when unrolling
the outer loop.
Conservatively, also tell `ScalarEvolution` to forget about the original
versions of these loops, since their inputs may have changed.
Fixes PR20987.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219241 91177308-0d34-0410-b5e6-96231b3b80d8