promoting allocas to preferred alignments that exceed the natural
alignment. This avoids some potentially expensive dynamic stack realignments.
The natural stack alignment is set in target data strings via the "S<size>"
option. Size is in bits and must be a multiple of 8. The natural stack alignment
defaults to "unspecified" (represented by a zero value), and the "unspecified"
value does not prevent any alignment promotions. Target maintainers that care
about avoiding promotions should explicitly add the "S<size>" option to their
target data strings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141599 91177308-0d34-0410-b5e6-96231b3b80d8
switch (n) {
case 27:
do_something(x);
...
}
the call do_something(x) will be replaced with do_something(27). In
gcc-as-one-big-file this results in the removal of about 500 lines of
bitcode (about 0.02%), so has about 1/10 of the effect of propagating
branch conditions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141360 91177308-0d34-0410-b5e6-96231b3b80d8
While I'm here, fix the related issue with strncmp, add some actual tests for strcmp and strncmp, and start using StringRef::compare for constant folding instead of using strcmp/strncmp so that the optimized IR isn't dependent on the host's implementation of strcmp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141227 91177308-0d34-0410-b5e6-96231b3b80d8
Just pull the instruction name, but don't change the order of anything
else. That keeps --debug happy and non-crashing, but doesn't change
how the worklist gets built.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141210 91177308-0d34-0410-b5e6-96231b3b80d8
When updating the worklist for InstCombine, the Add/AddUsersToWorklist
functions may access the instruction(s) being added, for debug output for
example. If the instructions aren't yet added to the basic block, this
can result in a crash. Finish the instruction transformation before
adjusting the worklist instead.
rdar://10238555
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141203 91177308-0d34-0410-b5e6-96231b3b80d8
branch "br i1 %x, label %if_true, label %if_false" then it replaces
"%x" with "true" in places only reachable via the %if_true arm, and
with "false" in places only reachable via the %if_false arm. Except
that actually it doesn't: if value numbering shows that %y is equal
to %x then, yes, %y will be turned into true/false in this way, but
any occurrences of %x itself are not transformed. Fix this. What's
more, it's often the case that %x is an equality comparison such as
"%x = icmp eq %A, 0", in which case every occurrence of %A that is
only reachable via the %if_true arm can be replaced with 0. Implement
this and a few other variations on this theme. This reduces the number
of lines of LLVM IR in "GCC as one big file" by 0.2%. It has a bigger
impact on Ada code, typically reducing the number of lines of bitcode
by around 0.4% by removing repeated compiler generated checks. Passes
the LLVM nightly testsuite and the Ada ACATS testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141177 91177308-0d34-0410-b5e6-96231b3b80d8
it's OK for the false/true destination to have multiple
predecessors as long as the extra ones are dominated by
the branch destination.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141176 91177308-0d34-0410-b5e6-96231b3b80d8
This handles the case in which LSR rewrites an IV user that is a phi and
splits critical edges originating from a switch.
Fixes <rdar://problem/6453893> LSR is not splitting edges "nicely"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141059 91177308-0d34-0410-b5e6-96231b3b80d8
We want heuristics to be based on accurate data, but more importantly
we don't want llvm to behave randomly. A benign trunc inserted by an
upstream pass should not cause a wild swings in optimization
level. See PR11034. It's a general problem with threshold-based
heuristics, but we can make it less bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140919 91177308-0d34-0410-b5e6-96231b3b80d8
catch or repeated filter clauses. Teach instcombine a bunch
of tricks for simplifying landingpad clauses. Currently the
code only recognizes the GNU C++ and Ada personality functions,
but that doesn't stop it doing a bunch of "generic" transforms
which are hopefully fine for any real-world personality function.
If these "generic" transforms turn out not to be generic, they
can always be conditioned on the personality function. Probably
someone should add the ObjC++ personality function. I didn't as
I don't know anything about it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140852 91177308-0d34-0410-b5e6-96231b3b80d8
objc_retainBlock call is potentially responsible for copying
the block to the heap to extend its lifetime. rdar://10209613.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140814 91177308-0d34-0410-b5e6-96231b3b80d8
Rewriting the entire loop nest now requires -enable-lsr-nested.
See PR11035 for some performance data.
A few unit tests specifically test nested LSR, and are now under a flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140762 91177308-0d34-0410-b5e6-96231b3b80d8
Disabling aggressive LSR saves compilation time, and with the new
indvars behavior usually improves performance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140590 91177308-0d34-0410-b5e6-96231b3b80d8
The minor bug heuristic was noticed by inspection. I added the
isLoser/isValid helpers because they will become more
important with subsequent checkins.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140580 91177308-0d34-0410-b5e6-96231b3b80d8
No test case. Noticed by inspection and I doubt it ever affects the
outcome of the overall heuristic, let alone final codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140431 91177308-0d34-0410-b5e6-96231b3b80d8
The landing pad must accompany the invoke when it's extracted. However, if it
does, then the loop isn't properly extracted. I.e., the resulting extraction has
a loop in it. The extracted function is then extracted, etc. resulting in an
infinite loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140193 91177308-0d34-0410-b5e6-96231b3b80d8
extract its associated landing pad block as well. However, that landing pad
block may have more than one predecessor. So split the landing pad block so that
individual landing pads have only one predecessor.
This type of transformation may produce a false positive with bugpoint.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140173 91177308-0d34-0410-b5e6-96231b3b80d8
extract the landing pad block. Otherwise, there will be a situation where the
invoke's unwind edge lands on a non-landing pad.
We also forbid the user from extracting the landing pad block by itself. Again,
this is not a valid transformation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140083 91177308-0d34-0410-b5e6-96231b3b80d8
No tests; these changes aren't really interesting in the sense that the logic is the same for volatile and atomic.
I believe this completes all of the changes necessary for the optimizer to handle loads and stores correctly. I'm going to try and come up with some additional testing, though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139533 91177308-0d34-0410-b5e6-96231b3b80d8
better.
Don't immediately give up when an add operation can't be trivially
sign/zero-extended within a loop. If it has NSW/NUW flags, generate a
new expression with sign extended (non-recurrent) operand. As before,
if SCEV says that all sign extends are loop invariant, then we can
widen the operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139453 91177308-0d34-0410-b5e6-96231b3b80d8
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139140 91177308-0d34-0410-b5e6-96231b3b80d8
This changes loop unrolling to use the same mechanism for trip count
computation as indvars. This is a stronger check that tends to unroll
more loops. A very common side-effect is that many single iteration
loops will be removed sooner. The real goal was simply to remove
dependence on canonical IVs.
x86 is break even.
ARM performance changes to expect (+ is good):
External/SPEC/CFP2000/183.equake/183.equake +13%
SingleSource/Benchmarks/Dhrystone/fldry +21%
MultiSource/Applications/spiff/spiff +3%
SingleSource/Benchmarks/Stanford/Puzzle -14%
The Puzzle regression is actually an improvement in loop optimization
that defeats GVN: rdar://problem/10065079.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139009 91177308-0d34-0410-b5e6-96231b3b80d8
The landingpad instruction is required in the landing pad block. Because we're
not deleting terminating instructions, the invoke may still jump to here (see
Transforms/SCCP/2004-11-16-DeadInvoke.ll). Remove all uses of the landingpad
instruction, but keep it around until code-gen can remove the basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138890 91177308-0d34-0410-b5e6-96231b3b80d8
ssa, so it has to be run really early in the pipeline. Any replacement
should probably use the SSAUpdater.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138841 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize chained bitcasts of the form A->B->A.
Undo r138722 and change isEliminableCastPair to allow this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138756 91177308-0d34-0410-b5e6-96231b3b80d8
In theory this could be extended to other instructions, eg. division by zero, but it's likely that it will "miscompile" some code because people depend on div by zero not trapping. NULL pointer dereference usually leads to a crash so we should be on the safe side.
This shrinks the size of a Release clang by 16k on x86_64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138618 91177308-0d34-0410-b5e6-96231b3b80d8