was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.
The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.
This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145183 91177308-0d34-0410-b5e6-96231b3b80d8
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.
The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145179 91177308-0d34-0410-b5e6-96231b3b80d8
trampoline forms. Both of these were correct in LLVM 3.0, and we don't
need to support LLVM 2.9 and earlier in mainline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145174 91177308-0d34-0410-b5e6-96231b3b80d8
I think this is the last of autoupgrade that can be removed in 3.1.
Can the atomic upgrade stuff also go?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145169 91177308-0d34-0410-b5e6-96231b3b80d8
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.
This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145158 91177308-0d34-0410-b5e6-96231b3b80d8
was returning incorrect values in rare cases, and incorrectly marking
exact conversions as inexact in some more common cases. Fixes PR11406, and a
missed optimization in test/CodeGen/X86/fp-stack-O0.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145141 91177308-0d34-0410-b5e6-96231b3b80d8
I don't see how the project is using LLVM and we really can't list every user of
the clang analyzer. Sorry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145137 91177308-0d34-0410-b5e6-96231b3b80d8
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.
Patch by Jan Sjodin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145133 91177308-0d34-0410-b5e6-96231b3b80d8
need lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.
Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145120 91177308-0d34-0410-b5e6-96231b3b80d8
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.
I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145119 91177308-0d34-0410-b5e6-96231b3b80d8
- lower unaligned loads/stores.
- encode the size operand of instructions INS and EXT.
- emit relocation information needed for JAL (jump-and-link).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145113 91177308-0d34-0410-b5e6-96231b3b80d8