version uses a new algorithm for evaluating the binomial coefficients
which is significantly more efficient for AddRecs of more than 2 terms
(see the comments in the code for details on how the algorithm works).
It also fixes some bugs: it removes the arbitrary length restriction for
AddRecs, it fixes the silent generation of incorrect code for AddRecs
which require a wide calculation width, and it fixes an issue where we
were incorrectly truncating the iteration count too far when evaluating
an AddRec expression narrower than the induction variable.
There are still a few related issues I know of: I think there's
still an issue with the SCEVExpander expansion of AddRec in terms of
the width of the induction variable used. The hack to avoid generating
too-wide integers shouldn't be necessary; instead, the callers should be
considering the cost of the expansion before expanding it (in addition
to not expanding too-wide integers, we might not want to expand
expressions that are really expensive, especially when optimizing for
size; calculating an length-17 32-bit AddRec currently generates about 250
instructions of straight-line code on X86). Also, for long 32-bit
AddRecs on X86, CodeGen really sucks at scheduling the code. I'm planning on
filing follow-up PRs for these issues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54332 91177308-0d34-0410-b5e6-96231b3b80d8
subreg form on x86-64, to avoid the problem with x86-32
having GPRs that don't have 8-bit subregs.
Also, change several 16-bit instructions to use
equivalent 32-bit instructions. These have a smaller
encoding and avoid partial-register updates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54223 91177308-0d34-0410-b5e6-96231b3b80d8
to different address spaces. This alters the naming scheme for those
intrinsics, e.g., atomic.load.add.i32 => atomic.load.add.i32.p0i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54195 91177308-0d34-0410-b5e6-96231b3b80d8
time applying to the implicit comparison in smin expressions. The
correct way to transform an inequality into the opposite
inequality, either signed or unsigned, is with a not expression.
I looked through the SCEV code, and I don't think there are any more
occurrences of this issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54194 91177308-0d34-0410-b5e6-96231b3b80d8
SGT exit condition. Essentially, the correct way to flip an inequality
in 2's complement is the not operator, not the negation operator.
That said, the difference only affects cases involving INT_MIN.
Also, enhance the pre-test search logic to be a bit smarter about
inequalities flipped with a not operator, so it can eliminate the smax
from the iteration count for simple loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54184 91177308-0d34-0410-b5e6-96231b3b80d8
to be marked invalid regardless of whether it is
a debug, an exception handling or (hopefully) a
GC label.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54172 91177308-0d34-0410-b5e6-96231b3b80d8
partially unroll a loop when fully unrolling would not fit under the threshold.
Patch by Mikael Lepistö.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54160 91177308-0d34-0410-b5e6-96231b3b80d8
that says "unconditional loads from this argument are safe", we now keep track
of the safety per set of indices from which loads happen. This prevents
ArgPromotion from promoting loads that aren't really valid. As an added effect,
this will now disregard the the type of the indices passed to a GEP, so
"load GEP %A, i32 1" and "load GEP %A, i64 1" will result in a single argument,
not two.
This fixes PR2598, for which a testcase has been added as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54159 91177308-0d34-0410-b5e6-96231b3b80d8
which is represented in codegen as an 'and' operation. This matches them
with movz instructions, instead of leaving them to be matched by and
instructions with an immediate field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54147 91177308-0d34-0410-b5e6-96231b3b80d8
because opt exited while llvm-as was still
writing to the pipe, causing it to get a
SIGPIPE. It seems best to change things to
avoid the race altogether.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54138 91177308-0d34-0410-b5e6-96231b3b80d8
mmx needs its own fancy shuffle logic based on unpack; for now we get correct but awful code.
Also commit Mon Ping's VSETCC patch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54039 91177308-0d34-0410-b5e6-96231b3b80d8
and knowledge of PseudoSourceValues. This unfortunately isn't sufficient to allow
constants to be rematerialized in PIC mode -- the extra indirection is a
complication.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54000 91177308-0d34-0410-b5e6-96231b3b80d8
command-line option, and disable it by default. It introduced performance
regressions because CodeGen is currently not able to remat such loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53997 91177308-0d34-0410-b5e6-96231b3b80d8
case for this.
This allows instructions like loads from global variables declared to
be constant to be moved out of loops."
Patch by Stefanus Du Toit!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53945 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the GetResultInst instruction. It is still accepted in LLVM assembly
and bitcode, where it is now auto-upgraded to ExtractValueInst. Also, remove
support for return instructions with multiple values. These are auto-upgraded
to use InsertValueInst instructions.
The IRBuilder still accepts multiple-value returns, and auto-upgrades them
to InsertValueInst instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53941 91177308-0d34-0410-b5e6-96231b3b80d8