index greater than the size of the vector is invalid. The shuffle may be
shrinking the size of the vector. Fixes a crash!
Also drop the maximum recursion depth of the safety check for this
optimization to five.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183080 91177308-0d34-0410-b5e6-96231b3b80d8
as the BinaryOperator, *not* in the block where the IRBuilder is currently
inserting into. Fixes a bug where scalarizePHI would create instructions
that would not dominate all uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182639 91177308-0d34-0410-b5e6-96231b3b80d8
The earlier change list introduced the following inst combines:
B * (uitofp i1 C) —> select C, B, 0
A * (1 - uitofp i1 C) —> select C, 0, A
select C, 0, B + select C, A, 0 —> select C, A, B
Together these 3 changes would simplify :
A * (1 - uitofp i1 C) + B * uitofp i1 C
down to :
select C, B, A
In practice we found that the first two substitutions can have a
negative effect on performance, because they reduce opportunities to
use FMA contractions; between the two options FMAs are often the
better choice. This change list amends the previous one to enable
just these inst combines:
select C, B, 0 + select C, 0, A —> select C, B, A
A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182499 91177308-0d34-0410-b5e6-96231b3b80d8
This is useful if something that looks like (x & (1 << y)) ? 64 : 32 is
the divisor in a modulo operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182200 91177308-0d34-0410-b5e6-96231b3b80d8
There are two transforms in visitUrem that conflict with each other.
*) One, if a divisor is a power of two, subtracts one from the divisor
and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
instructions.
Flipping the order allows the subtraction to go beneath the sign
extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181668 91177308-0d34-0410-b5e6-96231b3b80d8
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.
PR15959.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181604 91177308-0d34-0410-b5e6-96231b3b80d8
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181518 91177308-0d34-0410-b5e6-96231b3b80d8
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181216 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r180802
There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180834 91177308-0d34-0410-b5e6-96231b3b80d8
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180802 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the optimization introduced in r179748 and reverted in r179750.
While the optimization was sound, it did not properly respect differences in
bit-width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180777 91177308-0d34-0410-b5e6-96231b3b80d8
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.
Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
%shr = lshr i32 %X, 4
%and = and i32 %shr, 15
%add = add i32 %and, -14
%tobool = icmp ne i32 %add, 0
into:
%and = and i32 %X, 240
%tobool = icmp ne i32 %and, 224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179493 91177308-0d34-0410-b5e6-96231b3b80d8
When trying to collapse sequences of insertelement/extractelement
instructions into single shuffle instructions, there is one specific
case where the Instruction Combiner wrongly updates the resulting
Mask of shuffle indexes.
The problem is in function CollectShuffleElments.
If we have a sequence of insert/extract element instructions
like the one below:
%tmp1 = extractelement <4 x float> %LHS, i32 0
%tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
%tmp3 = extractelement <4 x float> %RHS, i32 2
%tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3
Where:
. %RHS will have a mask of [4,5,6,7]
. %LHS will have a mask of [0,1,2,3]
The Mask of shuffle indexes is wrongly computed to [4,1,6,7]
instead of [4,0,6,7].
When analyzing %tmp2 in order to compute the Mask for the
resulting shuffle instruction, the algorithm forgets to update
the mask index at position 1 with the index associated to the
element extracted from %LHS by instruction %tmp1.
Patch by Andrea DiBiagio!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179291 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 342d92c7a0.
Turns out we're going with a different schema design to represent
DW_TAG_imported_modules so we won't need this extra field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178215 91177308-0d34-0410-b5e6-96231b3b80d8
This is just the basic groundwork for supporting DW_TAG_imported_module but I
wanted to commit this before pushing support further into Clang or LLVM so that
this rather churny change is isolated from the rest of the work. The major
churn here is obviously adding another field (within the common DIScope prefix)
to all DIScopes (files, classes, namespaces, lexical scopes, etc). This should
be the last big churny change needed for DW_TAG_imported_module/using directive
support/PR14606.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178099 91177308-0d34-0410-b5e6-96231b3b80d8
The problem is that the code mistakenly took for granted that following constructor
is able to create an APFloat from a *SIGNED* integer:
APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value)
rdar://13486998
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177906 91177308-0d34-0410-b5e6-96231b3b80d8
This simplification happens at 2 places :
- using the nsw attribute when the shl / mul is used by a sign test
- when the shl / mul is compared for (in)equality to zero
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177856 91177308-0d34-0410-b5e6-96231b3b80d8