This reverts commit r185099.
Looks like both the ppc-64 and mips bots are still failing after I reverted this
change.
Since:
1. The mips bot always performs a clean build,
2. The ppc64-bot failed again after a clean build (I asked the ppc-64
maintainers to clean the bot which they did... Thanks Will!),
I think it is safe to assume that this change was not the cause of the failures
that said builders were seeing. Thus I am recomitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185111 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r185095. This is causing a FileCheck failure on
the 3dnow intrinsics on at least the mips/ppc bots but not on the x86
bots.
Reverting while I figure out what is going on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185099 91177308-0d34-0410-b5e6-96231b3b80d8
The category which an APFloat belongs to should be dependent on the
actual value that the APFloat has, not be arbitrarily passed in by the
user. This will prevent inconsistency bugs where the category and the
actual value in APFloat differ.
I also fixed up all of the references to this constructor (which were
only in LLVM).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185095 91177308-0d34-0410-b5e6-96231b3b80d8
index greater than the size of the vector is invalid. The shuffle may be
shrinking the size of the vector. Fixes a crash!
Also drop the maximum recursion depth of the safety check for this
optimization to five.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183080 91177308-0d34-0410-b5e6-96231b3b80d8
as the BinaryOperator, *not* in the block where the IRBuilder is currently
inserting into. Fixes a bug where scalarizePHI would create instructions
that would not dominate all uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182639 91177308-0d34-0410-b5e6-96231b3b80d8
The earlier change list introduced the following inst combines:
B * (uitofp i1 C) —> select C, B, 0
A * (1 - uitofp i1 C) —> select C, 0, A
select C, 0, B + select C, A, 0 —> select C, A, B
Together these 3 changes would simplify :
A * (1 - uitofp i1 C) + B * uitofp i1 C
down to :
select C, B, A
In practice we found that the first two substitutions can have a
negative effect on performance, because they reduce opportunities to
use FMA contractions; between the two options FMAs are often the
better choice. This change list amends the previous one to enable
just these inst combines:
select C, B, 0 + select C, 0, A —> select C, B, A
A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182499 91177308-0d34-0410-b5e6-96231b3b80d8
There are two transforms in visitUrem that conflict with each other.
*) One, if a divisor is a power of two, subtracts one from the divisor
and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
instructions.
Flipping the order allows the subtraction to go beneath the sign
extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181668 91177308-0d34-0410-b5e6-96231b3b80d8
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.
PR15959.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181604 91177308-0d34-0410-b5e6-96231b3b80d8
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181518 91177308-0d34-0410-b5e6-96231b3b80d8
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181216 91177308-0d34-0410-b5e6-96231b3b80d8
the things, and renames it to CBindingWrapping.h. I also moved
CBindingWrapping.h into Support/.
This new file just contains the macros for defining different wrap/unwrap
methods.
The calls to those macros, as well as any custom wrap/unwrap definitions
(like for array of Values for example), are put into corresponding C++
headers.
Doing this required some #include surgery, since some .cpp files relied
on the fact that including Wrap.h implicitly caused the inclusion of a
bunch of other things.
This also now means that the C++ headers will include their corresponding
C API headers; for example Value.h must include llvm-c/Core.h. I think
this is harmless, since the C API headers contain just external function
declarations and some C types, so I don't believe there should be any
nasty dependency issues here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180881 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r180802
There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180834 91177308-0d34-0410-b5e6-96231b3b80d8
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180802 91177308-0d34-0410-b5e6-96231b3b80d8
Differences in bitwidth between X and Y could exist even if C1 and C2 have
the same Log2 representation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180779 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the optimization introduced in r179748 and reverted in r179750.
While the optimization was sound, it did not properly respect differences in
bit-width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180777 91177308-0d34-0410-b5e6-96231b3b80d8
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.
Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
%shr = lshr i32 %X, 4
%and = and i32 %shr, 15
%add = add i32 %and, -14
%tobool = icmp ne i32 %add, 0
into:
%and = and i32 %X, 240
%tobool = icmp ne i32 %and, 224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179493 91177308-0d34-0410-b5e6-96231b3b80d8