foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60461 91177308-0d34-0410-b5e6-96231b3b80d8
the predicates by extending simple predicates to create
more complex predicates instead of duplicating the logic
for the simple predicates.
This doesn't reduce much redundancy in DAGISelEmitter.cpp's
generated source yet; that will require improvements to
DAGISelEmitter.cpp's instruction sorting, to make it more
effectively group nodes with similar predicates together.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57565 91177308-0d34-0410-b5e6-96231b3b80d8
the same pattern as roundpd/roundps, the Intel compiler
builtins do not: rounds* has an extra operand. Fixes
gcc.target/i386/sse4_1-rounds[sd]-[1234].c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57370 91177308-0d34-0410-b5e6-96231b3b80d8
a constant vector ("{0x123, 0x456}" syntax). The fix is to simplify the
_mm_srli_si128 macro, and move the "* 8" from the macro into the compiler
back-end. I can't change the existing __builtins because so many people are
using them :-(."
Patch by Stuart Hastings!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56944 91177308-0d34-0410-b5e6-96231b3b80d8
with ConstantInt. This led to fixing a bug in TargetLowering.cpp
using getValue instead of getAPIntValue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56159 91177308-0d34-0410-b5e6-96231b3b80d8
i32>. This is a little messy, but it works.
We should really get rid of the intrinsics, though, since they map
perfectly well to standard LLVM instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@55864 91177308-0d34-0410-b5e6-96231b3b80d8
wrong for volatile loads and stores. In fact this
is almost all of them! There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access. These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used. Consider
loading an i32 but only using the lower 8 bits. It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes. It
is also unwise to make a load/store wider. For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects. (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware. (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several. For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores). In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic. My policy here is
to say that the number of processor operations for
an illegal operation is undefined. So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok. It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal! That is because
operations are marked legal by default, regardless of
whether the type is legal or not. In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation. However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal. So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before. This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52254 91177308-0d34-0410-b5e6-96231b3b80d8
Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50918 91177308-0d34-0410-b5e6-96231b3b80d8