Added abstract class MemSDNode for any Node that have an associated MemOperand
Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and
atomic.lss => atomic.load.sub
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52706 91177308-0d34-0410-b5e6-96231b3b80d8
clear() on each iteration. This avoids allocating and deallocating
all of DenseMap's memory on each iteration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52642 91177308-0d34-0410-b5e6-96231b3b80d8
fixes PR2476; patch by Richard Osborne. The same
problem exists for a bunch of other operators, but
I'm ignoring this because they will be automagically
fixed when the new LegalizeTypes infrastructure lands,
since it already solves this problem centrally.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52610 91177308-0d34-0410-b5e6-96231b3b80d8
and provides fairly efficient removal of arbitrary elements. Switch
ScheduleDAGRRList from std::set to this new priority queue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52582 91177308-0d34-0410-b5e6-96231b3b80d8
to DenseMap<SDNode*, SUnit*>, and adjust the way cloned SUnit nodes are
handled so that only the original node needs to be in the map.
This speeds up llc on 447.dealII.llvm.bc by about 2%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52576 91177308-0d34-0410-b5e6-96231b3b80d8
integer of the same type. Before it was "promotion",
but this is confusing because it is quite different
to promotion of integers. Call it "softening" instead,
inspired by "soft float".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52546 91177308-0d34-0410-b5e6-96231b3b80d8
rather than bundling them together. Rename FloatToInt
to PromoteFloat (better, if not perfect). Reorganize
files by types rather than by operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52408 91177308-0d34-0410-b5e6-96231b3b80d8
of value info (sign/zero ext info) from one MBB to another. This doesn't
handle much right now because of two limitations:
1) only handles zext/sext, not random bit propagation (no assert exists
for this)
2) doesn't handle phis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52383 91177308-0d34-0410-b5e6-96231b3b80d8
still excluding types like i1 (not byte sized)
and i120 (loading an i120 requires loading an i64,
an i32, an i16 and an i8, which is expensive).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52310 91177308-0d34-0410-b5e6-96231b3b80d8
not valid if the load is volatile. Hopefully
all wrong DAG combiner transforms of volatile
loads and stores have now been caught.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52293 91177308-0d34-0410-b5e6-96231b3b80d8
on some code when !AfterLegalize - but since
this whole code section is turned off by an
"if (0)" it's not really turning anything on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52276 91177308-0d34-0410-b5e6-96231b3b80d8
wrong for volatile loads and stores. In fact this
is almost all of them! There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access. These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used. Consider
loading an i32 but only using the lower 8 bits. It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes. It
is also unwise to make a load/store wider. For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects. (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware. (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several. For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores). In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic. My policy here is
to say that the number of processor operations for
an illegal operation is undefined. So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok. It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal! That is because
operations are marked legal by default, regardless of
whether the type is legal or not. In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation. However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal. So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before. This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52254 91177308-0d34-0410-b5e6-96231b3b80d8
maps can be deleted. This happens when RAUW
replaces a node N with another equivalent node
E, deleting the first node. Solve this by
adding (N, E) to ReplacedNodes, which is already
used to remap nodes to replacements. This means
that deleted nodes are being allowed in maps,
which can be delicate: the memory may be reused
for a new node which might get confused with the
old deleted node pointer hanging around in the
maps, so detect this and flush out maps if it
occurs (ExpungeNode). The expunging operation
is expensive, however it never occurs during
a llvm-gcc bootstrap or anywhere in the nightly
testsuite. It occurs three times in "make check":
Alpha/illegal-element-type.ll,
PowerPC/illegal-element-type.ll and
X86/mmx-shift.ll. If expunging proves to be too
expensive then there are other more complicated
ways of solving the problem.
In the normal case this patch adds the overhead
of a few more map lookups, which is hopefully
negligable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52214 91177308-0d34-0410-b5e6-96231b3b80d8
of integer types. Fix the isMask APInt method to
actually work (hopefully) rather than crashing
because it adds apints of different bitwidths.
It looks like isShiftedMask is also broken, but
I'm leaving that one to the APInt people (it is
not used anywhere).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52142 91177308-0d34-0410-b5e6-96231b3b80d8
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits. Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52098 91177308-0d34-0410-b5e6-96231b3b80d8
no visible functionality change, but enables a future patch where node creation
will update the CFG if it decides to create an unconditional rather than a conditional branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52067 91177308-0d34-0410-b5e6-96231b3b80d8
and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52044 91177308-0d34-0410-b5e6-96231b3b80d8
issue is operand promotion for setcc/select... but looks like the fundamental
stuff is implemented for CellSPU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51884 91177308-0d34-0410-b5e6-96231b3b80d8
and/or to handle more cases (such as this add-sitofp.ll testcase), and
port it to selectiondag's ComputeNumSignBits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51469 91177308-0d34-0410-b5e6-96231b3b80d8
several things that were neither in an anonymous namespace nor static
but not intended to be global.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51017 91177308-0d34-0410-b5e6-96231b3b80d8
ComputeMaskedBits handles, just use a 'default:'. This avoids
TargetLowering's list getting out of date with SelectionDAG's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50693 91177308-0d34-0410-b5e6-96231b3b80d8
ffastmath mode. This fixes rdar://5902801, a miscompilation
of gcc.dg/builtins-8.c.
Bill, please pull this into Tak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50523 91177308-0d34-0410-b5e6-96231b3b80d8
Move platform independent code (lowering of possibly overwritten
arguments, check for tail call optimization eligibility) from
target X86ISelectionLowering.cpp to TargetLowering.h and
SelectionDAGISel.cpp.
Initial PowerPC tail call implementation:
Support ppc32 implemented and tested (passes my tests and
test-suite llvm-test).
Support ppc64 implemented and half tested (passes my tests).
On ppc tail call optimization is performed if
caller and callee are fastcc
call is a tail call (in tail call position, call followed by ret)
no variable argument lists or byval arguments
option -tailcallopt is enabled
Supported:
* non pic tail calls on linux/darwin
* module-local tail calls on linux(PIC/GOT)/darwin(PIC)
* inter-module tail calls on darwin(PIC)
If constraints are not met a normal call will be emitted.
A test checking the argument lowering behaviour on x86-64 was added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50477 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the existing bottleneck related to the removal of elements from
the middle of the queue.
Also fixes a subtle bug in ScheduleDAGRRList::CapturePred:
It was updating the state of the SUnit before removing it. As a result, the
comparison operators were working incorrectly and this SUnit could not be removed
from the queue properly.
Reviewed by Evan and Dan. Approved by Dan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50412 91177308-0d34-0410-b5e6-96231b3b80d8
We now compile test2/test3 to:
_test2:
## InlineAsm Start
set %xmm0, %xmm1
## InlineAsm End
addps %xmm1, %xmm0
ret
_test3:
## InlineAsm Start
set %xmm0, %xmm1
## InlineAsm End
paddd %xmm1, %xmm0
ret
as expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50389 91177308-0d34-0410-b5e6-96231b3b80d8
towards PR2094. It now compiles the attached .ll file to:
_sad16_sse2:
movslq %ecx, %rax
## InlineAsm Start
%ecx %rdx %rax %rax %r8d %rdx %rsi
## InlineAsm End
## InlineAsm Start
set %eax
## InlineAsm End
ret
which is pretty decent for a 3 output, 4 input asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50386 91177308-0d34-0410-b5e6-96231b3b80d8
c1, f1 = CopyToReg
c2, f2 = CopyToReg
c3 = TokenFactor c1, c2
...
= user c3, ..., f2
Now that the two CopyToReg's and the user are "flagged" together. They effectively forms a single scheduling unit. The TokenFactor is now both an operand and a successor of the Flagged nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50376 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy/memset expansion. It was a bug for the SVOffset value
to be used in the actual address calculations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50359 91177308-0d34-0410-b5e6-96231b3b80d8
ComputeMaskedBits knows about cttz, ctlz, and ctpop. Teach
SelectionDAG's ComputeMaskedBits what InstCombine's knows
about SRem. And teach them both some things about high bits
in Mul, UDiv, URem, and Sub. This allows instcombine and
dagcombine to eliminate sign-extension operations in
several new cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50358 91177308-0d34-0410-b5e6-96231b3b80d8
conversion open the door for many nasty implicit conversion issues, and
can be easily solved by initializing with (V.begin(), V.end()) when
needed.
This patch includes many small cleanups for sdisel also.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50340 91177308-0d34-0410-b5e6-96231b3b80d8
When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible. This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:
void test () {
asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}
Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.
Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??
Incidentally, this was the todo in
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll
Please do NOT pull this into Tak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50315 91177308-0d34-0410-b5e6-96231b3b80d8
- Make targetlowering.h fit in 80 cols.
- Make LowerAsmOperandForConstraint const.
- Make lowerXConstraint -> LowerXConstraint
- Make LowerXConstraint return a const char* instead of taking a string byref.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50312 91177308-0d34-0410-b5e6-96231b3b80d8
to the block that defines their operands. This doesn't work in the
case that the operand is an invoke, because invoke is a terminator
and must be the last instruction in a block.
Replace it with support in SelectionDAGISel for copying struct values
into sequences of virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50279 91177308-0d34-0410-b5e6-96231b3b80d8
function, and then use it to fix a bug in SplitVectorOp that expected inserts
to always have constant insertion indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50273 91177308-0d34-0410-b5e6-96231b3b80d8
LegalizeTypes. Correct the load logic so
that it actually works, and also teach it
to handle floating point extending loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49923 91177308-0d34-0410-b5e6-96231b3b80d8
rather than having it suck them out of a node. Add
a bunch of new libcalls, and remove dead softfloat
code (dead, because FloatToInt is used not Expand
in this case). Note that indexed stores probably
aren't handled properly, likewise for loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49915 91177308-0d34-0410-b5e6-96231b3b80d8
Rename SDOperandImpl back to SDOperand.
Introduce the SDUse class that represents a use of the SDNode referred by
an SDOperand. Now it is more similar to Use/Value classes.
Patch is approved by Dan Gohman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49795 91177308-0d34-0410-b5e6-96231b3b80d8
ScheduleDAG; they don't correspond to any actual instructions so they
don't need to be scheduled.
This fixes a bug where the EntryToken was being scheduled multiple
times in some cases, though it ended up not causing any trouble because
EntryToken doesn't expand into anything. With this fixed the schedulers
reliably schedule the expected number of units, so we can check this
with an assertion.
This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it
ends up getting scheduled differently in a trivial way, though it was
enough to fool the prcontext+grep that the test does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49701 91177308-0d34-0410-b5e6-96231b3b80d8
much simpler than in LegalizeDAG because calls are
not yet expanded into call sequences: that happens
after type legalization has finished.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49634 91177308-0d34-0410-b5e6-96231b3b80d8
in its maps. Add some sanity checks that catch
this kind of thing. Hopefully these can be
removed one day (once all problems are fixed!)
but for the moment it seems wise to have them in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49612 91177308-0d34-0410-b5e6-96231b3b80d8
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.
Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.
This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.
Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.
This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49572 91177308-0d34-0410-b5e6-96231b3b80d8
If it cannot be expanded, it will keep the old behaviour and try to shrink the constant.
Part of enhancement for PR2191.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49280 91177308-0d34-0410-b5e6-96231b3b80d8
before an invoke. Failure to do this causes references in
the landing pad to variables that were not set. Fixes
g++.dg/eh/delayslot1.C
g++.dg/eh/fp-regs.C
g++.old-deja/g++.brendan/eh1.C
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49243 91177308-0d34-0410-b5e6-96231b3b80d8
There is no point in creating a long live range defined by an implicit_def. Scheduler now duplicates implicit_def instruction for each of its uses. Therefore, if an implicit_def node has multiple uses, it will become a number of very short live ranges, rather than a long one. This will make coalescer's job easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49164 91177308-0d34-0410-b5e6-96231b3b80d8
review feedback.
-enable-eh is still accepted but doesn't do anything.
EH intrinsics use Dwarf EH if the target supports that,
and are handled by LowerInvoke otherwise.
The separation of the EH table and frame move data is,
I think, logically figured out, but either one still
causes full EH info to be generated (not sure how to
split the metadata correctly).
MachineModuleInfo::needsFrameInfo is no longer used and
is removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49064 91177308-0d34-0410-b5e6-96231b3b80d8
not marked nounwind, or for all functions when -enable-eh
is set, provided the target supports Dwarf EH.
llvm-gcc generates nounwind in the right places; other FEs
will need to do so also. Given such a FE, -enable-eh should
no longer be needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49006 91177308-0d34-0410-b5e6-96231b3b80d8
In order to handle indexed nodes I had to introduce
a new constructor, and since I was there I factorized
the code in the various load constructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48894 91177308-0d34-0410-b5e6-96231b3b80d8
nodes. This doesn't currently have much impact the generated code, but it
does produce simpler-looking SelectionDAGs, and consequently
simpler-looking ScheduleDAGs, because there are fewer spurious
dependencies.
In particular, CopyValueToVirtualRegister now uses the entry node as the
input chain dependency for new CopyToReg nodes instead of calling getRoot
and depending on the most recent memory reference.
Also, rename UnorderedChains to PendingExports and pull it up from being
a local variable in SelectionDAGISel::BuildSelectionDAG to being a
member variable of SelectionDAGISel, so that it doesn't have to be
passed around to all the places that need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48893 91177308-0d34-0410-b5e6-96231b3b80d8
called LimitedSumOfUnscheduledPredsOfSuccs. It terminates the computation
after a given treshold is reached. This new function is always faster, but
brings real wins only on bigger test-cases.
The old function SumOfUnscheduledPredsOfSuccs is left in-place for now and therefore a warning about an unused static function is produced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48872 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM Value/Use does and MachineRegisterInfo/MachineOperand does.
This allows constant time for all uses list maintenance operations.
The idea was suggested by Chris. Reviewed by Evan and Dan.
Patch is tested and approved by Dan.
On normal use-cases compilation speed is not affected. On very big basic
blocks there are compilation speedups in the range of 15-20% or even better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48822 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes Bugzilla #1835 (http://llvm.org/bugs/show_bug.cgi?id=1835).
This patched is reviewed by Tanya and Dan. Dan tested and approved it.
The reason for the bad performance of the old algorithm is that it is very naive and scans every
time all nodes of the DAG in the worst case.
This patch introduces a new algorithm based on the paper "Online algorithms
for maintaining the topological order of a directed acyclic graph" by
David J.Pearce and Paul H.J.Kelly. This is the MNR algorithm. It has a
linear time worst-case and performs much better in most situations.
The paper can be found here:
http://fano.ics.uci.edu/cites/Document/Online-algorithms-for-maintaining-the-topological-order-of-a-directed-acyclic-graph.html
The main idea of the new algorithm is to compute the topological ordering of the SNodes in the
DAG and to maintain it even after DAG modifications. The topological ordering allows for very fast
node reachability checks.
Tests on very big input files with tens of thousands of instructions in a BB indicate huge
speed-ups (up to 10x compilation time improvement) compared to the old version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48817 91177308-0d34-0410-b5e6-96231b3b80d8
flags. This is needed by the new legalize types
infrastructure which wants to expand the 64 bit
constants previously used to hold the flags on
32 bit machines. There are two functional changes:
(1) in LowerArguments, if a parameter has the zext
attribute set then that is marked in the flags;
before it was being ignored; (2) PPC had some bogus
code for handling two word arguments when using the
ELF 32 ABI, which was hard to convert because of
the bogusness. As suggested by the original author
(Nicolas Geoffray), I've disabled it for the moment.
Tested with "make check" and the Ada ACATS testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48640 91177308-0d34-0410-b5e6-96231b3b80d8
Use getIntPtrConstant in a couple places to shorten stuff up
Handle splitting vector shuffles with undefs in the mask
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48351 91177308-0d34-0410-b5e6-96231b3b80d8
the fcopysign expansion from LegalizeDAG to get rid of
what seems to be a bug: the use of sign extension means
that when copying the sign bit from an f32 to an f64,
the upper 32 bits of the f64 (now an i64) are set, not
just the top bit... I also generalized it to work for
any sized floating point types, and removed the bogosity:
SDOperand Mask1 = (SrcVT == MVT::f64)
? DAG.getConstantFP(BitsToDouble(1ULL << 63), SrcVT)
: DAG.getConstantFP(BitsToFloat(1U << 31), SrcVT);
Mask1 = DAG.getNode(ISD::BIT_CONVERT, SrcNVT, Mask1);
(here SrcNVT is an integer with the same size as SrcVT).
As far as I can see this takes a 1 << 63, converts to
a double, converts that to a floating point constant
then converts that to an integer constant, ending up
with... 1 << 63 as an integer constant! So I just
generate this integer constant directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48305 91177308-0d34-0410-b5e6-96231b3b80d8
getCopyToParts problem was noticed by the new
LegalizeTypes infrastructure. In order to avoid
this kind of thing in the future I've added a
check that EXTRACT_ELEMENT is only used with
integers. Once LegalizeTypes is up and running
most likely BUILD_PAIR and EXTRACT_ELEMENT can
be removed, in favour of using apints instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48294 91177308-0d34-0410-b5e6-96231b3b80d8
X86 lowering normalize vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48279 91177308-0d34-0410-b5e6-96231b3b80d8
and it's the result that requires expansion. This code is a little confusing
because the TargetLoweringInfo tables for [US]INT_TO_FP use the operand type
(the integer type) rather than the result type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48206 91177308-0d34-0410-b5e6-96231b3b80d8
return ValueType can depend its operands' ValueType.
This is a cosmetic change, no functionality impacted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48145 91177308-0d34-0410-b5e6-96231b3b80d8
Change insert/extract subreg instructions to be able to be used in TableGen patterns.
Use the above features to reimplement an x86-64 pseudo instruction as a pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48130 91177308-0d34-0410-b5e6-96231b3b80d8
field to 32 bits, thus enabling correct handling of ByVal
structs bigger than 0x1ffff. Abstract interface a bit.
Fixes gcc.c-torture/execute/pr23135.c and
gcc.c-torture/execute/pr28982b.c in gcc testsuite (were ICE'ing
on ppc32, quietly producing wrong code on x86-32.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48122 91177308-0d34-0410-b5e6-96231b3b80d8
they are produced by calls (which are known exact) and by cross block copies
which are known to be produced by extends.
This improves:
define double @test2() {
%tmp85 = call double asm sideeffect "fld0", "={st(0)}"()
ret double %tmp85
}
from:
_test2:
subl $20, %esp
# InlineAsm Start
fld0
# InlineAsm End
fstpl 8(%esp)
movsd 8(%esp), %xmm0
movsd %xmm0, (%esp)
fldl (%esp)
addl $20, %esp
#FP_REG_KILL
ret
to:
_test2:
# InlineAsm Start
fld0
# InlineAsm End
#FP_REG_KILL
ret
by avoiding a f64 <-> f80 trip
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48108 91177308-0d34-0410-b5e6-96231b3b80d8
an RFP register class.
Teach ScheduleDAG how to handle CopyToReg with different src/dst
reg classes.
This allows us to compile trivial inline asms that expect stuff
on the top of x87-fp stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48107 91177308-0d34-0410-b5e6-96231b3b80d8
in different register classes, e.g. copy of ST(0) to RFP*. This gets
some really trivial inline asm working that plops things on the top of
stack (PR879)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48105 91177308-0d34-0410-b5e6-96231b3b80d8
of BUILD_VECTORS that only have two unique elements:
1. The previous code was nondeterminstic, because it walked a map in
SDOperand order, which isn't determinstic.
2. The previous code didn't handle the case when one element was undef
very well. Now we ensure that the generated shuffle mask has the
undef vector on the RHS (instead of potentially being on the LHS)
and that any elements that refer to it are themselves undef. This
allows us to compile CodeGen/X86/vec_set-9.ll into:
_test3:
movd %rdi, %xmm0
punpcklqdq %xmm0, %xmm0
ret
instead of:
_test3:
movd %rdi, %xmm1
#IMPLICIT_DEF %xmm0
punpcklqdq %xmm1, %xmm0
ret
... saving a register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48060 91177308-0d34-0410-b5e6-96231b3b80d8
_test3:
movd %rdi, %xmm1
#IMPLICIT_DEF %xmm0
punpcklqdq %xmm1, %xmm0
ret
instead of:
_test3:
#IMPLICIT_DEF %rax
movd %rax, %xmm0
movd %rdi, %xmm1
punpcklqdq %xmm1, %xmm0
ret
This is still not ideal. There is no reason to two xmm regs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48058 91177308-0d34-0410-b5e6-96231b3b80d8
except ppc long double. This allows us to shrink constant pool
entries for x86 long double constants, which in turn allows us to
use flds/fldl instead of fldt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47938 91177308-0d34-0410-b5e6-96231b3b80d8
bug in r47928 (Int64Ty is the correct type for the constant
pool entry here) and removes the asserts, now that the code
is capable of handling i128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47932 91177308-0d34-0410-b5e6-96231b3b80d8
For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constant since fldt is very expensive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47931 91177308-0d34-0410-b5e6-96231b3b80d8
The basic idea is that all these algorithms are computing the longest paths from the root node or to the exit node. Therefore the existing implementation that uses and iterative and potentially
exponential algorithm was changed to a well-known graph algorithm based on dynamic programming. It has a linear run-time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47884 91177308-0d34-0410-b5e6-96231b3b80d8
generic & x86 versions; change generic to follow x86
and improve comments. Add PPC version (not right
for non-Darwin.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47734 91177308-0d34-0410-b5e6-96231b3b80d8