rather than depending on LiveVariables. This decreases compile time from:
0.5909s (LV + Regalloc) to 0.421s (just regalloc).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53256 91177308-0d34-0410-b5e6-96231b3b80d8
soft float: experiments show that targets aren't
expecting this for results or for operands. Add
support select/select_cc result soft float and
correct operand soft float for these.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53245 91177308-0d34-0410-b5e6-96231b3b80d8
MachineMemOperands. The pools are owned by MachineFunctions.
This drastically reduces the number of calls to malloc/free made
during the "Emit" phase of scheduling, as well as later phases
in CodeGen. Combined with other changes, this speeds up the
"instruction selection" phase of CodeGen by 10% in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53212 91177308-0d34-0410-b5e6-96231b3b80d8
and reused across SelectionDAGs.
This drastically reduces the number of calls to malloc/free made during
instruction selection, and improves memory locality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53211 91177308-0d34-0410-b5e6-96231b3b80d8
properly track dead nodes that are on the original SDNode's operand
list but not the new one, and have no other uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53201 91177308-0d34-0410-b5e6-96231b3b80d8
simple const SDOperand*, which is what's usually needed.
For AddNodeIDOperands, which is small, just duplicate the function to
accept an SDUse*.
For SelectionDAG::getNode - Add an overload that accepts SDUse* that
copies the operands into a temporary SDOperand array, but also has
special-case checks for 0 through 3 operands to avoid the copy in
the common cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53183 91177308-0d34-0410-b5e6-96231b3b80d8
that fixed problems in EmitStackConvert where the source and target type
have different alignment by creating a stack slot with the max
alignment of source and target type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53150 91177308-0d34-0410-b5e6-96231b3b80d8
hook for each way in which a result type can be
legalized (promotion, expansion, softening etc),
just use one: ReplaceNodeResults, which returns
a node with exactly the same result types as the
node passed to it, but presumably with a bunch of
custom code behind the scenes. No change if the
new LegalizeTypes infrastructure is not turned on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53137 91177308-0d34-0410-b5e6-96231b3b80d8
moves in order to get correct debug info. Since
I can't imagine how any target could possibly
be any different, I've just stripped out the
option: now all the world's like Darwin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53134 91177308-0d34-0410-b5e6-96231b3b80d8
- Also remove LiveVariables::instructionChanged, etc. Replace all calls with cheaper calls which update VarInfo kill list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53097 91177308-0d34-0410-b5e6-96231b3b80d8
- CommuteInstruction copies kill / dead markers over to new instruction. So use replaceKillInstruction instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53061 91177308-0d34-0410-b5e6-96231b3b80d8
Also, if LV isn't around, then TwoAddr doesn't need to be updating flags, since they won't have been set in the first place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53058 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::SelectNodeTo in the instruction selector. This
updates existing nodes in place instead of creating new ones.
Go back to selecting ISD::DBG_LABEL nodes into
TargetInstrInfo::DBG_LABEL nodes instead of leaving them
unselected, now that SelectNodeTo allows us to update them
in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53057 91177308-0d34-0410-b5e6-96231b3b80d8
to be passed the list of value types, and use this
where appropriate. Inappropriate places are where
the value type list is already known and may be
long, in which case the existing method is more
efficient.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53035 91177308-0d34-0410-b5e6-96231b3b80d8
the need for a flavor operand, and add a new SDNode subclass,
LabelSDNode, for use with them to eliminate the need for a label id
operand.
Change instruction selection to let these label nodes through
unmodified instead of creating copies of them. Teach the MachineInstr
emitter how to emit a MachineInstr directly from an ISD label node.
This avoids the need for allocating SDNodes for the label id and
flavor value, as well as SDNodes for each of the post-isel label,
label id, and label flavor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52943 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::allnodes_size is linear, but that doesn't appear to
outweigh the benefit of reducing heap traffic. If it does become a
problem, we should teach SelectionDAG to keep a count of how many
nodes are live, because there are several other places where that
information would be useful as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52926 91177308-0d34-0410-b5e6-96231b3b80d8
purpose, and give it a custom SDNode subclass so that it doesn't
need to have line number, column number, filename string, and
directory string, all existing as individual SDNodes to be the
operands.
This was the only user of ISD::STRING, StringSDNode, etc., so
remove those and some associated code.
This makes stop-points considerably easier to read in
-view-legalize-dags output, and reduces overhead (creating new
nodes and copying std::strings into them) on code containing
debugging information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52924 91177308-0d34-0410-b5e6-96231b3b80d8
SmallVectors. Change the signature of TargetLowering::LowerArguments
to avoid returning a vector by value, and update the two targets
which still use this directly, Sparc and IA64, accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52917 91177308-0d34-0410-b5e6-96231b3b80d8
only needs one bit for each register. UsedRegs is a SmallVector
sized at 16, so this eliminates a heap allocation/free for every
call and return processed by Legalize on most targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52915 91177308-0d34-0410-b5e6-96231b3b80d8
it impossible to create a MERGE_VALUES node with
only one result: sometimes it is useful to be able
to create a node with only one result out of one of
the results of a node with more than one result, for
example because the new node will eventually be used
to replace a one-result node using ReplaceAllUsesWith,
cf X86TargetLowering::ExpandFP_TO_SINT. On the other
hand, most users of MERGE_VALUES don't need this and
for them the optimization was valuable. So add a new
utility method getMergeValues for creating MERGE_VALUES
nodes which by default performs the optimization.
Change almost everywhere to use getMergeValues (and
tidy some stuff up at the same time).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52893 91177308-0d34-0410-b5e6-96231b3b80d8
Move GetConstantStringInfo to lib/Analysis. Remove
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
This unbreaks llvm-gcc bootstrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52884 91177308-0d34-0410-b5e6-96231b3b80d8
information of the original load or store, which is checked to be
at least as good, and possibly better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52849 91177308-0d34-0410-b5e6-96231b3b80d8
This speed up LiveVariables on instcombine at -O0 -g from 0.3855s to 0.3503s. Look for more improvements in this area soon!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52804 91177308-0d34-0410-b5e6-96231b3b80d8
<16 x float> is 64-byte aligned (for some reason),
which gets us into the stack realignment code. The
computation changing FP-relative offsets to SP-relative
was broken, assiging a spill temp to a location
also used for parameter passing. This
fixes it by rounding up the stack frame to a multiple
of the largest alignment (I concluded it wasn't fixable
without doing this, but I'm not very sure.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52750 91177308-0d34-0410-b5e6-96231b3b80d8
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52748 91177308-0d34-0410-b5e6-96231b3b80d8
shift.
- Add a readme entry for a missing vector_shuffle optimization that results in
awful codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52740 91177308-0d34-0410-b5e6-96231b3b80d8
For this it is convenient to permit floats to
be used with EXTRACT_ELEMENT, so I tweaked
things to allow that. I also added libcalls
for ppcf128 to i32 forms of FP_TO_XINT, since
they exist in libgcc and this case can certainly
occur (and does occur in the testsuite) - before
the i64 libcall was being used. Also, the
XINT_TO_FP result seemed to be wrong when
the argument is an i128: the wrong fudge
factor was added (the i32 and i64 cases were
handled directly, but the i128 code fell
through to some generic softening code which
seemed to think it was i64 to f32!). So I
fixed it by adding a fudge factor that I
found in my breakfast cereal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52739 91177308-0d34-0410-b5e6-96231b3b80d8
Added abstract class MemSDNode for any Node that have an associated MemOperand
Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and
atomic.lss => atomic.load.sub
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52706 91177308-0d34-0410-b5e6-96231b3b80d8
,------.
| |
| v
| t2 = phi ... t1 ...
| |
| v
| t1 = ...
| ... = ... t1 ...
| |
`------'
where there is a use in a PHI node that's a predecessor to the defining
block. We don't want to mark all predecessors as having the value "alive" in
this case. Also, the assert was too restrictive and didn't handle this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52655 91177308-0d34-0410-b5e6-96231b3b80d8
clear() on each iteration. This avoids allocating and deallocating
all of DenseMap's memory on each iteration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52642 91177308-0d34-0410-b5e6-96231b3b80d8
fixes PR2476; patch by Richard Osborne. The same
problem exists for a bunch of other operators, but
I'm ignoring this because they will be automagically
fixed when the new LegalizeTypes infrastructure lands,
since it already solves this problem centrally.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52610 91177308-0d34-0410-b5e6-96231b3b80d8
and provides fairly efficient removal of arbitrary elements. Switch
ScheduleDAGRRList from std::set to this new priority queue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52582 91177308-0d34-0410-b5e6-96231b3b80d8
to DenseMap<SDNode*, SUnit*>, and adjust the way cloned SUnit nodes are
handled so that only the original node needs to be in the map.
This speeds up llc on 447.dealII.llvm.bc by about 2%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52576 91177308-0d34-0410-b5e6-96231b3b80d8
This is not always a win, but there are much more wins than loses and wins tend to be more noticeable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52554 91177308-0d34-0410-b5e6-96231b3b80d8
integer of the same type. Before it was "promotion",
but this is confusing because it is quite different
to promotion of integers. Call it "softening" instead,
inspired by "soft float".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52546 91177308-0d34-0410-b5e6-96231b3b80d8
According to DWARF-2 specification, the line information is provided through an offset in the .debug_line section.
Replace the label reference that is used with a section offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52468 91177308-0d34-0410-b5e6-96231b3b80d8
rather than bundling them together. Rename FloatToInt
to PromoteFloat (better, if not perfect). Reorganize
files by types rather than by operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52408 91177308-0d34-0410-b5e6-96231b3b80d8
of value info (sign/zero ext info) from one MBB to another. This doesn't
handle much right now because of two limitations:
1) only handles zext/sext, not random bit propagation (no assert exists
for this)
2) doesn't handle phis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52383 91177308-0d34-0410-b5e6-96231b3b80d8
still excluding types like i1 (not byte sized)
and i120 (loading an i120 requires loading an i64,
an i32, an i16 and an i8, which is expensive).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52310 91177308-0d34-0410-b5e6-96231b3b80d8
not valid if the load is volatile. Hopefully
all wrong DAG combiner transforms of volatile
loads and stores have now been caught.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52293 91177308-0d34-0410-b5e6-96231b3b80d8
on some code when !AfterLegalize - but since
this whole code section is turned off by an
"if (0)" it's not really turning anything on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52276 91177308-0d34-0410-b5e6-96231b3b80d8
wrong for volatile loads and stores. In fact this
is almost all of them! There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access. These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used. Consider
loading an i32 but only using the lower 8 bits. It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes. It
is also unwise to make a load/store wider. For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects. (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware. (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several. For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores). In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic. My policy here is
to say that the number of processor operations for
an illegal operation is undefined. So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok. It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal! That is because
operations are marked legal by default, regardless of
whether the type is legal or not. In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation. However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal. So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before. This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52254 91177308-0d34-0410-b5e6-96231b3b80d8
maps can be deleted. This happens when RAUW
replaces a node N with another equivalent node
E, deleting the first node. Solve this by
adding (N, E) to ReplacedNodes, which is already
used to remap nodes to replacements. This means
that deleted nodes are being allowed in maps,
which can be delicate: the memory may be reused
for a new node which might get confused with the
old deleted node pointer hanging around in the
maps, so detect this and flush out maps if it
occurs (ExpungeNode). The expunging operation
is expensive, however it never occurs during
a llvm-gcc bootstrap or anywhere in the nightly
testsuite. It occurs three times in "make check":
Alpha/illegal-element-type.ll,
PowerPC/illegal-element-type.ll and
X86/mmx-shift.ll. If expunging proves to be too
expensive then there are other more complicated
ways of solving the problem.
In the normal case this patch adds the overhead
of a few more map lookups, which is hopefully
negligable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52214 91177308-0d34-0410-b5e6-96231b3b80d8
of integer types. Fix the isMask APInt method to
actually work (hopefully) rather than crashing
because it adds apints of different bitwidths.
It looks like isShiftedMask is also broken, but
I'm leaving that one to the APInt people (it is
not used anywhere).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52142 91177308-0d34-0410-b5e6-96231b3b80d8
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits. Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52098 91177308-0d34-0410-b5e6-96231b3b80d8
no visible functionality change, but enables a future patch where node creation
will update the CFG if it decides to create an unconditional rather than a conditional branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52067 91177308-0d34-0410-b5e6-96231b3b80d8
and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52044 91177308-0d34-0410-b5e6-96231b3b80d8
are the same as in unpacked structs, only field
positions differ. This only matters for structs
containing x86 long double or an apint; it may
cause backwards compatibility problems if someone
has bitcode containing a packed struct with a
field of one of those types.
The issue is that only 10 bytes are needed to
hold an x86 long double: the store size is 10
bytes, but the ABI size is 12 or 16 bytes (linux/
darwin) which comes from rounding the store size
up by the alignment. Because it seemed silly not
to pack an x86 long double into 10 bytes in a
packed struct, this is what was done. I now
think this was a mistake. Reserving the ABI size
for an x86 long double field even in a packed
struct makes things more uniform: the ABI size is
now always used when reserving space for a type.
This means that developers are less likely to
make mistakes. It also makes life easier for the
CBE which otherwise could not represent all LLVM
packed structs (PR2402).
Front-end people might need to adjust the way
they create LLVM structs - see following change
to llvm-gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51928 91177308-0d34-0410-b5e6-96231b3b80d8
constant shows up in the assembly language output. Helps with
debugging without a HP calculator having to be handy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51885 91177308-0d34-0410-b5e6-96231b3b80d8
issue is operand promotion for setcc/select... but looks like the fundamental
stuff is implemented for CellSPU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51884 91177308-0d34-0410-b5e6-96231b3b80d8
instruction to execute. This can be used for transformations (like two-address
conversion) to remat an instruction instead of generating a "move"
instruction. The idea is to decrease the live ranges and register pressure and
all that jazz.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51660 91177308-0d34-0410-b5e6-96231b3b80d8
Running /Users/void/llvm/llvm.src/test/CodeGen/X86/dg.exp ...
FAIL: /Users/void/llvm/llvm.src/test/CodeGen/X86/2007-11-30-LoadFolding-Bug.ll
Failed with exit(1) at line 1
while running: llvm-as < /Users/void/llvm/llvm.src/test/CodeGen/X86/2007-11-30-LoadFolding-Bug.ll | llc -march=x86 -mattr=+sse2 -stats |& grep {1 .*folded into instructions}
child process exited abnormally
Make this conditional for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51563 91177308-0d34-0410-b5e6-96231b3b80d8
and/or to handle more cases (such as this add-sitofp.ll testcase), and
port it to selectiondag's ComputeNumSignBits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51469 91177308-0d34-0410-b5e6-96231b3b80d8
live interval to infinity if the instruction being rewritten is an
original remat def instruction. We were only checking against the clone
of the remat def which doesn't actually appear in the IR at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51440 91177308-0d34-0410-b5e6-96231b3b80d8
BB1:
vr1025 = copy vr1024
..
BB2:
vr1024 = op
= op vr1025
<loop eventually branch back to BB1>
Even though vr1025 is copied from vr1024, it's not safe to coalesced them since live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51394 91177308-0d34-0410-b5e6-96231b3b80d8
If local spiller optimization turns some instruction into an identity copy, it will be removed. If the output register happens to be dead (and source is obviously killed), transfer the kill / dead information to last use / def in the same MBB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51306 91177308-0d34-0410-b5e6-96231b3b80d8
are represented as "weak", but there are subtle differences
in some cases on Darwin, so we need both. The intent
is that "common" will behave identically to "weak" unless
somebody changes their target to do something else.
No functional change as yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51118 91177308-0d34-0410-b5e6-96231b3b80d8
address of the PassInfo directly instead of calling getPassInfo.
This eliminates a bunch of dynamic initializations of static data.
Also, fold RegisterPassBase into PassInfo, make a bunch of its
data members const, and rearrange some code to initialize data
members in constructors instead of using setter member functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51022 91177308-0d34-0410-b5e6-96231b3b80d8
several things that were neither in an anonymous namespace nor static
but not intended to be global.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51017 91177308-0d34-0410-b5e6-96231b3b80d8
if those blocks consist entirely of common instructions;
merging will not add an extra branch in this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51006 91177308-0d34-0410-b5e6-96231b3b80d8
semantically identical, but little difference in
either results or execution speed; but it's much
easier to read, at least IMO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50999 91177308-0d34-0410-b5e6-96231b3b80d8
case where there are multiple blocks with a large
number of common tail instructions more efficiently
(compile time optimization).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50916 91177308-0d34-0410-b5e6-96231b3b80d8
Darwin. This is a hack of course, but it does
at least look at the right thing: gotpcrel means
that this is already an offset, so an explicit
offset is not needed (and wrong). I think this
is good enough for the moment: Anton is working
on something better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50850 91177308-0d34-0410-b5e6-96231b3b80d8
on x86-64 linux. This causes no regressions on
32 bit linux and 32 bit ppc. More tests pass
on 64 bit ppc with no regressions. I didn't
turn on eh on 64 bit linux because the intrinsics
needed to compile the eh runtime aren't done
yet. But if you turn it on and link with the
mainline runtime then eh seems to work fine
on x86-64 linux with this patch. Thanks to
Dale for testing. The main point of the patch
is that if you output that some object is
encoded using 4 bytes you had better not output
8 bytes for it: the patch makes everything
consistent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50825 91177308-0d34-0410-b5e6-96231b3b80d8
%ecx = op
store %cl<kill>, (addr)
(addr) = op %al
It's not safe to unfold the last operand and eliminate store even though %cl is marked kill. It's a sub-register use which means one of its super-register(s) may be used below.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50794 91177308-0d34-0410-b5e6-96231b3b80d8
ComputeMaskedBits handles, just use a 'default:'. This avoids
TargetLowering's list getting out of date with SelectionDAG's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50693 91177308-0d34-0410-b5e6-96231b3b80d8
the code being generated does not require an executable stack.
Also, add target-specific code to make use of this on Linux
on x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50634 91177308-0d34-0410-b5e6-96231b3b80d8
ffastmath mode. This fixes rdar://5902801, a miscompilation
of gcc.dg/builtins-8.c.
Bill, please pull this into Tak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50523 91177308-0d34-0410-b5e6-96231b3b80d8
Move platform independent code (lowering of possibly overwritten
arguments, check for tail call optimization eligibility) from
target X86ISelectionLowering.cpp to TargetLowering.h and
SelectionDAGISel.cpp.
Initial PowerPC tail call implementation:
Support ppc32 implemented and tested (passes my tests and
test-suite llvm-test).
Support ppc64 implemented and half tested (passes my tests).
On ppc tail call optimization is performed if
caller and callee are fastcc
call is a tail call (in tail call position, call followed by ret)
no variable argument lists or byval arguments
option -tailcallopt is enabled
Supported:
* non pic tail calls on linux/darwin
* module-local tail calls on linux(PIC/GOT)/darwin(PIC)
* inter-module tail calls on darwin(PIC)
If constraints are not met a normal call will be emitted.
A test checking the argument lowering behaviour on x86-64 was added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50477 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the existing bottleneck related to the removal of elements from
the middle of the queue.
Also fixes a subtle bug in ScheduleDAGRRList::CapturePred:
It was updating the state of the SUnit before removing it. As a result, the
comparison operators were working incorrectly and this SUnit could not be removed
from the queue properly.
Reviewed by Evan and Dan. Approved by Dan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50412 91177308-0d34-0410-b5e6-96231b3b80d8
We now compile test2/test3 to:
_test2:
## InlineAsm Start
set %xmm0, %xmm1
## InlineAsm End
addps %xmm1, %xmm0
ret
_test3:
## InlineAsm Start
set %xmm0, %xmm1
## InlineAsm End
paddd %xmm1, %xmm0
ret
as expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50389 91177308-0d34-0410-b5e6-96231b3b80d8
towards PR2094. It now compiles the attached .ll file to:
_sad16_sse2:
movslq %ecx, %rax
## InlineAsm Start
%ecx %rdx %rax %rax %r8d %rdx %rsi
## InlineAsm End
## InlineAsm Start
set %eax
## InlineAsm End
ret
which is pretty decent for a 3 output, 4 input asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50386 91177308-0d34-0410-b5e6-96231b3b80d8
e.g.
vr1024<2> extract_subreg vr1025, 2
If vr1024 do not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50385 91177308-0d34-0410-b5e6-96231b3b80d8
c1, f1 = CopyToReg
c2, f2 = CopyToReg
c3 = TokenFactor c1, c2
...
= user c3, ..., f2
Now that the two CopyToReg's and the user are "flagged" together. They effectively forms a single scheduling unit. The TokenFactor is now both an operand and a successor of the Flagged nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50376 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy/memset expansion. It was a bug for the SVOffset value
to be used in the actual address calculations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50359 91177308-0d34-0410-b5e6-96231b3b80d8
ComputeMaskedBits knows about cttz, ctlz, and ctpop. Teach
SelectionDAG's ComputeMaskedBits what InstCombine's knows
about SRem. And teach them both some things about high bits
in Mul, UDiv, URem, and Sub. This allows instcombine and
dagcombine to eliminate sign-extension operations in
several new cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50358 91177308-0d34-0410-b5e6-96231b3b80d8
conversion open the door for many nasty implicit conversion issues, and
can be easily solved by initializing with (V.begin(), V.end()) when
needed.
This patch includes many small cleanups for sdisel also.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50340 91177308-0d34-0410-b5e6-96231b3b80d8
When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible. This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:
void test () {
asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}
Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.
Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??
Incidentally, this was the todo in
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll
Please do NOT pull this into Tak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50315 91177308-0d34-0410-b5e6-96231b3b80d8
- Make targetlowering.h fit in 80 cols.
- Make LowerAsmOperandForConstraint const.
- Make lowerXConstraint -> LowerXConstraint
- Make LowerXConstraint return a const char* instead of taking a string byref.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50312 91177308-0d34-0410-b5e6-96231b3b80d8
to the block that defines their operands. This doesn't work in the
case that the operand is an invoke, because invoke is a terminator
and must be the last instruction in a block.
Replace it with support in SelectionDAGISel for copying struct values
into sequences of virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50279 91177308-0d34-0410-b5e6-96231b3b80d8
function, and then use it to fix a bug in SplitVectorOp that expected inserts
to always have constant insertion indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50273 91177308-0d34-0410-b5e6-96231b3b80d8
LegalizeTypes. Correct the load logic so
that it actually works, and also teach it
to handle floating point extending loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49923 91177308-0d34-0410-b5e6-96231b3b80d8
rather than having it suck them out of a node. Add
a bunch of new libcalls, and remove dead softfloat
code (dead, because FloatToInt is used not Expand
in this case). Note that indexed stores probably
aren't handled properly, likewise for loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49915 91177308-0d34-0410-b5e6-96231b3b80d8
Rename SDOperandImpl back to SDOperand.
Introduce the SDUse class that represents a use of the SDNode referred by
an SDOperand. Now it is more similar to Use/Value classes.
Patch is approved by Dan Gohman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49795 91177308-0d34-0410-b5e6-96231b3b80d8
ScheduleDAG; they don't correspond to any actual instructions so they
don't need to be scheduled.
This fixes a bug where the EntryToken was being scheduled multiple
times in some cases, though it ended up not causing any trouble because
EntryToken doesn't expand into anything. With this fixed the schedulers
reliably schedule the expected number of units, so we can check this
with an assertion.
This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it
ends up getting scheduled differently in a trivial way, though it was
enough to fool the prcontext+grep that the test does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49701 91177308-0d34-0410-b5e6-96231b3b80d8