Added fp register clobbering during calls.
Added AsmPrinter support for "fmask", a bitmask that indicates where on the
stack the fp callee saved registers are.
Fixed the stack frame layout for Mips, now the callee saved regs
are in the right stack location (a little documentation about how this
stack frame must look like is present in MipsRegisterInfo.cpp).
This was done using the method MipsRegisterInfo::adjustMipsStackFrame
To be more clear, these are examples of what is solves :
1) FP and RA are also callee saved, and despite they aren't in CSI they
must be saved before the fp callee saved registers.
2) The ABI requires that local varibles are allocated before the callee
saved register area, the opposite behavior from the default allocation.
3) CPU and FPU saved register area must be aligned independent of each
other.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54403 91177308-0d34-0410-b5e6-96231b3b80d8
- Add a basic machine-level dead block eliminator.
These two have to go together, since many other parts of the code generator are unable to handle the unreachable blocks otherwise created.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54333 91177308-0d34-0410-b5e6-96231b3b80d8
aren't used anyway, they also used to broke compiling when fastcc was specified for a
function, but not anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54316 91177308-0d34-0410-b5e6-96231b3b80d8
Added hi,lo registers to be used,def implicitly. This provides better handle of
instructions which use hi/lo.
Fixes a small BranchAnalysis bug
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54274 91177308-0d34-0410-b5e6-96231b3b80d8
switches use the binary search algorithm) for
environments that don't support it. PPC64 JIT
is such an environment; turn the flag on for that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54248 91177308-0d34-0410-b5e6-96231b3b80d8
subreg form on x86-64, to avoid the problem with x86-32
having GPRs that don't have 8-bit subregs.
Also, change several 16-bit instructions to use
equivalent 32-bit instructions. These have a smaller
encoding and avoid partial-register updates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54223 91177308-0d34-0410-b5e6-96231b3b80d8
which is represented in codegen as an 'and' operation. This matches them
with movz instructions, instead of leaving them to be matched by and
instructions with an immediate field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54147 91177308-0d34-0410-b5e6-96231b3b80d8
parallel its analogue, Value::value_use_iterator. The operator* method
now returns the user, rather than the use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54127 91177308-0d34-0410-b5e6-96231b3b80d8
mmx needs its own fancy shuffle logic based on unpack; for now we get correct but awful code.
Also commit Mon Ping's VSETCC patch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54039 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the GetResultInst instruction. It is still accepted in LLVM assembly
and bitcode, where it is now auto-upgraded to ExtractValueInst. Also, remove
support for return instructions with multiple values. These are auto-upgraded
to use InsertValueInst instructions.
The IRBuilder still accepts multiple-value returns, and auto-upgrades them
to InsertValueInst instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53941 91177308-0d34-0410-b5e6-96231b3b80d8
that include useful information like the name of the
block being viewed and the current phase of compilation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53872 91177308-0d34-0410-b5e6-96231b3b80d8
Added gp_rel relocations to support addressing small section contents.
Added command line to specify small section threshold in bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53869 91177308-0d34-0410-b5e6-96231b3b80d8
generic SDNode's (nodes with their own constructors
should do sanity checking in the constructor). Add
sanity checks for BUILD_VECTOR and fix all the places
that were producing bogus BUILD_VECTORs, as found by
"make check". My favorite is the BUILD_VECTOR with
only two operands that was being used to build a
vector with four elements!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53850 91177308-0d34-0410-b5e6-96231b3b80d8
returns a node with the right number of
return values. This fixes codegen of
Generic/cast-fp.ll, Generic/fp_to_int.ll
and PowerPC/multiple-return-values.ll
when using -march=ppc32 -mattr=+64bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53794 91177308-0d34-0410-b5e6-96231b3b80d8
multiply to be done as unsigned, so that they have well defined
behavior on overflow. This fixes PR2408.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53767 91177308-0d34-0410-b5e6-96231b3b80d8
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.
Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.
This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.
These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53728 91177308-0d34-0410-b5e6-96231b3b80d8
Added HasABICall and HasAbsoluteCall (equivalent to gcc -mabicall and
-mno-shared). HasAbsoluteCall is not implemented but HasABICall is the
default for o32 ABI. Now, both should help into a more accurate
relocation types implementation.
Added IsLinux is needed to choose between asm directives.
Instruction name strings cleanup.
AsmPrinter improved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53551 91177308-0d34-0410-b5e6-96231b3b80d8
This is a question of the debugging setup code not
being called at the right time, and it's called from
target-dependent code for some reason. I have only
attempted to fix Darwin, but I'm pretty sure it's
broken elsewhere; I'll leave that to people who can
test it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53254 91177308-0d34-0410-b5e6-96231b3b80d8
MachineMemOperands. The pools are owned by MachineFunctions.
This drastically reduces the number of calls to malloc/free made
during the "Emit" phase of scheduling, as well as later phases
in CodeGen. Combined with other changes, this speeds up the
"instruction selection" phase of CodeGen by 10% in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53212 91177308-0d34-0410-b5e6-96231b3b80d8
important.
- Cleanup in the Subtarget info with addition of new features, not all support
yet, but they allow the future inclusion of features easier. Among new features,
we have : Arch family info (mips1, mips2, ...), ABI info (o32, eabi), 64-bit
integer
and float registers, allegrex vector FPU (VFPU), single float only support.
- TargetMachine now detects allegrex core.
- Added allegrex (Mips32r2) sext_inreg instructions.
- *Added Float Point Instructions*, handling single float only, and
aliased accesses for 32-bit FPUs.
- Some cleanup in FP instruction formats and FP register classes.
- Calling conventions improved to support mips 32-bit EABI.
- Added Asm Printer support for fp cond codes.
- Added support for sret copy to a return register.
- EABI support added into LowerCALL and FORMAL_ARGS.
- MipsFunctionInfo now keeps a virtual register per function to track the
sret on function entry until function ret.
- MipsInstrInfo FP support into methods (isMoveInstr, isLoadFromStackSlot, ...),
FP cond codes mapping and initial FP Branch Analysis.
- Two new Mips SDNode to handle fp branch and compare instructions : FPBrcond,
FPCmp
- MipsTargetLowering : handling different FP classes, Allegrex support, sret
return copy, no homing location within EABI, non 32-bit stack objects
arguments, and asm constraint for float.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53146 91177308-0d34-0410-b5e6-96231b3b80d8
hook for each way in which a result type can be
legalized (promotion, expansion, softening etc),
just use one: ReplaceNodeResults, which returns
a node with exactly the same result types as the
node passed to it, but presumably with a bunch of
custom code behind the scenes. No change if the
new LegalizeTypes infrastructure is not turned on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53137 91177308-0d34-0410-b5e6-96231b3b80d8
moves in order to get correct debug info. Since
I can't imagine how any target could possibly
be any different, I've just stripped out the
option: now all the world's like Darwin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53134 91177308-0d34-0410-b5e6-96231b3b80d8
- Also remove LiveVariables::instructionChanged, etc. Replace all calls with cheaper calls which update VarInfo kill list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53097 91177308-0d34-0410-b5e6-96231b3b80d8
Also, if LV isn't around, then TwoAddr doesn't need to be updating flags, since they won't have been set in the first place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53058 91177308-0d34-0410-b5e6-96231b3b80d8
to be passed the list of value types, and use this
where appropriate. Inappropriate places are where
the value type list is already known and may be
long, in which case the existing method is more
efficient.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@53035 91177308-0d34-0410-b5e6-96231b3b80d8
the need for a flavor operand, and add a new SDNode subclass,
LabelSDNode, for use with them to eliminate the need for a label id
operand.
Change instruction selection to let these label nodes through
unmodified instead of creating copies of them. Teach the MachineInstr
emitter how to emit a MachineInstr directly from an ISD label node.
This avoids the need for allocating SDNodes for the label id and
flavor value, as well as SDNodes for each of the post-isel label,
label id, and label flavor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52943 91177308-0d34-0410-b5e6-96231b3b80d8
purpose, and give it a custom SDNode subclass so that it doesn't
need to have line number, column number, filename string, and
directory string, all existing as individual SDNodes to be the
operands.
This was the only user of ISD::STRING, StringSDNode, etc., so
remove those and some associated code.
This makes stop-points considerably easier to read in
-view-legalize-dags output, and reduces overhead (creating new
nodes and copying std::strings into them) on code containing
debugging information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52924 91177308-0d34-0410-b5e6-96231b3b80d8
SmallVectors. Change the signature of TargetLowering::LowerArguments
to avoid returning a vector by value, and update the two targets
which still use this directly, Sparc and IA64, accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52917 91177308-0d34-0410-b5e6-96231b3b80d8
it impossible to create a MERGE_VALUES node with
only one result: sometimes it is useful to be able
to create a node with only one result out of one of
the results of a node with more than one result, for
example because the new node will eventually be used
to replace a one-result node using ReplaceAllUsesWith,
cf X86TargetLowering::ExpandFP_TO_SINT. On the other
hand, most users of MERGE_VALUES don't need this and
for them the optimization was valuable. So add a new
utility method getMergeValues for creating MERGE_VALUES
nodes which by default performs the optimization.
Change almost everywhere to use getMergeValues (and
tidy some stuff up at the same time).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52893 91177308-0d34-0410-b5e6-96231b3b80d8
<16 x float> is 64-byte aligned (for some reason),
which gets us into the stack realignment code. The
computation changing FP-relative offsets to SP-relative
was broken, assiging a spill temp to a location
also used for parameter passing. This
fixes it by rounding up the stack frame to a multiple
of the largest alignment (I concluded it wasn't fixable
without doing this, but I'm not very sure.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52750 91177308-0d34-0410-b5e6-96231b3b80d8
shift.
- Add a readme entry for a missing vector_shuffle optimization that results in
awful codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52740 91177308-0d34-0410-b5e6-96231b3b80d8
InvalidateInstructionCache method instead of calling through
a hook on the JIT. This is a host feature, not a target feature.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52734 91177308-0d34-0410-b5e6-96231b3b80d8
Added abstract class MemSDNode for any Node that have an associated MemOperand
Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and
atomic.lss => atomic.load.sub
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52706 91177308-0d34-0410-b5e6-96231b3b80d8
shuffle could be skipped. The check is invalid because the loop index i
doesn't correspond to the element actually inserted. The correct check is
already done a few lines earlier, for whether the element is already in
the right spot, so this shouldn't have any effect on the codegen for
code that was already correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52486 91177308-0d34-0410-b5e6-96231b3b80d8
wrong for volatile loads and stores. In fact this
is almost all of them! There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access. These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used. Consider
loading an i32 but only using the lower 8 bits. It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes. It
is also unwise to make a load/store wider. For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects. (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware. (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several. For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores). In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic. My policy here is
to say that the number of processor operations for
an illegal operation is undefined. So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok. It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal! That is because
operations are marked legal by default, regardless of
whether the type is legal or not. In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation. However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal. So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before. This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52254 91177308-0d34-0410-b5e6-96231b3b80d8
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits. Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52098 91177308-0d34-0410-b5e6-96231b3b80d8
and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52044 91177308-0d34-0410-b5e6-96231b3b80d8
MUL is not anymore directly matched because its a pseudoinstruction.
LogicI class fixed to zero-extend immediates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52036 91177308-0d34-0410-b5e6-96231b3b80d8
are the same as in unpacked structs, only field
positions differ. This only matters for structs
containing x86 long double or an apint; it may
cause backwards compatibility problems if someone
has bitcode containing a packed struct with a
field of one of those types.
The issue is that only 10 bytes are needed to
hold an x86 long double: the store size is 10
bytes, but the ABI size is 12 or 16 bytes (linux/
darwin) which comes from rounding the store size
up by the alignment. Because it seemed silly not
to pack an x86 long double into 10 bytes in a
packed struct, this is what was done. I now
think this was a mistake. Reserving the ABI size
for an x86 long double field even in a packed
struct makes things more uniform: the ABI size is
now always used when reserving space for a type.
This means that developers are less likely to
make mistakes. It also makes life easier for the
CBE which otherwise could not represent all LLVM
packed structs (PR2402).
Front-end people might need to adjust the way
they create LLVM structs - see following change
to llvm-gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51928 91177308-0d34-0410-b5e6-96231b3b80d8