in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in
order to resolve the following issues with fmuladd (i.e. optional FMA)
intrinsics:
1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd
intrinsics even if the subtarget does not support FMA instructions, leading
to laughably bad code generation in some situations.
2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128,
resulting in a call to a software fp128 FMA implementation.
3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types
like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize,
etc. to types that support hardware FMAs.
The function has also been slightly renamed for consistency and to force a
merge/build conflict for any out-of-tree target implementing it. To resolve,
see comments and fixed in-tree examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185956 91177308-0d34-0410-b5e6-96231b3b80d8
The stack coloring pass has code to delete stores and loads that become
trivially dead after coloring. Extend it to cope with single instructions
that copy from one frame index to another.
The testcase happens to show an example of this kicking in at the moment.
It did occur in Real Code too though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185705 91177308-0d34-0410-b5e6-96231b3b80d8
SystemZ wants normal register scavenging slots, as close to the stack or
frame pointer as possible. The only reason it was using custom code was
because PrologEpilogInserter assumed an x86-like layout, where the frame
pointer is at the opposite end of the frame from the stack pointer.
This meant that when frame pointer elimination was disabled,
the slots ended up being as close as possible to the incoming
stack pointer, which is the opposite of what we want on SystemZ.
This patch adds a new knob to say which layout is used and converts
SystemZ to use target-independent scavenging slots. It's one of the pieces
needed to support frame-to-frame MVCs, where two slots might be required.
The ABI requires us to allocate 160 bytes for calls, so one approach
would be to use that area as temporary spill space instead. It would need
some surgery to make sure that the slot isn't live across a call though.
I stuck to the "isFPCloseToIncomingSP - ..." style comment on the
"do what the surrounding code does" principle. The FP case is already
covered by several Systemz/frame-* tests, which fail without the
PrologueEpilogueInserter change, so no new ones are needed.
No behavioural change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185696 91177308-0d34-0410-b5e6-96231b3b80d8
*NOTE* In a recent version of posix, they added the restrict keyword to the
arguments for this function. From some spelunking it seems that on some
platforms, the call has restrict on its arguments and others it does not. Thus I
left off the restrict keyword from the function prototype in the comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185501 91177308-0d34-0410-b5e6-96231b3b80d8
This allows getDebugThreadLocalSymbol to return a generic MCExpr
instead of just a MCSymbolRefExpr.
This is in preparation for supporting debug info for TLS variables
on PowerPC, where we need to describe the variable location using
a more complex expression than just MCSymbolRefExpr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185460 91177308-0d34-0410-b5e6-96231b3b80d8
Based on GCC's output for TLS variables (OP_constNu, x@dtpoff,
OP_lo_user), this implements debug info support for TLS in ELF. Verified
that this output is correct/sufficient on Linux (using gold - if you're
using binutils-ld, you'll need something with the fix for
http://sourceware.org/bugzilla/show_bug.cgi?id=15685 in it).
Support on non-ELF is sort of "arbitrary" at the moment - if Apple folks
want to discuss (or just go ahead & implement) how this should work in
MachO, etc, I'm open.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185203 91177308-0d34-0410-b5e6-96231b3b80d8
This patch modifies TableGen to generate a function in
${TARGET}GenInstrInfo.inc called getNamedOperandIdx(), which can be used
to look up indices for operands based on their names.
In order to activate this feature for an instruction, you must set the
UseNamedOperandTable bit.
For example, if you have an instruction like:
def ADD : TargetInstr <(outs GPR:$dst), (ins GPR:$src0, GPR:$src1)>;
You can look up the operand indices using the new function, like this:
Target::getNamedOperandIdx(Target::ADD, Target::OpName::dst) => 0
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src0) => 1
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src1) => 2
The operand names are case sensitive, so $dst and $DST are considered
different operands.
This change is useful for R600 which has instructions with a large number
of operands, many of which model single bit instruction configuration
values. These configuration bits are common across most instructions,
but may have a different operand index depending on the instruction type.
It is useful to have a convenient way to look up the operand indices,
so these bits can be generically set on any instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184879 91177308-0d34-0410-b5e6-96231b3b80d8
Frame index handling is now target-agnostic, so delete the target hooks
for creation & asm printing of target-specific addressing in DBG_VALUEs
and any related functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184067 91177308-0d34-0410-b5e6-96231b3b80d8
Replace the ill-defined MinLatency and ILPWindow properties with
with straightforward buffer sizes:
MCSchedMode::MicroOpBufferSize
MCProcResourceDesc::BufferSize
These can be used to more precisely model instruction execution if desired.
Disabled some misched tests temporarily. They'll be reenabled in a few commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184032 91177308-0d34-0410-b5e6-96231b3b80d8
building outside projects with a different compiler than that used to build
LLVM itself (eg switching between gcc and clang).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183650 91177308-0d34-0410-b5e6-96231b3b80d8
Account for the cost of scaling factor in Loop Strength Reduce when rating the
formulae. This uses a target hook.
The default implementation of the hook is: if the addressing mode is legal, the
scaling factor is free.
<rdar://problem/13806271>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183045 91177308-0d34-0410-b5e6-96231b3b80d8
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.
In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183020 91177308-0d34-0410-b5e6-96231b3b80d8
When -ffast-math is in effect (on Linux, at least), clang defines
__FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the
preprocessor to include <bits/math-finite.h>, which renames the sqrt functions.
For instance, "sqrt" is renamed as "__sqrt_finite".
This patch adds the 3 new names in such a way that they will be treated
as equivalent to their respective original names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182739 91177308-0d34-0410-b5e6-96231b3b80d8
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182703 91177308-0d34-0410-b5e6-96231b3b80d8
This lane mask provides information about which register lanes
completely cover super-registers. See the block comment before
getCoveringLanes().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182034 91177308-0d34-0410-b5e6-96231b3b80d8
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.
I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181680 91177308-0d34-0410-b5e6-96231b3b80d8
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179940 91177308-0d34-0410-b5e6-96231b3b80d8
variant/dialect. Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179804 91177308-0d34-0410-b5e6-96231b3b80d8
The code in getTypeConversion attempts to promote the element vector type
before it trys to split or widen the vector.
After it failed finding a legal vector type by promoting it would continue using
the promoted vector element type. Thereby missing legal splitted vector types.
For example the type v32i32 that has a legal split of 4 x v3i32 on x86/sse2
would be transformed to: v32i256 and from there on successively split to:
v16i256, v8i256, v1i256 and then finally ends up as an i64 type.
By resetting the vector element type to the original vector element type that
existed before the promotion the code will attempt to split the vector type to
smaller vector widths of the same type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178999 91177308-0d34-0410-b5e6-96231b3b80d8
This comment documents the current behavior of the ARM implementation of this
callback, and also the soon-to-be-committed PPC version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178959 91177308-0d34-0410-b5e6-96231b3b80d8
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178917 91177308-0d34-0410-b5e6-96231b3b80d8
Don't require instructions to inherit Sched<...>. Sometimes it is more
convenient to say:
let SchedRW = ... in {
...
}
Which is now possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177199 91177308-0d34-0410-b5e6-96231b3b80d8
See the Mips16ISetLowering.cpp patch to see a use of this.
For now now the extra code in Mips16ISetLowering.cpp is a nop but is
used for test purposes. Mips32 registers are setup and then removed and
then the Mips16 registers are setup.
Normally you need to add register classes and then call
computeRegisterProperties.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177120 91177308-0d34-0410-b5e6-96231b3b80d8
This allows abitrary groups of processor resources. Using something in
a subset automatically counts againts the superset. Currently, this
only works if the superset is also a ProcResGroup as opposed to a
SuperUnit.
This allows SandyBridge to be expressed naturally, which will be
checked in shortly.
def SBPort01 : ProcResGroup<[SBPort0, SBPort1]>;
def SBPort15 : ProcResGroup<[SBPort1, SBPort5]>;
def SBPort23 : ProcResGroup<[SBPort2, SBPort3]>;
def SBPort015 : ProcResGroup<[SBPort0, SBPort1, SBPort5]>;
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177112 91177308-0d34-0410-b5e6-96231b3b80d8
Add the current PEI register scavenger as a parameter to the
processFunctionBeforeFrameFinalized callback.
This change is necessary in order to allow the PowerPC target code to
set the register scavenger frame index after the save-area offset
adjustments performed by processFunctionBeforeFrameFinalized. Only
after these adjustments have been made is it possible to estimate
the size of the stack frame.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177108 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't reset all of the target options within the TargetOptions
object. This is because some of those are ABI-specific and must be determined if
it's okay to change those on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176986 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds many more functions to the target library information.
All of the functions being added were discovered while doing the migration
of the simplify-libcalls attribute annotation functionality to the
functionattrs pass. As a part of that work the attribute annotation logic
will query TLI to determine if a function should be annotated or not.
Signed-off-by: Meador Inge <meadori@codesourcery.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176514 91177308-0d34-0410-b5e6-96231b3b80d8
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
but TLI.getShiftAmountTy() so far only return scalar type. As a
result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
return target-specificed scalar type or the same vector type as the
1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176364 91177308-0d34-0410-b5e6-96231b3b80d8
to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.
There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.
The refactoring was OK'd by Anton Korobeynikov on llvmdev.
Note: this touches the target interfaces, so out-of-tree targets may
be affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175788 91177308-0d34-0410-b5e6-96231b3b80d8