For now this is distinct from isCodeGenOnly, as code-gen-only
instructions can (and often do) still have encoding information
associated with them. Once we've migrated all of them over to true
pseudo-instructions that are lowered to real instructions prior to
the printer/emitter, we can remove isCodeGenOnly and just use isPseudo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134539 91177308-0d34-0410-b5e6-96231b3b80d8
itineraries.
- Refactor TargetSubtarget to be based on MCSubtargetInfo.
- Change tablegen generated subtarget info to initialize MCSubtargetInfo
and hide more details from targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134257 91177308-0d34-0410-b5e6-96231b3b80d8
be encoded as the first feature. It then uses the CPU name to look up
features / scheduling itinerary even though clients know full well the CPU name
being used to query these properties.
The fix is to just have the clients explicitly pass the CPU name!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134127 91177308-0d34-0410-b5e6-96231b3b80d8
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Changed
TargetInstrInfo so it's based off MCInstrInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134021 91177308-0d34-0410-b5e6-96231b3b80d8
target machine from those that are only needed by codegen. The goal is to
sink the essential target description into MC layer so we can start building
MC based tools without needing to link in the entire codegen.
First step is to refactor TargetRegisterInfo. This patch added a base class
MCRegisterInfo which TargetRegisterInfo is derived from. Changed TableGen to
separate register description from the rest of the stuff.
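As a rough sketch of the layering this starts (class names from this log; the
comments are illustrative assumptions, not the actual LLVM headers):

class MCRegisterInfo {
  // TableGen-generated static register descriptions live here, usable
  // by MC-based tools without linking the rest of codegen.
};

class TargetRegisterInfo : public MCRegisterInfo {
  // Codegen-only register queries stay in the derived class.
};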
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133782 91177308-0d34-0410-b5e6-96231b3b80d8
If the linker supports it, this will hold the CIE and FDE information in a
compact format. The implementation of the compact unwinding emission is coming
soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133658 91177308-0d34-0410-b5e6-96231b3b80d8
A RegisterTuples instance is used to synthesize super-registers by
zipping together lists of sub-registers. This is useful for generating
pseudo-registers representing register sequence constraints like 'two
consecutive GPRs', or 'an even-odd pair of floating point registers'.
The RegisterTuples def can be used in register set operations when
building register classes. That is the only way of accessing the
synthesized super-registers.
For example, the ARM QQ register class of pseudo-registers could have
been formed like this:
// Form pairs Q0_Q1, Q2_Q3, ...
def QQPairs : RegisterTuples<[qsub_0, qsub_1],
                             [(decimate QPR, 2),
                              (decimate (shl QPR, 1), 2)]>;
def QQ : RegisterClass<..., (add QQPairs)>;
Similarly, pseudo-registers representing '3 consecutive D-regs with
wraparound' look like:
// Form D0_D1_D2, D1_D2_D3, ..., D30_D31_D0, D31_D0_D1.
def DSeqTriples : RegisterTuples<[dsub_0, dsub_1, dsub_2],
                                 [(rotl DPR, 0),
                                  (rotl DPR, 1),
                                  (rotl DPR, 2)]>;
TableGen automatically computes aliasing information for the synthesized
registers.
Register tuples are still somewhat experimental. We still need to see
how they interact with MC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133407 91177308-0d34-0410-b5e6-96231b3b80d8
Targets that need to change the default allocation order should use the
AltOrders mechanism instead. See the X86 and ARM targets for examples.
The allocation_order_begin() and allocation_order_end() methods have been
replaced with getRawAllocationOrder(), and there are further support
functions in RegisterClassInfo.
It is no longer possible to insert arbitrary code into generated
register classes. This is a feature.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133332 91177308-0d34-0410-b5e6-96231b3b80d8
A register class can define AltOrders and AltOrderSelect instead of
defining method protos and bodies. The AltOrders lists can be defined
with set operations, and TableGen can verify that the alternative
allocation orders only contain valid registers.
This is currently an opt-in feature, and it is still possible to
override allocation_order_begin/end. That will not be true for long.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133320 91177308-0d34-0410-b5e6-96231b3b80d8
The LSDA is a bit difficult for the uninitiated to read. Even with comments,
it's not always clear what's going on. This wraps the ASM streamer in a class
that retains the LSDA and then emits a human-readable description of what's
going on in it.
So instead of having to make sense of:
Lexception1:
.byte 255
.byte 155
.byte 168
.space 1
.byte 3
.byte 26
Lset0 = Ltmp7-Leh_func_begin1
.long Lset0
Lset1 = Ltmp812-Ltmp7
.long Lset1
Lset2 = Ltmp913-Leh_func_begin1
.long Lset2
.byte 3
Lset3 = Ltmp812-Leh_func_begin1
.long Lset3
Lset4 = Leh_func_end1-Ltmp812
.long Lset4
.long 0
.byte 0
.byte 1
.byte 0
.byte 2
.byte 125
.long __ZTIi@GOTPCREL+4
.long __ZTIPKc@GOTPCREL+4
you can read this instead:
## Exception Handling Table: Lexception1
## @LPStart Encoding: omit
## @TType Encoding: indirect pcrel sdata4
## @TType Base: 40 bytes
## @CallSite Encoding: udata4
## @Action Table Size: 26 bytes
## Action 1:
## A throw between Ltmp7 and Ltmp812 jumps to Ltmp913 on an exception.
## For type(s): __ZTIi@GOTPCREL+4 __ZTIPKc@GOTPCREL+4
## Action 2:
## A throw between Ltmp812 and Leh_func_end1 does not have a landing pad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133286 91177308-0d34-0410-b5e6-96231b3b80d8
Also switch the return type to ArrayRef<unsigned> which works out nicely
for ARM's implementation of this function because of the clever ArrayRef
constructors.
The name change indicates that the returned allocation order may contain
reserved registers as has been the case for a while.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133216 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to support using REG_SEQUENCE SDNode's with type MVT::untyped, and is part of the long road to eliminating some of the hacks we currently use to support register pairs and other strange constraints, particularly on ARM NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133178 91177308-0d34-0410-b5e6-96231b3b80d8
This virtual function will replace allocation_order_begin/end as the one
to override when implementing custom allocation orders. It is simpler to
have one function return an ArrayRef than having two virtual functions
computing different ends of the same array.
Use getRawAllocationOrder() in place of allocation_order_begin() where
it makes sense, but leave some clients that look like they really want
the filtered allocation orders from RegisterClassInfo.
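A minimal self-contained sketch of the design point, using a stand-in for
llvm::ArrayRef (the names and register numbers below are illustrative, not
the real declarations):

#include <cstddef>
#include <cstdio>

// Just enough of an ArrayRef stand-in for range-for iteration.
struct RegArray {
  const unsigned *Data;
  size_t Size;
  const unsigned *begin() const { return Data; }
  const unsigned *end() const { return Data + Size; }
};

struct RegisterClassBase {
  // One virtual function returning pointer + length replaces a
  // begin()/end() pair that had to agree on the same hidden array.
  virtual RegArray getRawAllocationOrder() const = 0;
  virtual ~RegisterClassBase() {}
};

struct GPR : RegisterClassBase {
  RegArray getRawAllocationOrder() const override {
    static const unsigned Order[] = {1, 2, 3, 4};
    return {Order, 4};
  }
};

int main() {
  GPR RC;
  for (unsigned Reg : RC.getRawAllocationOrder())
    std::printf("reg %u\n", Reg);
}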
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133170 91177308-0d34-0410-b5e6-96231b3b80d8
This simplifies many of the target description files since it is common
for register classes to be related or contain sequences of numbered
registers.
I have verified that this doesn't change the files generated by TableGen
for ARM and X86. It alters the allocation order of MBlaze GPR and Mips
FGR32 registers, but I believe the change is benign.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133105 91177308-0d34-0410-b5e6-96231b3b80d8
At the time I wrote this code (circa 2007), TargetRegisterInfo was using a std::set to perform these queries. Switching to the static hashtables was an obvious improvement, but in reality there's no reason to do anything other than scan.
With this change, total LLC time on a whole-program 403.gcc is reduced by approximately 1.5%, almost all of which comes from a 15% reduction in LiveVariables time. It also reduces the binary size of LLC by 86KB, thanks to eliminating a bunch of very large static tables.
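A self-contained sketch of the "just scan" approach: alias lists are short,
zero-terminated arrays, so a linear scan wins and needs no static tables
(the register numbers here are made up):

#include <cstdio>

// Linear scan over a zero-terminated alias list.
static bool regsOverlap(const unsigned *AliasList, unsigned Reg) {
  for (const unsigned *A = AliasList; *A; ++A)
    if (*A == Reg)
      return true;
  return false;
}

int main() {
  static const unsigned EAXAliases[] = {1 /*AX*/, 2 /*AH*/, 3 /*AL*/, 0};
  std::printf("%d\n", regsOverlap(EAXAliases, 2)); // prints 1
}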
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133051 91177308-0d34-0410-b5e6-96231b3b80d8
Make the hash tables as small as possible while ensuring that all
lookups can be done in less than 8 probes.
Cut the aliases hash table in half by only storing a < b pairs - it
is a symmetric relation.
Use larger multipliers on the initial hash function to ensure that it
properly covers the whole table, and to resolve some clustering in the
very regular ARM register bank.
This reduces the size of most of these tables by 4x - 8x. For instance,
the ARM tables shrink from 48 KB to 8 KB.
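A sketch of the halving trick under simple assumptions (a std::set stands in
for the generated hash table, and the pairs are made up):

#include <cstdio>
#include <set>
#include <utility>

// Aliasing is symmetric, so only pairs with a < b are stored;
// lookups canonicalize the query before probing.
static const std::set<std::pair<unsigned, unsigned>> AliasPairs = {
  {1, 2}, {1, 3},
};

static bool regsAlias(unsigned A, unsigned B) {
  if (A == B)
    return true;
  if (A > B)
    std::swap(A, B);
  return AliasPairs.count({A, B}) != 0;
}

int main() {
  std::printf("%d %d\n", regsAlias(2, 1), regsAlias(2, 3)); // 1 0
}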
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132888 91177308-0d34-0410-b5e6-96231b3b80d8
Besides moving structural computations to CodeGenRegisters.cpp, this
also well-defines the order of these lists:
- Sub-register lists come from a pre-order traversal of the graph
defined by the SubRegs lists in the .td files.
- Super-register lists are topologically ordered so no register comes
before any of its sub-registers. When the sub-register graph is not a
tree, independent super-registers appear in numerical order.
- Lists of overlapping registers are ordered according to register
number.
This reverses the order of the super-regs lists, but nobody was
depending on that. The previous order of the overlaps lists was odd, and
it may have depended on the precise behavior of std::stable_sort.
The old computations are still there, but will be removed shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132881 91177308-0d34-0410-b5e6-96231b3b80d8
Some register classes are only used for instruction operand constraints.
They should never be used for virtual registers. Previously, those
register classes were given an empty allocation order, but now you can
say 'let isAllocatable=0' in the register class definition.
TableGen calculates if a register is part of any allocatable register
class, and makes that information available in TargetRegisterDesc::inAllocatableClass.
The goal here is to eliminate use cases for overriding allocation_order_*
methods.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132508 91177308-0d34-0410-b5e6-96231b3b80d8
Add TargetRegisterInfo::hasSubClassEq and use it to check for compatible
register classes instead of trying to list all register classes in
X86's getLoadStoreRegOpcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132398 91177308-0d34-0410-b5e6-96231b3b80d8
patch we add a flag to enable a new type legalization decision - to promote
integer elements in vectors. Currently, the rest of the codegen does not support
this kind of legalization. This flag will be removed when the transition is
complete.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132394 91177308-0d34-0410-b5e6-96231b3b80d8
same dwarf number. This will be used for creating a dwarf number to register
mapping.
The only case that needs this so far is the XMM/YMM registers that unfortunately
do have the same numbers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132314 91177308-0d34-0410-b5e6-96231b3b80d8
This patch does not change the behavior of the type legalizer. The codegen
produces the same code.
This infrastructural change is needed in order to enable complex decisions
for vector types (needed by the vector-select patch).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132263 91177308-0d34-0410-b5e6-96231b3b80d8
suffix (e.g. .xdata$myfunc). The suffix part isn't implemented yet, but
I'll get to it in the next patch.
Fix up all callers of the affected functions. Make them pass said suffix to
the function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132205 91177308-0d34-0410-b5e6-96231b3b80d8
LTO friendly as we can now correctly merge files compiled with or without
-fasynchronous-unwind-tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132033 91177308-0d34-0410-b5e6-96231b3b80d8
Add a size alignment check to the .seh_stackalloc directive parser. Add a
more descriptive error message to the .seh_handler directive parser.
Add methods to the TargetAsmInfo struct in support of all this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131992 91177308-0d34-0410-b5e6-96231b3b80d8
scheme uses internally. Implement it for x86 (the only architecture that LLVM
supports for which this matters right now).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131969 91177308-0d34-0410-b5e6-96231b3b80d8
Original log entry:
Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131536 91177308-0d34-0410-b5e6-96231b3b80d8
who used this flag, and it now emits CFI and doesn't emit this anymore. All
other targets left this flag "false".
<rdar://problem/8486371>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130918 91177308-0d34-0410-b5e6-96231b3b80d8
model constants which can be added to base registers via add-immediate
instructions which don't require an additional register to materialize
the immediate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130743 91177308-0d34-0410-b5e6-96231b3b80d8
give it a bit more responsibility. Also implement it for MachO.
If hacked to use cfi, 32 bit MachO will produce
.cfi_personality 155, L___gxx_personality_v0$non_lazy_ptr
and 64 bit will produce
.cfi_personality ___gxx_personality_v0
The general idea is that .cfi_personality gets passed the final symbol. It is
up to codegen to produce it if using indirect representation (like 32 bit
MachO), but it is up to MC to decide which relocations to create.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130341 91177308-0d34-0410-b5e6-96231b3b80d8
The hook will be used by the register allocator when recomputing register
classes after removing constraints.
Thumb1 code doesn't allow anything larger than tGPR, and x86 needs to ensure
that the spill size doesn't change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130228 91177308-0d34-0410-b5e6-96231b3b80d8
These values were not used for anything. Spill size and alignment is a property
of the register class, not the register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129906 91177308-0d34-0410-b5e6-96231b3b80d8
On the x86-64 and thumb2 targets, some registers are more expensive to encode
than others in the same register class.
Add a CostPerUse field to the TableGen register description, and make it
available from TRI->getCostPerUse. This represents the cost of a REX prefix or a
32-bit instruction encoding required by choosing a high register.
Teach the greedy register allocator to prefer cheap registers for busy live
ranges (as indicated by spill weight).
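A sketch of how an allocator might use such a cost (getCostPerUse is the hook
named above; the candidate order and cost policy are made up):

#include <cstdio>

// Made-up per-register encoding costs: 1 for a "high" register whose use
// needs a REX prefix or 32-bit encoding, 0 otherwise.
static unsigned getCostPerUse(unsigned Reg) {
  return Reg >= 8 ? 1 : 0;
}

// For a busy live range, prefer the cheapest candidate in the order.
static unsigned pickPhysReg(const unsigned *Order, unsigned N) {
  unsigned Best = Order[0];
  for (unsigned I = 1; I < N; ++I)
    if (getCostPerUse(Order[I]) < getCostPerUse(Best))
      Best = Order[I];
  return Best;
}

int main() {
  static const unsigned Order[] = {9, 10, 3, 11};
  std::printf("picked r%u\n", pickPhysReg(Order, 4)); // picked r3
}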
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129864 91177308-0d34-0410-b5e6-96231b3b80d8
Add an avoidWriteAfterWrite() target hook to identify register classes that
suffer from write-after-write hazards. For those register classes, try to avoid
writing the same register in two consecutive instructions.
This is currently disabled by default. We should not spill to avoid hazards!
The command line flag -avoid-waw-hazard can be used to enable waw avoidance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129772 91177308-0d34-0410-b5e6-96231b3b80d8
the generated FastISel. X86 doesn't need to generate code to match ADD16ri8
since ADD16ri will do just fine. This is a small codesize win in the generated
instruction selector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129692 91177308-0d34-0410-b5e6-96231b3b80d8
kind of predicate: one that is specific to imm nodes. The predicate function
specified here just checks an int64_t directly instead of messing around with
SDNodes. The virtue of this is that it means that FastISel and other things
can reason about these predicates.
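A sketch of the new predicate kind, using a hypothetical 8-bit-unsigned check:
a plain function of int64_t, with no SDNode in sight, so FastISel can call it
directly:

#include <cstdint>
#include <cstdio>

// The whole predicate: just a check on the immediate value itself.
static bool isUImm8(int64_t Imm) {
  return Imm >= 0 && Imm <= 0xFF;
}

int main() {
  std::printf("%d %d\n", isUImm8(200), isUImm8(300)); // 1 0
}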
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129675 91177308-0d34-0410-b5e6-96231b3b80d8
structure and fix some fixmes. We now have a TreePredicateFn class
that handles all of the decoding of these things. This is an internal
cleanup that has no impact on the code generated by tblgen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129670 91177308-0d34-0410-b5e6-96231b3b80d8
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handling node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.
Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the node latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and now always schedule it as
low as possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129421 91177308-0d34-0410-b5e6-96231b3b80d8
transformations in target-specific DAG combines without causing DAGCombiner to
delete the same node twice. If you know of a better way to avoid this (see my
next patch for an example), please let me know.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128758 91177308-0d34-0410-b5e6-96231b3b80d8
the alias of an InstAlias instead of the thing being aliased, because we need
to know the features that are valid for an InstAlias.
This is part of a work-in-progress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127986 91177308-0d34-0410-b5e6-96231b3b80d8
to have a single return block (at least getting there) for optimizations. This
is general goodness, but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch (x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}
=>
LBB0_2: ## %sw.bb
  callq _f1
  popq %rbp
  ret
LBB0_3: ## %sw.bb1
  callq _f2
  popq %rbp
  ret
LBB0_4: ## %sw.bb3
  callq _f3
  popq %rbp
  ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:
sw.bb7: ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9: ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
  jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
  jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
  jmp _f3 ## TAILCALL
rdar://9147433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
Proof-of-concept code that code-gens a module to an in-memory MachO object.
This will be hooked up to a run-time dynamic linker library (see: llvm-rtdyld
for similarly conceptual work for that part) which will take the compiled
object and link it together with the rest of the system, providing back to the
JIT a table of available symbols which will be used to respond to the
getPointerTo*() queries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127916 91177308-0d34-0410-b5e6-96231b3b80d8
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.
This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
flexible.
If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.
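A sketch of how a caller would interpret the three outcomes (the hook's real
name is cut off in this log, so getCrossCopyClass and the policy below are
hypothetical):

#include <cstdio>

struct TargetRegisterClass { const char *Name; };

static const TargetRegisterClass *getCrossCopyClass(const TargetRegisterClass *RC);

static void planCopy(const TargetRegisterClass *RC) {
  const TargetRegisterClass *Res = getCrossCopyClass(RC);
  if (!Res)
    std::printf("%s: cannot copy registers of this class\n", RC->Name);
  else if (Res == RC)
    std::printf("%s: normal copies suffice\n", RC->Name);
  else
    std::printf("%s: cross-class copies go through %s\n", RC->Name, Res->Name);
}

static const TargetRegisterClass GPR = {"GPR"}, CCR = {"CCR"};

// Made-up policy: condition codes are copied via a GPR.
static const TargetRegisterClass *getCrossCopyClass(const TargetRegisterClass *RC) {
  return RC == &CCR ? &GPR : RC;
}

int main() {
  planCopy(&GPR);
  planCopy(&CCR);
}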
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127368 91177308-0d34-0410-b5e6-96231b3b80d8
regs. This is the only change in this checkin that may affect the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.
Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.
Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127067 91177308-0d34-0410-b5e6-96231b3b80d8
and iprintf is available on the target. Currently iprintf is only
marked as being available on the XCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126935 91177308-0d34-0410-b5e6-96231b3b80d8
A major part of its (eventual) goal is to support a much cleaner separation between disassembly callbacks
provided by the target and the disassembler emitter itself, i.e. not requiring hardcoding of knowledge in tblgen
like the existing disassembly emitters do.
The hope is that some day this will allow us to replace the existing non-Thumb ARM disassembler and remove
some of the hacks the old one introduced to tblgen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125966 91177308-0d34-0410-b5e6-96231b3b80d8
query about available library functions. For now this just has
memset_pattern16, which exists on darwin, but it can be extended for a
bunch of other things in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125965 91177308-0d34-0410-b5e6-96231b3b80d8
Motivation: improve the parsing of unusual operand forms (those different
from registers or immediates).
This commit implements only the generic support. The ARM specific modifications
will come next.
A table like the one below is autogenerated for every instruction
containing a 'ParserMethod' in its AsmOperandClass:
static const OperandMatchEntry OperandMatchTable[20] = {
  /* Mnemonic, Operand List Mask, Operand Class, Features */
  { "cdp", 29 /* 0, 2, 3, 4 */, MCK_Coproc, Feature_IsThumb|Feature_HasV6 },
  { "cdp", 58 /* 1, 3, 4, 5 */, MCK_Coproc, Feature_IsARM },
A matcher function very similar to MatchInstructionImpl (but a lot more
naive) scans the table. After the mnemonic match, the features are
checked and, if the "to be parsed" operand index is present in the
mask, there's a real match. Then, a switch like the one below
dispatches the parsing to the custom method provided in
'ParserMethod':
case MCK_Coproc:
  return TryParseCoprocessorOperandName(Operands);
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125030 91177308-0d34-0410-b5e6-96231b3b80d8
the load, then it may be legal to transform the load and store to integer
load and store of the same width.
This is done if the target deems the transformation profitable, e.g.
on ARM this can transform:
  vldr.32 s0, []
  vstr.32 s0, []
to
  ldr r12, []
  str r12, []
rdar://8944252
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124708 91177308-0d34-0410-b5e6-96231b3b80d8
default implementation for x86, going through the stack in a similar
fashion to how the codegen implements BUILD_VECTOR. Eventually this
will get matched to VINSERTF128 if AVX is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124307 91177308-0d34-0410-b5e6-96231b3b80d8
implementation of EXTRACT_SUBVECTOR for x86, going through the stack
in a similar fashion to how the codegen implements BUILD_VECTOR.
Eventually this will get matched to VEXTRACTF128 if AVX is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124292 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used to check patterns referencing a forthcoming
INSERT_SUBVECTOR SDNode and will also be used to check
EXTRACT_SUBVECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124191 91177308-0d34-0410-b5e6-96231b3b80d8
flags. They are still not enabled in this revision.
Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.
Generalized unit tests to work with sched-cycles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123969 91177308-0d34-0410-b5e6-96231b3b80d8
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.
Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevented CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.
ARM:
1. Teach ARM produceSameValue to look past some PIC labels.
2. Look for operands from different loads of different constant pool entries
which have the same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
to re-materialize the instruction, allow machine LICM to hoist the set of
instructions out of the loop and make it possible to CSE them. It's a bit
hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.
With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
Fix the TargetRegisterInfo::NoRegister places where someone preferred
typing 'TargetRegisterInfo::NoRegister' instead of typing '0'.
Note that TableGen is already emitting xx::NoRegister in xxGenRegisterNames.inc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123140 91177308-0d34-0410-b5e6-96231b3b80d8
The numbering plan is now:
0 NoRegister.
[1;2^30) Physical registers.
[2^30;2^31) Stack slots.
[2^31;2^32) Virtual registers. (With -1u and -2u used by DenseMapInfo.)
Each segment is filled from the left, so any mistaken interpretation should
quickly cause crashes.
FirstVirtualRegister has been removed. TargetRegisterInfo provides predicates
and conversion functions that should be used instead of interpreting register
numbers manually.
It is now legal to pass NoRegister to isPhysicalRegister() and
isVirtualRegister(). The result is false in both cases.
It is quite rare to represent stack slots in this way, so isPhysicalRegister()
and isVirtualRegister() require that isStackSlot() be checked first if it can
possibly return true. This allows a very fast implementation of the common
predicates.
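A self-contained sketch of these predicates, using the segment boundaries
listed above (the assertion mirrors the check-isStackSlot-first rule; this is
illustrative, not the LLVM source):

#include <cassert>
#include <cstdio>

static const unsigned NoRegister = 0;

static bool isStackSlot(unsigned Reg) {
  return Reg >= (1u << 30) && Reg < (1u << 31); // [2^30;2^31)
}

static bool isPhysicalRegister(unsigned Reg) {
  assert(!isStackSlot(Reg) && "check isStackSlot() first");
  return Reg >= 1 && Reg < (1u << 30);          // [1;2^30)
}

static bool isVirtualRegister(unsigned Reg) {
  assert(!isStackSlot(Reg) && "check isStackSlot() first");
  return Reg >= (1u << 31);                     // [2^31;2^32)
}

int main() {
  // NoRegister is a legal input and is neither physical nor virtual.
  std::printf("%d %d\n", isPhysicalRegister(NoRegister),
              isVirtualRegister(NoRegister)); // 0 0
}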
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123137 91177308-0d34-0410-b5e6-96231b3b80d8
physical register numbers.
This makes the hack used in LiveInterval official, and lets LiveInterval be
oblivious of stack slots.
The isPhysicalRegister() and isVirtualRegister() predicates don't know about
this, so when a variable may contain a stack slot, isStackSlot() should always
be tested first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123128 91177308-0d34-0410-b5e6-96231b3b80d8
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123107 91177308-0d34-0410-b5e6-96231b3b80d8
depending on TRI::FirstVirtualRegister.
Also use TRI::printReg instead of printing virtual registers directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123101 91177308-0d34-0410-b5e6-96231b3b80d8
Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow
iteration over virtual registers without depending on the representation of
virtual register numbers.
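A sketch of the resulting iteration idiom, with stand-ins for
MachineRegisterInfo and TRI (the index2VirtReg stand-in follows the
[2^31;2^32) numbering plan from the earlier entry; real code uses the LLVM
functions named above):

#include <cstdio>

struct MockMRI {
  unsigned NumVirtRegs;
  unsigned getNumVirtRegs() const { return NumVirtRegs; }
};

// Stand-in for TRI::index2VirtReg: virtual registers fill the top
// segment from the left.
static unsigned index2VirtReg(unsigned Index) {
  return Index | (1u << 31);
}

int main() {
  MockMRI MRI = {3};
  for (unsigned I = 0, E = MRI.getNumVirtRegs(); I != E; ++I)
    std::printf("%%vreg%u = %u\n", I, index2VirtReg(I));
}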
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123098 91177308-0d34-0410-b5e6-96231b3b80d8
Instead, encode the LLVM IR-level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstr::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.
This allows memory instructions to be moved around INLINEASM instructions.
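A sketch of the encoding idea, with assumed bit positions (only the packing of
both flags into one immediate operand comes from this log; the layout below is
an illustrative assumption):

#include <cstdio>

// Assumed bit assignments for the shared immediate operand.
enum : unsigned {
  ExtraHasSideEffects = 1u << 0,
  ExtraIsAlignStack   = 1u << 1,
};

static bool hasUnmodeledSideEffects(unsigned ExtraInfoOperand) {
  return (ExtraInfoOperand & ExtraHasSideEffects) != 0;
}

int main() {
  unsigned Op = ExtraIsAlignStack; // align-stack asm, no side effects
  std::printf("%d\n", hasUnmodeledSideEffects(Op)); // 0 -> movable
}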
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123044 91177308-0d34-0410-b5e6-96231b3b80d8
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.
The test changes are needed to keep those spill-q tests from testing aligned
spills and restores. If the only aligned stack objects are spill slots, we
no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122995 91177308-0d34-0410-b5e6-96231b3b80d8
etc. takes an optional OptSize argument. If OptSize is true, it returns
the inline limit for functions with the OptSize attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122952 91177308-0d34-0410-b5e6-96231b3b80d8
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.
Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.
Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.
Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry
into the scheduler's available queue.
ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about its SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.
ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122541 91177308-0d34-0410-b5e6-96231b3b80d8
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.
Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does already. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)
<rdar://problem/8782198>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122104 91177308-0d34-0410-b5e6-96231b3b80d8
the MCCodeEmitter, which seems like a better organization.
- Also, cleaned up some magic constants while in the area.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121953 91177308-0d34-0410-b5e6-96231b3b80d8
registers that alias Reg, including itself. This is almost the same as the
existing getAliasSet() method, except for the inclusion of Reg.
The name matches the reflexive TRI::regsOverlap(x, y) relation.
It is very common to do stuff to a register and all its aliases:
stuff(Reg);
for (const unsigned *Alias = TRI->getAliasSet(Reg); *Alias; ++Alias)
  stuff(*Alias);
That can now be written as the simpler:
for (const unsigned *Alias = TRI->getOverlaps(Reg); *Alias; ++Alias)
  stuff(*Alias);
This change requires a bit more constant space for the alias lists because Reg
is included and because the empty alias list cannot be shared any longer.
If the getAliasSet method is eventually removed, this space can be reclaimed by
sharing overlap lists. For instance, %rax and %eax have identical overlap sets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121800 91177308-0d34-0410-b5e6-96231b3b80d8
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120501 91177308-0d34-0410-b5e6-96231b3b80d8
MCStreamer instead of just MCObjectStreamer. Address changes cannot
be as efficient, as we have to use DW_LNE_set_address, but at least
most of the logic is shared.
This will be used so that, with CodeGen still using EmitDwarfLocDirective,
llvm-gcc is able to produce debug_line sections without needing an
assembler that supports .loc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119777 91177308-0d34-0410-b5e6-96231b3b80d8
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM, but the isel hacks were preventing that.
Instead, let the peephole optimization pass recognize registers that are
defined by immediates, and let the ARM target hook fold the immediates in.
Other changes include: 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out; machine sink would have sunk the computation, and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.
2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.
rdar://8663787, rdar://8241368
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
operand list instead of the operand list redundantly declared on the alias
or instruction.
With this change, we finally remove the ins/outs list on the alias. Before:
def : InstAlias<(outs GR16:$dst), (ins GR8 :$src),
                "movsx $src, $dst",
                (MOVSX16rr8W GR16:$dst, GR8:$src)>;
After:
def : InstAlias<"movsx $src, $dst",
(MOVSX16rr8W GR16:$dst, GR8:$src)>;
This also makes the alias mechanism more general and powerful, which will
be exploited in subsequent patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118329 91177308-0d34-0410-b5e6-96231b3b80d8
with a SimpleValueType, while an EVT supports equality and
inequality comparisons with SimpleValueType.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118169 91177308-0d34-0410-b5e6-96231b3b80d8
value type, so there is no point in passing it around using
an EVT. Use the simpler MVT everywhere. Rather than trying
to propagate this information maximally in all the code that
uses the calling convention stuff, I chose to make a mainly
low-impact change instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118167 91177308-0d34-0410-b5e6-96231b3b80d8
1. Fix the pre-ra scheduler so it doesn't try to push instructions above calls to
"optimize for latency". Call instructions don't have the right latency, and
this is more likely to introduce spills.
2. Fix the if-converter cost function. For ARM, it should use instruction
latencies, not # of micro-ops, since a multi-latency instruction is completely
executed even when the predicate is false. Also, some instructions will be
"slower" when they are predicated, due to the register def becoming an
implicit input.
rdar://8598427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8