that are generated by TableGen and are already available in
MipsGenRegisterInfo.inc. Suggested by Jakob Stoklund Olesen.
Also, fix bug in function DecodeAFGR64RegisterClass.
Patch by Vladimir Medic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158846 91177308-0d34-0410-b5e6-96231b3b80d8
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.
That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158831 91177308-0d34-0410-b5e6-96231b3b80d8
With this change, we avoid relying on the IR Builder to constant fold the operations.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158829 91177308-0d34-0410-b5e6-96231b3b80d8
There is a pretty staggering amount of this in LLVM's header files, this
is not all of the instances I'm afraid. These include all of the
functions that (in my build) are used by a non-static inline (or
external) function. Specifically, these issues were caught by the new
'-Winternal-linkage-in-inline' warning.
I'll try to just clean up the remainder of the clearly redundant "static
inline" cases on functions (not methods!) defined within headers if
I can do so in a reliable way.
There were even several cases of a missing 'inline' altogether, or my
personal favorite "static bool inline". Go figure. ;]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158800 91177308-0d34-0410-b5e6-96231b3b80d8
I'll admit I'm not entirely satisfied with this change, but it seemed
the cleanest option. Other suggestions quite welcome
The issue is that the traits specializations have static methods which
return the typedef'ed PHI_iterator type. In both the IR and MI layers
this is typedef'ed to a custom iterator class defined in an anonymous
namespace giving the types and the functions returning them internal
linkage. However, because the traits specialization is defined in the
'llvm' namespace (where it has to be, specialized template lives there),
and is in turn used in the templated implementation of the SSAUpdater.
This led to the linkage conflict that Clang now warns about.
The simplest solution to me was just to define the PHI_iterator as
a nested class inside the trait specialization. That way it still
doesn't get scoped widely, it can't be accidentally reused somewhere,
etc. This is a little gross just because nested class definitions are
a little gross, but the alternatives seem more ad-hoc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158799 91177308-0d34-0410-b5e6-96231b3b80d8
-stable-loops enables a new algorithm for generating the Loop
forest. It differs from the original algorithm in a few respects:
- Not determined by use-list order.
- Initially guarantees RPO order of block and subloops.
- Linear in the number of CFG edges.
- Nonrecursive.
I didn't want to change the LoopInfo API yet, so the block lists are
still inclusive. This seems strange to me, and it means that building
LoopInfo is not strictly linear, but it may not be a problem in
practice. At least the block lists start out in RPO order now. In the
future we may add an attribute or wrapper analysis that allows other
passes to assume RPO order.
The primary motivation of this work was not to optimize LoopInfo, but
to allow reproducing performance issues by decomposing the compilation
stages. I'm often unable to do this with the current LoopInfo, because
the loop tree order determines Loop pass order. Serializing the IR
tends to invert the order, which reverses the optimization order. This
makes it nearly impossible to debug interdependent loop optimizations
such as LSR.
I also believe this will provide more stable performance results across time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158790 91177308-0d34-0410-b5e6-96231b3b80d8
The implementation only needs inclusion from LoopInfo.cpp and
MachineLoopInfo.cpp. Clients of the interface should only include the
interface. This makes the interface readable and speeds up rebuilds
after modifying the implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158787 91177308-0d34-0410-b5e6-96231b3b80d8
llvm::RawMemoryObject handles empty ranges just fine, and the assert can
be triggered in the wild by e.g. invoking clang with a file that
included an empty pre-compiled header file when clang has been built
with assertions enabled. Without assertions enabled, clang will properly
report that the empty file is not a valid PCH.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158769 91177308-0d34-0410-b5e6-96231b3b80d8
When LiveIntervals is tracking fixed interference in regunits, make sure
to update those intervals as well. Currently guarded by -live-regunits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158766 91177308-0d34-0410-b5e6-96231b3b80d8
ensureAlignment() in MachineFunction). Also, drop setMaxAlignment() in
favor of this new function. This creates a main entry point to setting
MaxAlignment, which will be helpful for future work. No functionality
change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158758 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
OR
UnsafeFPMath option (-enable-unsafe-fp-math)
are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).
If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.
This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.
This is related to PR5997.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158743 91177308-0d34-0410-b5e6-96231b3b80d8
StringMap suffered from the same bug as DenseMap: when you explicitly
construct it with a small number of buckets, you can arrange for the
tombstone-based growth path to be followed when the number of buckets
was less than '8'. In that case, even with a full map, it would compare
'0' as not less than '0', and refuse to grow the table, leading to
inf-loops trying to find an empty bucket on the next insertion. The fix
is very simple: use '<=' as the comparison. The same fix was applied to
DenseMap as well during its recent refactoring.
Thanks to Alex Bolz for the great report and test case. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158725 91177308-0d34-0410-b5e6-96231b3b80d8
The condition code didn't actually matter for arm "b" instructions,
unlike "bl". It should just use the R_ARM_JUMP24 reloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158722 91177308-0d34-0410-b5e6-96231b3b80d8
For processors with the G5-like instruction-grouping scheme, this helps avoid
early group termination due to a write-after-write dependency within the group.
It should also help on pipelined embedded cores.
On POWER7, over the test suite, this gives an average 0.5% speedup. The largest
speedups are:
SingleSource/Benchmarks/Stanford/Quicksort - 33%
MultiSource/Applications/d/make_dparser - 21%
MultiSource/Benchmarks/FreeBench/analyzer/analyzer - 12%
MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft - 12%
Largest slowdowns:
SingleSource/Benchmarks/Stanford/Bubblesort - 23%
MultiSource/Benchmarks/Prolangs-C++/city/city - 21%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 16%
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode - 13%
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158719 91177308-0d34-0410-b5e6-96231b3b80d8
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.
Add a command line option to llc that enables it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8
Original commit msg:
add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158688 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
This will be needed for some upcoming PowerPC itineraries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158679 91177308-0d34-0410-b5e6-96231b3b80d8
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFI, WFI, SEV and YIELD to be
assembly aliases of that.
rdar://11600518
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158674 91177308-0d34-0410-b5e6-96231b3b80d8
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.
<rdar://problem/11481151>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158659 91177308-0d34-0410-b5e6-96231b3b80d8
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.
Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158654 91177308-0d34-0410-b5e6-96231b3b80d8
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.
In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158607 91177308-0d34-0410-b5e6-96231b3b80d8
The present implementation handles only TBAA and FP metadata, discarding everything else.
For debug metadata, the current behavior is maintained (the debug metadata associated with
one of the instructions will be kept, discarding that attached to the other).
This should address PR 13040.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158606 91177308-0d34-0410-b5e6-96231b3b80d8