11 Commits

Author SHA1 Message Date
Andrew Trick
6a7770b7ae Enable MI Sched for x86.
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192750 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-15 23:33:07 +00:00
Andrew Trick
6ea2b9608a Allocate local registers in order for optimal coloring.
Also avoid locals evicting locals just because they want a cheaper register.

Problem: MI Sched knows exactly how many registers we have and assumes
they can be colored. In cases where we have large blocks, usually from
unrolled loops, greedy coloring fails. This is a source of
"regressions" from the MI Scheduler on x86. I noticed this issue on
x86 where we have long chains of two-address defs in the same live
range. It's easy to see this in matrix multiplication benchmarks like
IRSmk and even the unit test misched-matmul.ll.

A fundamental difference between the LLVM register allocator and
conventional graph coloring is that in our model a live range can't
discover its neighbors, it can only verify its neighbors. That's why
we initially went for greedy coloring and added eviction to deal with
the hard cases. However, for singly defined and two-address live
ranges, we can optimally color without visiting neighbors simply by
processing the live ranges in instruction order.

Other beneficial side effects:

It is much easier to understand and debug regalloc for large blocks
when the live ranges are allocated in order. Yes, global allocation is
still very confusing, but it's nice to be able to comprehend what
happened locally.

Heuristics could be added to bias register assignment based on
instruction locality (think late register pairing, banks...).

Intuituvely this will make some test cases that are on the threshold
of register pressure more stable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187139 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-25 18:35:14 +00:00
Andrew Trick
b2b5dc642c Revert "Temporarily enable MI-Sched on X86."
This reverts commit 98a9b72e8c56dc13a2617de84503a3d78352789c.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184823 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-25 02:48:58 +00:00
Andrew Trick
98a9b72e8c Temporarily enable MI-Sched on X86.
Sorry for the unit test churn. I'll try to make the change permanently
next time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184705 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-24 09:13:20 +00:00
Benjamin Kramer
3a36cb0805 Force a triple so we don't get bitten by windows' different regalloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182935 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-30 15:39:35 +00:00
Benjamin Kramer
051ed0ae2d Force fragile test to the atom scheduler model.
The pattern the test originally checked for doesn't occur on other -mcpu
settings. On atom it's still there though slightly differently scheduled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182933 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-30 15:22:28 +00:00
Tim Northover
5b1552548a X86: allow registers 8-15 in test
This test was failing on some hosts when an unexpected register was used for a
variable. This just extends the regexp to allow the new x86-64 registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182929 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-30 13:56:32 +00:00
Tim Northover
15983b80a0 X86: use sub-register sequences for MOV*r0 operations
Instead of having a bunch of separate MOV8r0, MOV16r0, ... pseudo-instructions,
it's better to use a single MOV32r0 (which will expand to "xorl %reg, %reg")
and obtain other sizes with EXTRACT_SUBREG and SUBREG_TO_REG. The encoding is
smaller and partial register updates can sometimes be avoided.

Until recently, this sequence was a barrier to rematerialization though. That
should now be fixed so it's an appropriate time to make the change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182928 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-30 13:19:42 +00:00
Andrew Trick
07269265b0 misched: tag a few XFAILs that I plan to fix
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153222 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-21 22:31:31 +00:00
Eric Christopher
7c2cdb1c05 Turn on list-ilp scheduling by default on x86 and x86-64, fix up
testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.

Performance results:

roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated

john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES

Small compile time impact.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127208 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-08 02:42:25 +00:00
Bill Wendling
fe633f0ed6 Testcase for r105741.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105750 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-09 20:30:22 +00:00