llvm-6502/lib
Bill Wendling c806f89fda Merging r215685:
------------------------------------------------------------------------
r215685 | wschmidt | 2014-08-14 18:25:26 -0700 (Thu, 14 Aug 2014) | 69 lines

[PPC64] Add missing dependency on X2 to LDinto_toc.

The LDinto_toc pattern has been part of 64-bit PowerPC for a long
time, and represents loading from a memory location into the TOC
register (X2).  However, this pattern doesn't explicitly record that
it modifies that register.  This patch adds the missing dependency.

It was very surprising to me that this has never shown up as a problem
in the past, and that we only saw this problem recently in a single
scenario when building a self-hosted clang.  It turns out that in most
cases we have another dependency present that keeps the LDinto_toc
instruction tied in place.  LDinto_toc is used for TOC restore
following a call site, so this is a typical sequence:

   BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
   LDinto_toc 24, %X1
   ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>

Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there
is a natural anti-dependency between the two that keeps it in place.

Therefore we don't usually see a problem.  However, in one particular
case, one call is followed immediately by another call, and the second
call requires a parameter that is a TOC-relative address.  This is the
code sequence:

  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
  ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39

Note that the back-to-back stack adjustments are the same size!  The
back end is smart enough to recognize this and optimize them away:

  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39

Now there is nothing to prevent the ADDIStocHA instruction from moving
ahead of the LDinto_toc instruction, and because of the longest-path
heuristic, this is what happens.

With the accompanying patch, %X2 is represented as an implicit def:

  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1, %X2<imp-def,dead>
  ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
  ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39

So now when the two stack adjustments are removed, ADDIStocHA is
prevented from being moved above LDinto_toc.

I have not yet created a test case for this, because the original
failure occurs on a relatively large function that needs reduction.
However, this is a fairly serious bug, despite its infrequency, and I
wanted to get this patch onto the list as soon as possible so that it
can be considered for a 3.5 backport.  I'll work on whittling down a
test case.

Have we missed the boat for 3.5 at this point?

Thanks,
Bill

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215878 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 05:16:33 +00:00
..
Analysis Merging r214423: 2014-08-04 04:22:44 +00:00
AsmParser Add a dereferenceable attribute 2014-07-18 15:51:28 +00:00
Bitcode Add a dereferenceable attribute 2014-07-18 15:51:28 +00:00
CodeGen Merging r214670: 2014-08-12 05:41:11 +00:00
DebugInfo Revert "Introduce a string_ostream string builder facilty" 2014-06-26 22:52:05 +00:00
ExecutionEngine Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended. 2014-07-21 12:31:43 +00:00
IR Rename metadata llvm.loop.vectorize.unroll to llvm.loop.vectorize.interleave. 2014-07-21 23:11:03 +00:00
IRReader Update the MemoryBuffer API to use ErrorOr. 2014-07-06 17:43:13 +00:00
LineEditor
Linker Include <tuple> to make buildbots happy 2014-06-27 18:38:12 +00:00
LTO MergedLoadStoreMotion pass 2014-07-18 19:13:09 +00:00
MC [MC] Pass MCSymbolData to needsRelocateWithSymbol 2014-07-20 23:15:06 +00:00
Object Correct the ownership passing semantics of object::createBinary and make them explicit in the type system. 2014-07-21 16:26:24 +00:00
Option Generic: add range-adapter for option parsing. 2014-07-09 13:03:37 +00:00
ProfileData Update the MemoryBuffer API to use ErrorOr. 2014-07-06 17:43:13 +00:00
Support Merging r213894: 2014-07-29 23:27:06 +00:00
TableGen [TableGen] Allow shift operators to take bits<n> 2014-07-17 17:04:27 +00:00
Target Merging r215685: 2014-08-18 05:16:33 +00:00
Transforms Merging r213726: 2014-08-04 04:28:45 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile