From c9c0b59402ebb0ca2ee25d418c657892410239f2 Mon Sep 17 00:00:00 2001 From: Gordon Henriksen Date: Mon, 2 Mar 2009 03:47:20 +0000 Subject: [PATCH] Make some improvements to the GC docs. Also, drop reference to the half-baked runtime interface. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65802 91177308-0d34-0410-b5e6-96231b3b80d8 --- docs/GarbageCollection.html | 542 +++++++++++++++--------------------- 1 file changed, 219 insertions(+), 323 deletions(-) diff --git a/docs/GarbageCollection.html b/docs/GarbageCollection.html index 8a303df37b9..c2236d4d81a 100644 --- a/docs/GarbageCollection.html +++ b/docs/GarbageCollection.html @@ -20,19 +20,15 @@
  1. Introduction
  2. -
  3. Using the collectors +
  4. Getting started
  5. @@ -51,20 +47,7 @@ -
  6. Recommended runtime interface - -
  7. - -
  8. Implementing a collector plugin +
  9. Compiler plugin interface
    • Overview of available features
    • Computing stack maps
    • @@ -145,7 +128,7 @@ support accurate garbage collection.

      @@ -168,174 +151,234 @@ collector models. For instance, the intrinsics permit:

      support a broad class of garbage collected languages including Scheme, ML, Java, C#, Perl, Python, Lua, Ruby, other scripting languages, and more.

      -

      However, LLVM does not itself implement a garbage collector. This is because -collectors are tightly coupled to object models, and LLVM is agnostic to object -models. Since LLVM is agnostic to object models, it would be inappropriate for -LLVM to dictate any particular collector. Instead, LLVM provides a framework for -garbage collector implementations in two manners:

      +

      However, LLVM does not itself provide a garbage collector; this should +be part of your language's runtime library. LLVM provides a framework for +compile-time code generation plugins. The role of these +plugins is to generate code and data structures which conform to the binary +interface specified by the runtime library. This is similar to the +relationship between LLVM and DWARF debugging info, for example. The +difference primarily lies in the lack of an established standard in the domain +of garbage collection, hence the plugins.

      + +

      The aspects of the binary interface with which LLVM's GC support is +concerned are:

        -
      • At compile time with collector plugins for - the compiler. Collector plugins have ready access to important garbage - collector algorithms. Leveraging these tools, it is straightforward to - emit type-accurate stack maps for your runtime in as little as ~100 lines of - C++ code.
      • - -
      • At runtime with suggested runtime - interfaces, which allow front-end compilers to support a range of - collection runtimes.
      • +
      • Creation of GC-safe points within code where collection is allowed to + execute safely.
      • +
      • Definition of a stack frame descriptor. For each safe point in the code, + a frame descriptor maps where object references are located within the + frame so that the GC may traverse and perhaps update them.
      • +
      • Write barriers when storing object references within the heap. These + are commonly used to optimize incremental scans.
      • +
      • Emission of read barriers when loading object references. These are + useful for interoperating with concurrent collectors.
      +

      There are additional areas that LLVM does not directly address:

      + +
        +
      • Registration of global roots.
      • +
      • Discovery or registration of stack frame descriptors.
      • +
      • The functions used by the program to allocate memory, trigger a + collection, etc.
      • +
      + +

      In general, LLVM's support for GC does not include features which can be +adequately addressed with other features of the IR and does not specify a +particular binary interface. On the plus side, this means that you should be +able to integrate LLVM with an existing runtime. On the other hand, it leaves +a lot of work for the developer of a novel language. However, it's easy to get +started quickly and scale up to a more sophisticated implementation as your +compiler matures.

      +
      -

      In general, using a collector implies:

      +

      Using a GC with LLVM implies many things, for example:

        -
      • Emitting compatible code, including initialization in the main - program if necessary.
      • -
      • Loading a compiler plugin if the collector is not statically linked with - your compiler. For llc, use the -load option.
      • -
      • Selecting the collection algorithm by applying the gc "..." - attribute to your garbage collected functions, or equivalently with - the setGC method.
      • -
      • Linking your final executable with the garbage collector runtime.
      • +
      • Write a runtime library or find an existing one which implements a GC + heap.
          +
        1. Implement a memory allocator.
        2. +
        3. Design a binary interface for frame descriptors, used to identify + references within a stack frame.*
        4. +
        5. Implement a stack crawler to discover functions on the call stack.*
        6. +
        7. Implement a registry for global roots.
        8. +
        9. Design a binary interface for type descriptors, used to map references + within heap objects.
        10. +
        11. Implement a collection routine bringing together all of the above.
        12. +
      • +
      • Emit compatible code from your compiler.
          +
        • Initialization in the main function.
        • +
        • Use the gc "..." attribute to enable GC code generation + (or F.setGC("...")).
        • +
        • Use @llvm.gcroot to mark stack roots.
        • +
        • Use @llvm.gcread and/or @llvm.gcwrite to + manipulate GC references, if necessary.
        • +
        • Allocate memory using the GC allocation routine provided by the + runtime library.
        • +
        • Generate type descriptors according to your runtime's binary interface.
        • +
      • +
      • Write a compiler plugin to interface LLVM with the runtime library.*
          +
        • Lower @llvm.gcread and @llvm.gcwrite to appropriate + code sequences.*
        • +
        • Generate stack maps according to the runtime's binary interface.*
        • +
      • +
      • Load the plugin into the compiler. Use llc -load or link the + plugin statically with your language's compiler.*
      • +
      • Link program executables with the runtime.
      -

      This table summarizes the available runtimes.

      - - - - - - - - - - - - - - - - - - - - - - - - - - -
      Collectorgc attributeLinkagegcrootgcreadgcwrite
      SemiSpacegc "shadow-stack"TODO FIXMErequiredoptionaloptional
      Ocamlgc "ocaml"provided by ocamloptrequiredoptionaloptional
      - -

      The sections for Collection intrinsics and -Recommended runtime interface detail the interfaces that -collectors may require user programs to utilize.

      +

      To help with several of these tasks (those indicated with a *), LLVM +includes a highly portable, built-in ShadowStack code generator. It is compiled +into llc and works even with the interpreter and C backends.

      -
      - Collector *llvm::createShadowStackCollector(); -
      -
      -

      The ShadowStack backend is invoked with the gc "shadow-stack" -function attribute. -Unlike many collectors which rely on a cooperative code generator to generate -stack maps, this algorithm carefully maintains a linked list of stack root -descriptors [Henderson2002]. This so-called "shadow -stack" mirrors the machine stack. Maintaining this data structure is slower -than using stack maps, but has a significant portability advantage because it -requires no special support from the target code generator.

      +

      To turn the shadow stack on for your functions, first call:

      -

      The ShadowStack collector does not use read or write barriers, so the user -program may use load and store instead of llvm.gcread -and llvm.gcwrite.

      +
      F.setGC("shadow-stack");
      -

      ShadowStack is a code generator plugin only. It must be paired with a -compatible runtime.

      +

      for each function your compiler emits. Since the shadow stack is built into +LLVM, you do not need to load a plugin.

      + +

      Your compiler must also use @llvm.gcroot as documented. +Don't forget to create a root for each intermediate value that is generated +when evaluating an expression. In h(f(), g()), the result of +f() could easily be collected if evaluating g() triggers a +collection.

      + +

      There's no need to use @llvm.gcread and @llvm.gcwrite over +plain load and store for now. You will need them when +switching to a more advanced GC.

      -

      The SemiSpace runtime implements the suggested -runtime interface and is compatible with the ShadowStack backend.

      +

      The shadow stack doesn't imply a memory allocation algorithm. A semispace +collector or building atop malloc are great places to start, and can +be implemented with very little code.
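For instance, the allocation side of a semispace-style heap can be little more than a bump pointer over a block obtained from malloc. The sketch below is a hypothetical starting point, not an LLVM API: the names BumpHeap, allocate, and bytesUsed, the 8-byte alignment, and the multiple-of-8 heap size are all assumptions made for illustration. It returns NULL on exhaustion, which is the point where a real runtime would trigger a collection.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical bump-pointer heap; nothing here is provided by LLVM.
class BumpHeap {
  char *Begin; // Start of the block obtained from malloc.
  char *Cur;   // Next free byte.
  char *End;   // One past the last usable byte.

  BumpHeap(const BumpHeap &);            // Not copyable.
  BumpHeap &operator=(const BumpHeap &); // Not assignable.

public:
  explicit BumpHeap(std::size_t Bytes)
      : Begin(static_cast<char *>(std::malloc(Bytes))), Cur(Begin),
        End(Begin ? Begin + Bytes : Begin) {
    assert(Bytes % 8 == 0 && "this sketch assumes a multiple-of-8 heap size");
  }

  ~BumpHeap() { std::free(Begin); }

  // Returns zeroed memory aligned to 8 bytes, or NULL when the block is
  // exhausted (where a collector would normally run).
  void *allocate(std::size_t Size) {
    if (!Begin)
      return NULL;
    std::size_t Rounded = (Size + 7) & ~static_cast<std::size_t>(7);
    if (Rounded < Size || Rounded > static_cast<std::size_t>(End - Cur))
      return NULL;
    void *Result = Cur;
    Cur += Rounded;
    std::memset(Result, 0, Rounded);
    return Result;
  }

  std::size_t bytesUsed() const { return static_cast<std::size_t>(Cur - Begin); }
};
```

A semispace collector would keep two such blocks and, on exhaustion, copy live objects into the other one before resuming allocation there.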

      -

      SemiSpace is a very simple copying collector. When it starts up, it -allocates two blocks of memory for the heap. It uses a simple bump-pointer -allocator to allocate memory from the first block until it runs out of space. -When it runs out of space, it traces through all of the roots of the program, -copying blocks to the other half of the memory space.

      - -

      This runtime is highly experimental and has not been used in a real project. -Enhancements would be welcomed.

      +

      When it comes time to collect, however, your runtime needs to traverse the +stack roots, and for this it needs to integrate with the shadow stack. Luckily, +doing so is very simple. (This code is heavily commented to help you +understand the data structure, but there are only 20 lines of meaningful +code.)

      +
      /// @brief A constant shadow stack frame descriptor. The compiler emits one of
      +///        these for each function.
      +/// 
      +/// Storage of metadata values is elided if the %meta parameter to @llvm.gcroot
      +/// is null.
      +struct FrameMap {
      +  int32_t NumRoots;    //< Number of roots in stack frame.
      +  int32_t NumMeta;     //< Number of metadata descriptors. May be < NumRoots.
      +  const void *Meta[0]; //< Metadata for each root.
      +};
      +
      +/// @brief A link in the dynamic shadow stack. One of these is embedded in the
      +///        stack frame of each function on the call stack.
      +struct StackEntry {
      +  StackEntry *Next;    //< Link to next stack entry (the caller's).
      +  const FrameMap *Map; //< Pointer to constant FrameMap.
      +  void *Roots[0];      //< Stack roots (in-place array).
      +};
      +
      +/// @brief The head of the singly-linked list of StackEntries. Functions push
      +///        and pop onto this in their prologue and epilogue.
      +/// 
      +/// Since there is only a global list, this technique is not threadsafe.
      +StackEntry *llvm_gc_root_chain;
      +
      +/// @brief Calls Visitor(root, meta) for each GC root on the stack.
      +///        root and meta are exactly the values passed to
      +///        @llvm.gcroot.
      +/// 
      +/// Visitor could be a function to recursively mark live objects. Or it
      +/// might copy them to another heap or generation.
      +/// 
      +/// @param Visitor A function to invoke for every GC root on the stack.
      +void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) {
      +  for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
      +    unsigned i = 0;
      +    
      +    // For roots [0, NumMeta), the metadata pointer is in the FrameMap.
      +    for (unsigned e = R->Map->NumMeta; i != e; ++i)
      +      Visitor(&R->Roots[i], R->Map->Meta[i]);
      +    
      +    // For roots [NumMeta, NumRoots), the metadata pointer is null.
      +    for (unsigned e = R->Map->NumRoots; i != e; ++i)
      +      Visitor(&R->Roots[i], NULL);
      +  }
      +}
      + -
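As a rough usage sketch (not part of LLVM), a runtime could hand visitGCRoots a visitor that records or marks each root. The snippet repeats the declarations above so that it compiles on its own, then builds a single shadow-stack frame by hand. That hand-built frame is purely an assumption for illustration; in a real program the frames are emitted by the shadow-stack code generator. SeenRoot, recordRoot, and demoCountRoots are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <stdint.h>
#include <vector>

// Declarations repeated from the listing above. Zero-length trailing arrays
// are a compiler extension (accepted by GCC and Clang).
struct FrameMap {
  int32_t NumRoots;
  int32_t NumMeta;
  const void *Meta[0];
};

struct StackEntry {
  StackEntry *Next;
  const FrameMap *Map;
  void *Roots[0];
};

StackEntry *llvm_gc_root_chain = NULL;

void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) {
  for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
    unsigned i = 0;
    for (unsigned e = R->Map->NumMeta; i != e; ++i)
      Visitor(&R->Roots[i], R->Map->Meta[i]);
    for (unsigned e = R->Map->NumRoots; i != e; ++i)
      Visitor(&R->Roots[i], NULL);
  }
}

// Hypothetical visitor: remembers every root slot and its metadata.
struct SeenRoot {
  void **Slot;
  const void *Meta;
};
static std::vector<SeenRoot> Seen;

static void recordRoot(void **Root, const void *Meta) {
  SeenRoot S = {Root, Meta};
  Seen.push_back(S);
}

// Builds one frame with two roots (only the first has metadata), walks the
// chain, and returns the number of roots the visitor saw.
std::size_t demoCountRoots() {
  static const char Tag[] = "tag";
  FrameMap *Map = static_cast<FrameMap *>(
      std::malloc(sizeof(FrameMap) + 1 * sizeof(void *)));
  StackEntry *Entry = static_cast<StackEntry *>(
      std::malloc(sizeof(StackEntry) + 2 * sizeof(void *)));
  assert(Map && Entry);

  Map->NumRoots = 2;
  Map->NumMeta = 1;
  Map->Meta[0] = Tag;

  int A = 1, B = 2; // Stand-ins for heap objects.
  Entry->Map = Map;
  Entry->Roots[0] = &A;
  Entry->Roots[1] = &B;

  Entry->Next = llvm_gc_root_chain; // What a function prologue would do.
  llvm_gc_root_chain = Entry;

  Seen.clear();
  visitGCRoots(recordRoot);

  llvm_gc_root_chain = Entry->Next; // What a function epilogue would do.

  std::size_t Count = Seen.size();
  assert(Count == 2 && Seen[0].Meta == Tag && Seen[1].Meta == NULL);
  assert(*Seen[0].Slot == &A && *Seen[1].Slot == &B);

  Seen.clear();
  std::free(Entry);
  std::free(Map);
  return Count;
}
```

Because the visitor receives the address of each root slot rather than its value, a copying collector could use the same callback shape to update roots after moving objects.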
      - Collector *llvm::createOcamlCollector(); -
      -
      -

      The ocaml backend is invoked with the gc "ocaml" function attribute. -It supports the -Objective Caml language runtime by emitting -a type-accurate stack map in the form of an ocaml 3.10.0-compatible frametable. -The linkage requirements are satisfied automatically by the ocamlopt -compiler when linking an executable.

      +

      Unlike many GC algorithms which rely on a cooperative code generator to +generate stack maps, this algorithm carefully maintains a linked list of stack +root descriptors [Henderson2002]. This so-called +"shadow stack" mirrors the machine stack. Maintaining this data structure is +slower than using stack maps, but has a significant portability advantage +because it requires no special support from the target code generator.

      -

      The ocaml collector does not use read or write barriers, so the user program -may use load and store instead of llvm.gcread and -llvm.gcwrite.

      +

      The tradeoff for this simplicity and portability is:

      + +
        +
      • High overhead per function call.
      • +
      • Not thread-safe.
      • +
      + +

      Still, it's an easy way to get started.

      -

      This section describes the garbage collection facilities provided by the -LLVM intermediate representation.

      +LLVM intermediate representation. The exact behavior +of these IR features is specified by the binary interface implemented by a +code generation plugin, not by this document.

      -

      These facilities are limited to those strictly necessary for compilation. -They are not intended to be a complete interface to any garbage collector. -Notably, heap allocation is not among the supplied primitives. A user program -will also need to interface with the runtime, using either the -suggested runtime interface or another interface -specified by the runtime.

      +

      These facilities are limited to those strictly necessary; they are not +intended to be a complete interface to any garbage collector. A program will +need to interface with the GC library using the facilities provided by that +library.

      @@ -345,17 +388,22 @@ specified by the runtime.

      - define ty @name(...) gc "collector" { ... + define ty @name(...) gc "name" { ...
      -

      The gc function attribute is used to specify the desired collector -algorithm to the compiler. It is equivalent to specifying the collector name -programmatically using the setGC method of Function.

      +

      The gc function attribute is used to specify the desired GC style +to the compiler. Its programmatic equivalent is the setGC method of +Function.

      -

      Specifying the collector on a per-function basis allows LLVM to link together -programs that use different garbage collection algorithms.

      +

      Setting gc "name" on a function triggers a search for a +matching code generation plugin "name"; it is that plugin which defines +the exact nature of the code generated to support GC. If none is found, the +compiler will raise an error.

      + +

      Specifying the GC style on a per-function basis allows LLVM to link together +programs that use different garbage collection algorithms (or none at all).

      @@ -370,13 +418,31 @@ programs that use different garbage collection algorithms.

      -

      The llvm.gcroot intrinsic is used to inform LLVM of a pointer -variable on the stack. The first argument must be a value referring to an alloca instruction +

      The llvm.gcroot intrinsic is used to inform LLVM that a stack +variable references an object on the heap and is to be tracked for garbage +collection. The exact impact on generated code is specified by a compiler plugin.

      + +

      A compiler which uses mem2reg to raise imperative code using alloca +into SSA form need only add a call to @llvm.gcroot for those variables +which are pointers into the GC heap.

      + +

      It is also important to mark intermediate values with llvm.gcroot. +For example, consider h(f(), g()). Beware leaking the result of +f() in the case that g() triggers a collection.

      + +

      The first argument must be a value referring to an alloca instruction or a bitcast of an alloca. The second contains a pointer to metadata that should be associated with the pointer, and must be a constant or global value address. If your target collector uses tags, use a null pointer for metadata.

      +

      The %metadata argument can be used to avoid requiring heap objects +to have 'isa' pointers or tag bits. [Appel89, Goldberg91, Tolmach94] If +specified, its value will be tracked along with the location of the pointer in +the stack frame.
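To illustrate how root metadata can stand in for object headers, the following sketch traces a headerless object using only a descriptor recorded alongside the root. Everything in it is hypothetical: TypeDesc, Pair, PairDesc, traceRoot, and the idea of encoding reference fields as byte offsets are assumptions about what a front end and runtime might agree on, not something LLVM prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical descriptor a front end might pass as the metadata argument.
struct TypeDesc {
  std::size_t NumPtrFields;
  std::size_t PtrOffsets[2]; // Byte offsets of the reference fields.
};

// A heap object with no 'isa' pointer and no tag bits.
struct Pair {
  int Value;
  Pair *Left;
  Pair *Right;
};

static const TypeDesc PairDesc = {
    2, {offsetof(Pair, Left), offsetof(Pair, Right)}};

static std::vector<void *> Reached;

// Visitor taking (root, meta): the object's layout comes entirely from the
// metadata stored with the root, never from the object itself.
void traceRoot(void **Root, const void *Meta) {
  if (*Root == NULL || Meta == NULL)
    return;
  const TypeDesc *TD = static_cast<const TypeDesc *>(Meta);
  char *Obj = static_cast<char *>(*Root);
  Reached.push_back(Obj);
  for (std::size_t i = 0; i != TD->NumPtrFields; ++i) {
    void *Child;
    std::memcpy(&Child, Obj + TD->PtrOffsets[i], sizeof(Child));
    if (Child)
      Reached.push_back(Child);
  }
}
```

A full tracer would recurse into each child, which requires knowing the child's type as well; the descriptor would then need to carry descriptors for its fields.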

      +

      Consider the following fragment of Java code:

      @@ -449,6 +515,11 @@ for completeness. In this snippet, %object is the object pointer, and
           ;; Compute the derived pointer.
           %derived = getelementptr %object, i32 0, i32 2, i32 %n
      +

      The use of these intrinsics is naturally optional if the target GC does +not require the corresponding barrier. In that case, the GC plugin will replace +any intrinsic calls that are present with the corresponding load or store +instruction.

      +
      @@ -464,16 +535,13 @@ void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived)

      For write barriers, LLVM provides the llvm.gcwrite intrinsic function. It has exactly the same semantics as a non-volatile store to -the derived pointer (the third argument).

      +the derived pointer (the third argument). The exact code generated is specified +by a compiler plugin.

      Many important algorithms require write barriers, including generational and concurrent collectors. Additionally, write barriers could be used to implement reference counting.
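To make the idea concrete, here is one barrier a generational collector might use: card marking. This is only a sketch of what a plugin could arrange for llvm.gcwrite to do. CardTable, writeBarrier, the 512-byte card size, and the fixed heap bounds are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Hypothetical card table: one dirty byte per 512-byte "card" of heap.
class CardTable {
  enum { CardShift = 9 }; // 2^9 = 512 bytes per card.
  uintptr_t HeapBase;
  std::vector<unsigned char> Cards;

public:
  CardTable(void *Base, std::size_t HeapBytes)
      : HeapBase(reinterpret_cast<uintptr_t>(Base)),
        Cards((HeapBytes >> CardShift) + 1, 0) {}

  // Marks the card containing Addr as possibly holding a new reference.
  void markDirty(void *Addr) {
    std::size_t Index =
        (reinterpret_cast<uintptr_t>(Addr) - HeapBase) >> CardShift;
    assert(Index < Cards.size() && "address outside the heap");
    Cards[Index] = 1;
  }

  bool isDirty(std::size_t Index) const { return Cards[Index] != 0; }
};

// Same parameter order as @llvm.gcwrite(value, object, derived): perform the
// store, then record that the card holding the slot must be rescanned.
inline void writeBarrier(CardTable &CT, void *Value, void *Object,
                         void **Derived) {
  (void)Object; // Unused by this particular barrier.
  *Derived = Value;
  CT.markDirty(Derived);
}
```

At the next minor collection, such a collector would scan only the dirty cards for pointers into the young generation instead of walking the entire old generation.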

      -

      The use of this intrinsic is optional if the target collector does use -write barriers. If so, the collector will replace it with the corresponding -store.

      - @@ -489,124 +557,15 @@ i8* @llvm.gcread(i8* %object, i8** %derived)

      For read barriers, LLVM provides the llvm.gcread intrinsic function. It has exactly the same semantics as a non-volatile load from the -derived pointer (the second argument).

      +derived pointer (the second argument). The exact code generated is specified by +a compiler plugin.

      Read barriers are needed by fewer algorithms than write barriers, and may have a greater performance impact since pointer reads are more frequent than writes.

      -

      As with llvm.gcwrite, a target collector might not require the use -of this intrinsic.

      - - - - - -
      - -

      LLVM specifies the following recommended runtime interface to the garbage -collection at runtime. A program should use these interfaces to accomplish the -tasks not supported by the intrinsics.

      - -

      Unlike the intrinsics, which are integral to LLVM's code generator, there is -nothing unique about these interfaces; a front-end compiler and runtime are free -to agree to a different specification.

      - -

      Note: This interface is a work in progress.

      - -
      - - - - -
      - -
      - void llvm_gc_initialize(unsigned InitialHeapSize); -
      - -

      -The llvm_gc_initialize function should be called once before any other -garbage collection functions are called. This gives the garbage collector the -chance to initialize itself and allocate the heap. The initial heap size to -allocate should be specified as an argument. -

      - -
      - - - - -
      - -
      - void *llvm_gc_allocate(unsigned Size); -
      - -

      The llvm_gc_allocate function is a global function defined by the -garbage collector implementation to allocate memory. It returns a -zeroed-out block of memory of the specified size, sufficiently aligned to store -any object.

      - -
      - - - - -
      - -
      - void llvm_gc_collect(); -
      - -

      -The llvm_gc_collect function is exported by the garbage collector -implementations to provide a full collection, even when the heap is not -exhausted. This can be used by end-user code as a hint, and may be ignored by -the garbage collector. -

      - -
      - - - - -
      -
      - void llvm_cg_walk_gcroots(void (*FP)(void **Root, void *Meta)); -
      - -

      -The llvm_cg_walk_gcroots function is a function provided by the code -generator that iterates through all of the GC roots on the stack, calling the -specified function pointer with each record. For each GC root, the address of -the pointer and the meta-data (from the llvm.gcroot intrinsic) are provided. -

      -
      - - - - -
      -TODO -
      - -
      Implementing a collector plugin @@ -628,8 +587,9 @@ might be accomplished in as few as 100 LOC.

      This is not the appropriate place to implement a garbage collected heap or a garbage collector itself. That code should exist in the language's runtime -library. The compiler plugin is responsible for generating code which is -compatible with that runtime library.

      library. The compiler plugin is responsible for generating code which +conforms to the binary interface defined by the library, most essentially the +stack map.

      To subclass llvm::GCStrategy and register it with the compiler:

      @@ -1203,7 +1163,7 @@ yet computed.)

      Since AsmWriter and CodeGen are separate components of LLVM, a separate abstract base class and registry is provided for printing assembly code, the -GCMetadaPrinter and GCMetadaPrinterRegistry. The AsmWriter +GCMetadataPrinter and GCMetadataPrinterRegistry. The AsmWriter will look for such a subclass if the GCStrategy sets UsesMetadata:

      @@ -1337,70 +1297,6 @@ void MyGCPrinter::finishAssembly(std::ostream &OS, AsmPrinter &AP,
      - - - - -
      - -

      Implementing a garbage collector for LLVM is fairly straightforward. The -LLVM garbage collectors are provided in a form that makes them easy to link into -the language-specific runtime that a language front-end would use. They require -functionality from the language-specific runtime to get information about where pointers are located in heap objects.

      - -

      The implementation must include the -llvm_gc_allocate and -llvm_gc_collect functions. To do this, it will -probably have to trace through the roots -from the stack and understand the GC descriptors -for heap objects. Luckily, there are some example -implementations available. -

      -
      - - - - - -
      -

      -The three most common ways to keep track of where pointers live in heap objects -are (listed in order of space overhead required):

      - -
        -
      1. In languages with polymorphic objects, pointers from an object header are -usually used to identify the GC pointers in the heap object. This is common for -object-oriented languages like Self, Smalltalk, Java, or C#.
      2. - -
      3. If heap objects are not polymorphic, often the "shape" of the heap can be -determined from the roots of the heap or from some other meta-data [Appel89, Goldberg91, Tolmach94]. In this case, the garbage collector can -propagate the information around from meta data stored with the roots. This -often eliminates the need to have a header on objects in the heap. This is -common in the ML family.
      4. - -
      5. If all heap objects have pointers in the same locations, or pointers can be -distinguished just by looking at them (e.g., the low order bit is clear), no -book-keeping is needed at all. This is common for Lisp-like languages.
      6. -
      - -

      The LLVM garbage collectors are capable of supporting all of these styles of -language, including ones that mix various implementations. To do this, it -allows the source-language to associate meta-data with the stack roots, and the heap tracing routines can propagate the -information. In addition, LLVM allows the front-end to extract GC information -in any form from a specific object pointer (this supports situations #1 and #3). -

      - -
      - -