diff --git a/docs/GarbageCollection.html b/docs/GarbageCollection.html index 8a303df37b9..c2236d4d81a 100644 --- a/docs/GarbageCollection.html +++ b/docs/GarbageCollection.html @@ -20,19 +20,15 @@
However, LLVM does not itself implement a garbage collector. This is because -collectors are tightly coupled to object models, and LLVM is agnostic to object -models. Since LLVM is agnostic to object models, it would be inappropriate for -LLVM to dictate any particular collector. Instead, LLVM provides a framework for -garbage collector implementations in two manners:
+However, LLVM does not itself provide a garbage collectorthis should +be part of your language's runtime library. LLVM provides a framework for +compile time code generation plugins. The role of these +plugins is to generate code and data structures which conforms to the binary +interface specified by the runtime library. This is similar to the +relationship between LLVM and DWARF debugging info, for example. The +difference primarily lies in the lack of an established standard in the domain +of garbage collectionthus the plugins.
+ +The aspects of the binary interface with which LLVM's GC support is +concerned are:
There are additional areas that LLVM does not directly address:
+ +In general, LLVM's support for GC does not include features which can be +adequately addressed with other features of the IR and does not specify a +particular binary interface. On the plus side, this means that you should be +able to integrate LLVM with an existing runtime. On the other hand, it leaves +a lot of work for the developer of a novel language. However, it's easy to get +started quickly and scale up to a more sophisticated implementation as your +compiler matures.
+In general, using a collector implies:
+Using a GC with LLVM implies many things, for example:
This table summarizes the available runtimes.
- -Collector | -gc attribute | -Linkage | -gcroot | -gcread | -gcwrite | -
---|---|---|---|---|---|
SemiSpace | -gc "shadow-stack" | -TODO FIXME | -required | -optional | -optional | -
Ocaml | -gc "ocaml" | -provided by ocamlopt | -required | -optional | -optional | -
The sections for Collection intrinsics and -Recommended runtime interface detail the interfaces that -collectors may require user programs to utilize.
+To help with several of these tasks (those indicated with a *), LLVM +includes a highly portable, built-in ShadowStack code generator. It is compiled +into llc and works even with the interpreter and C backends.
The ShadowStack backend is invoked with the gc "shadow-stack" -function attribute. -Unlike many collectors which rely on a cooperative code generator to generate -stack maps, this algorithm carefully maintains a linked list of stack root -descriptors [Henderson2002]. This so-called "shadow -stack" mirrors the machine stack. Maintaining this data structure is slower -than using stack maps, but has a significant portability advantage because it -requires no special support from the target code generator.
+To turn the shadow stack on for your functions, first call:
-The ShadowStack collector does not use read or write barriers, so the user -program may use load and store instead of llvm.gcread -and llvm.gcwrite.
+F.setGC("shadow-stack");
ShadowStack is a code generator plugin only. It must be paired with a -compatible runtime.
+for each function your compiler emits. Since the shadow stack is built into +LLVM, you do not need to load a plugin.
+ +Your compiler must also use @llvm.gcroot as documented. +Don't forget to create a root for each intermediate value that is generated +when evaluating an expression. In h(f(), g()), the result of +f() could easily be collected if evaluating g() triggers a +collection.
+ +There's no need to use @llvm.gcread and @llvm.gcwrite over +plain load and store for now. You will need them when +switching to a more advanced GC.
The SemiSpace runtime implements the suggested -runtime interface and is compatible with the ShadowStack backend.
+The shadow stack doesn't imply a memory allocation algorithm. A semispace +collector or building atop malloc are great places to start, and can +be implemented with very little code.
-SemiSpace is a very simple copying collector. When it starts up, it -allocates two blocks of memory for the heap. It uses a simple bump-pointer -allocator to allocate memory from the first block until it runs out of space. -When it runs out of space, it traces through all of the roots of the program, -copying blocks to the other half of the memory space.
- -This runtime is highly experimental and has not been used in a real project. -Enhancements would be welcomed.
+When it comes time to collect, however, your runtime needs to traverse the +stack roots, and for this it needs to integrate with the shadow stack. Luckily, +doing so is very simple. (This code is heavily commented to help you +understand the data structure, but there are only 20 lines of meaningful +code.)
/// @brief A constant shadow stack frame descriptor. The compiler emits one of +/// these for each function. +/// +/// Storage of metadata values is elided if the %meta parameter to @llvm.gcroot +/// is null. +struct FrameMap { + int32_t NumRoots; //< Number of roots in stack frame. + int32_t NumMeta; //< Number of metadata descriptors. May be < NumRoots. + const void *Meta[0]; //< Metadata for each root. +}; + +/// @brief A link in the dynamic shadow stack. One of these is embedded in the +/// stack frame of each function on the call stack. +struct StackEntry { + StackEntry *Next; //< Link to next stack entry (the caller's). + const FrameMap *Map; //< Pointer to constant FrameMap. + void *Roots[0]; //< Stack roots (in-place array). +}; + +/// @brief The head of the singly-linked list of StackEntries. Functions push +/// and pop onto this in their prologue and epilogue. +/// +/// Since there is only a global list, this technique is not threadsafe. +StackEntry *llvm_gc_root_chain; + +/// @brief Calls Visitor(root, meta) for each GC root on the stack. +/// root and meta are exactly the values passed to +/// @llvm.gcroot. +/// +/// Visitor could be a function to recursively mark live objects. Or it +/// might copy them to another heap or generation. +/// +/// @param Visitor A function to invoke for every GC root on the stack. +void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) { + for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) { + unsigned i = 0; + + // For roots [0, NumMeta), the metadata pointer is in the FrameMap. + for (unsigned e = R->Map->NumMeta; i != e; ++i) + Visitor(&R->Roots[i], R->Map->Meta[i]); + + // For roots [NumMeta, NumRoots), the metadata pointer is null. + for (unsigned e = R->Map->NumRoots; i != e; ++i) + Visitor(&R->Roots[i], NULL); + } +}
The ocaml backend is invoked with the gc "ocaml" function attribute. -It supports the -Objective Caml language runtime by emitting -a type-accurate stack map in the form of an ocaml 3.10.0-compatible frametable. -The linkage requirements are satisfied automatically by the ocamlopt -compiler when linking an executable.
+Unlike many GC algorithms which rely on a cooperative code generator to +generate stack maps, this algorithm carefully maintains a linked list of stack +root descriptors [Henderson2002]. This so-called +"shadow stack" mirrors the machine stack. Maintaining this data structure is +slower than using stack maps, but has a significant portability advantage +because it requires no special support from the target code generator.
-The ocaml collector does not use read or write barriers, so the user program -may use load and store instead of llvm.gcread and -llvm.gcwrite.
+The tradeoff for this simplicity and portability is:
+ +Still, it's an easy way to get started.
This section describes the garbage collection facilities provided by the -LLVM intermediate representation.
+LLVM intermediate representation. The exact behavior +of these IR features is specified by the binary interface implemented by a +code generation plugin, not by this document. -These facilities are limited to those strictly necessary for compilation. -They are not intended to be a complete interface to any garbage collector. -Notably, heap allocation is not among the supplied primitives. A user program -will also need to interface with the runtime, using either the -suggested runtime interface or another interface -specified by the runtime.
+These facilities are limited to those strictly necessary; they are not +intended to be a complete interface to any garbage collector. A program will +need to interface with the GC library using the facilities provided by that +program.
The gc function attribute is used to specify the desired collector -algorithm to the compiler. It is equivalent to specifying the collector name -programmatically using the setGC method of Function.
+The gc function attribute is used to specify the desired GC style +to the compiler. Its programmatic equivalent is the setGC method of +Function.
-Specifying the collector on a per-function basis allows LLVM to link together -programs that use different garbage collection algorithms.
+Setting gc "name" on a function triggers a search for a +matching code generation plugin "name"; it is that plugin which defines +the exact nature of the code generated to support GC. If none is found, the +compiler will raise an error.
+ +Specifying the GC style on a per-function basis allows LLVM to link together +programs that use different garbage collection algorithms (or none at all).
The llvm.gcroot intrinsic is used to inform LLVM of a pointer -variable on the stack. The first argument must be a value referring to an alloca instruction +
The llvm.gcroot intrinsic is used to inform LLVM that a stack +variable references an object on the heap and is to be tracked for garbage +collection. The exact impact on generated code is specified by a compiler plugin.
+ +A compiler which uses mem2reg to raise imperative code using alloca +into SSA form need only add a call to @llvm.gcroot for those variables +which a pointers into the GC heap.
+ +It is also important to mark intermediate values with llvm.gcroot. +For example, consider h(f(), g()). Beware leaking the result of +f() in the case that g() triggers a collection.
+ +The first argument must be a value referring to an alloca instruction or a bitcast of an alloca. The second contains a pointer to metadata that should be associated with the pointer, and must be a constant or global value address. If your target collector uses tags, use a null pointer for metadata.
+The %metadata argument can be used to avoid requiring heap objects +to have 'isa' pointers or tag bits. [Appel89, Goldberg91, Tolmach94] If +specified, its value will be tracked along with the location of the pointer in +the stack frame.
+Consider the following fragment of Java code:
@@ -449,6 +515,11 @@ for completeness. In this snippet, %object is the object pointer, and ;; Compute the derived pointer. %derived = getelementptr %object, i32 0, i32 2, i32 %n+
The use of these intrinsics is naturally optional if the target GC does +require the corresponding barrier. If so, the GC plugin will replace the +intrinsic calls with the corresponding load or store +instruction if they are used.
+For write barriers, LLVM provides the llvm.gcwrite intrinsic function. It has exactly the same semantics as a non-volatile store to -the derived pointer (the third argument).
+the derived pointer (the third argument). The exact code generated is specified +by a compiler plugin.Many important algorithms require write barriers, including generational and concurrent collectors. Additionally, write barriers could be used to implement reference counting.
-The use of this intrinsic is optional if the target collector does use -write barriers. If so, the collector will replace it with the corresponding -store.
- @@ -489,124 +557,15 @@ i8* @llvm.gcread(i8* %object, i8** %derived)For read barriers, LLVM provides the llvm.gcread intrinsic function. It has exactly the same semantics as a non-volatile load from the -derived pointer (the second argument).
+derived pointer (the second argument). The exact code generated is specified by +a compiler plugin.Read barriers are needed by fewer algorithms than write barriers, and may have a greater performance impact since pointer reads are more frequent than writes.
-As with llvm.gcwrite, a target collector might not require the use -of this intrinsic.
- - - - - -LLVM specifies the following recommended runtime interface to the garbage -collection at runtime. A program should use these interfaces to accomplish the -tasks not supported by the intrinsics.
- -Unlike the intrinsics, which are integral to LLVM's code generator, there is -nothing unique about these interfaces; a front-end compiler and runtime are free -to agree to a different specification.
- -Note: This interface is a work in progress.
- --The llvm_gc_initialize function should be called once before any other -garbage collection functions are called. This gives the garbage collector the -chance to initialize itself and allocate the heap. The initial heap size to -allocate should be specified as an argument. -
- -The llvm_gc_allocate function is a global function defined by the -garbage collector implementation to allocate memory. It returns a -zeroed-out block of memory of the specified size, sufficiently aligned to store -any object.
- --The llvm_gc_collect function is exported by the garbage collector -implementations to provide a full collection, even when the heap is not -exhausted. This can be used by end-user code as a hint, and may be ignored by -the garbage collector. -
- --The llvm_cg_walk_gcroots function is a function provided by the code -generator that iterates through all of the GC roots on the stack, calling the -specified function pointer with each record. For each GC root, the address of -the pointer and the meta-data (from the llvm.gcroot intrinsic) are provided. -
-This is not the appropriate place to implement a garbage collected heap or a garbage collector itself. That code should exist in the language's runtime -library. The compiler plugin is responsible for generating code which is -compatible with that runtime library.
+library. The compiler plugin is responsible for generating code which +conforms to the binary interface defined by library, most essentially the +stack map.To subclass llvm::GCStrategy and register it with the compiler:
@@ -1203,7 +1163,7 @@ yet computed.)Since AsmWriter and CodeGen are separate components of LLVM, a separate abstract base class and registry is provided for printing assembly code, the -GCMetadaPrinter and GCMetadaPrinterRegistry. The AsmWriter +GCMetadaPrinter and GCMetadataPrinterRegistry. The AsmWriter will look for such a subclass if the GCStrategy sets UsesMetadata:
@@ -1337,70 +1297,6 @@ void MyGCPrinter::finishAssembly(std::ostream &OS, AsmPrinter &AP,Implementing a garbage collector for LLVM is fairly straightforward. The -LLVM garbage collectors are provided in a form that makes them easy to link into -the language-specific runtime that a language front-end would use. They require -functionality from the language-specific runtime to get information about where pointers are located in heap objects.
- -The implementation must include the -llvm_gc_allocate and -llvm_gc_collect functions. To do this, it will -probably have to trace through the roots -from the stack and understand the GC descriptors -for heap objects. Luckily, there are some example -implementations available. -
--The three most common ways to keep track of where pointers live in heap objects -are (listed in order of space overhead required):
- -The LLVM garbage collectors are capable of supporting all of these styles of -language, including ones that mix various implementations. To do this, it -allows the source-language to associate meta-data with the stack roots, and the heap tracing routines can propagate the -information. In addition, LLVM allows the front-end to extract GC information -in any form from a specific object pointer (this supports situations #1 and #3). -
- -