Metadata for annotating loops as parallel. The first consumer for this
metadata is the loop vectorizer. See the documentation update for more info.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175060 91177308-0d34-0410-b5e6-96231b3b80d8
2024-11-02 07:11:49 +00:00 · 2013-02-13 18:08:57 +00:00 · 2013-02-13 18:08:57 +00:00 · 5d0ce79e26
commit 5d0ce79e26
parent 96848dfc46
4 changed files with 183 additions and 0 deletions
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -2522,6 +2522,117 @@ Examples:
    !2 = metadata !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 }
 '``llvm.loop``'
 ^^^^^^^^^^^^^^^
 It is sometimes useful to attach information to loop constructs. Currently,
 loop metadata is implemented as metadata attached to the branch instruction
 in the loop latch block. This type of metadata refers to a metadata node that is
 guaranteed to be separate for each loop. The loop-level metadata is prefixed
 with ``llvm.loop``.
 The loop identifier metadata is implemented using a metadata node that refers to
 itself as follows:
 .. code-block:: llvm
    !0 = metadata !{ metadata !0 }
 '``llvm.loop.parallel``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 This loop metadata can be used to communicate that a loop should be considered
 a parallel loop. The semantics of parallel loops in this case are those
 with the strongest cross-iteration instruction ordering freedom: the
 iterations in the loop can be considered completely independent of each
 other (also known as embarrassingly parallel loops).
 This metadata can originate from a programming language with parallel loop
 constructs. In such a case it is completely the programmer's responsibility
 to ensure the instructions from the different iterations of the loop can be
 executed in an arbitrary order, in parallel, or intertwined. The compiler
 must not be expected to perform any loop-carried dependency checking.
 In order to fulfill the LLVM requirement for metadata to be safely ignored,
 a parallel loop must be treated as a sequential loop whenever an optimization
 that is unaware of the parallel loop semantics invalidates them. This happens
 when new memory accesses that do not fulfill the requirement of free ordering
 across iterations are added to the loop. Therefore, this metadata is required,
 but not sufficient, to consider the loop at hand a parallel loop. For a loop
 to be parallel, all of its memory accessing instructions need to be
 marked with the ``llvm.mem.parallel_loop_access`` metadata that refers
 to the same loop identifier metadata that identifies the loop at hand.
 '``llvm.mem``'
 ^^^^^^^^^^^^^^^
 Metadata types used to annotate memory accesses with information helpful
 for optimizations are prefixed with ``llvm.mem``.
 '``llvm.mem.parallel_loop_access``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 For a loop to be parallel, in addition to using
 the ``llvm.loop.parallel`` metadata to mark the loop latch branch instruction,
 all of the memory accessing instructions in the loop body also need to be
 marked with the ``llvm.mem.parallel_loop_access`` metadata. If there
 is at least one memory accessing instruction not marked with the metadata,
 the loop, despite it possibly using the ``llvm.loop.parallel`` metadata,
 must be considered a sequential loop. As a result, optimization passes that
 are unaware of the parallel semantics and that insert new memory instructions
 into the loop body cause the parallel loop to be treated as a sequential loop.
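The rule above can be sketched as a small, self-contained C++ model. This is only an illustration under stated assumptions: ``Node``, ``Inst``, ``ModelLoop``, ``refersTo``, ``isParallel`` and ``exampleLoopIsParallel`` are hypothetical, simplified stand-ins invented for this sketch and are not LLVM classes or APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a metadata node: just a list of operands.
struct Node {
  std::vector<const Node *> operands;
};

// Hypothetical stand-in for an instruction in the loop body.
struct Inst {
  bool touchesMemory;          // models "may read or write memory"
  const Node *parallelAccess;  // models !llvm.mem.parallel_loop_access (may be null)
};

// Hypothetical stand-in for a loop.
struct ModelLoop {
  const Node *latchParallel;   // models !llvm.loop.parallel on the latch branch
  std::vector<Inst> body;
};

// True if `md` has `loopId` among its operands.
bool refersTo(const Node *md, const Node *loopId) {
  for (const Node *op : md->operands)
    if (op == loopId)
      return true;
  return false;
}

// The latch annotation is required but not sufficient: every memory access
// must also carry access metadata that refers to the same loop identifier.
bool isParallel(const ModelLoop &loop) {
  if (!loop.latchParallel)
    return false;
  for (const Inst &inst : loop.body) {
    if (!inst.touchesMemory)
      continue;
    if (!inst.parallelAccess ||
        !refersTo(inst.parallelAccess, loop.latchParallel))
      return false;
  }
  return true;
}

// Builds a loop with two annotated accesses and, optionally, one access
// added by a pass that knows nothing about the parallel annotations.
bool exampleLoopIsParallel(bool insertUnannotatedAccess) {
  Node loopId;
  loopId.operands.push_back(&loopId);        // self-referential identifier

  ModelLoop loop;
  loop.latchParallel = &loopId;
  loop.body.push_back(Inst{true, &loopId});  // annotated load
  loop.body.push_back(Inst{false, nullptr}); // arithmetic, no memory access
  loop.body.push_back(Inst{true, &loopId});  // annotated store
  if (insertUnannotatedAccess)
    loop.body.push_back(Inst{true, nullptr});
  return isParallel(loop);
}
```

In this model, ``exampleLoopIsParallel(false)`` treats the loop as parallel, while ``exampleLoopIsParallel(true)`` falls back to sequential because a single unannotated memory access is enough to invalidate the annotation.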
 Example of a loop that is considered parallel due to its correct use of
 both ``llvm.loop.parallel`` and ``llvm.mem.parallel_loop_access``
 metadata types that refer to the same loop identifier metadata:
 .. code-block:: llvm
   for.body:
   ...
   %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
   ...
   store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
   ...
   br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0
   for.end:
   ...
   !0 = metadata !{ metadata !0 }
 It is also possible to have nested parallel loops. In that case the
 memory accesses refer to a list of loop identifier metadata nodes instead of
 the loop identifier metadata node directly:
 .. code-block:: llvm
   outer.for.body:
   ...
   inner.for.body:
   ...
   %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
   ...
   store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
   ...
   br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop.parallel !1
   inner.for.end:
   ...
   %1 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
   ...
   store i32 %1, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
   ...
   br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop.parallel !2
   outer.for.end:                                          ; preds = %inner.for.end
   ...
   !0 = metadata !{ metadata !1, metadata !2 } ; a list of parallel loop identifiers
   !1 = metadata !{ metadata !1 } ; an identifier for the inner parallel loop
   !2 = metadata !{ metadata !2 } ; an identifier for the outer parallel loop
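The nested case can be modelled with the same operand scan: because a loop identifier lists itself as its only operand, one routine covers both an access that refers to the identifier directly and an access that refers to a list of identifiers. The following is a minimal sketch, assuming a simplified node type; ``Node``, ``listsLoop`` and ``NestedIds`` are hypothetical names introduced here for illustration and do not exist in LLVM.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a metadata node: just a list of operands.
struct Node {
  std::vector<const Node *> operands;
};

// True if `accessMD` has `loopId` among its operands. A self-referential
// identifier passes this check for itself, so the direct and the list
// form are handled by the same scan.
bool listsLoop(const Node &accessMD, const Node &loopId) {
  for (const Node *op : accessMD.operands)
    if (op == &loopId)
      return true;
  return false;
}

// Mirrors !0, !1 and !2 from the nested example above.
struct NestedIds {
  Node accessList; // !0 = { !1, !2 }
  Node inner;      // !1 = { !1 }
  Node outer;      // !2 = { !2 }

  NestedIds() {
    inner.operands.push_back(&inner);
    outer.operands.push_back(&outer);
    accessList.operands.push_back(&inner);
    accessList.operands.push_back(&outer);
  }

  // The nodes point at each other by address, so copying would leave the
  // copies pointing into the original object.
  NestedIds(const NestedIds &) = delete;
  NestedIds &operator=(const NestedIds &) = delete;
};
```

With a ``NestedIds ids;`` object, ``listsLoop(ids.accessList, ids.inner)`` and ``listsLoop(ids.accessList, ids.outer)`` both hold, whereas an access annotated only with ``ids.inner`` does not count as parallel for the outer loop.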
 Module Flags Metadata
 =====================
--- a/include/llvm/Analysis/LoopInfo.h
+++ b/include/llvm/Analysis/LoopInfo.h
@@ -377,6 +377,20 @@ public:
  /// isSafeToClone - Return true if the loop body is safe to clone in practice.
  bool isSafeToClone() const;
  /// Returns true if the loop is annotated parallel.
  ///
  /// The compiler can assume that a parallel loop does not contain any
  /// dependencies between iterations. That is, any loop-carried dependency
  /// checking can be skipped completely when parallelizing the loop on the
  /// target machine. Thus, if the parallel loop information originates from
  /// the programmer, e.g. via the OpenMP parallel for pragma, it is the
  /// programmer's responsibility to ensure there are no loop-carried
  /// dependencies. The final execution order of the instructions across
  /// iterations is not guaranteed; thus, the end result might or might not
  /// implement actual concurrent execution of instructions across multiple
  /// iterations.
  bool isAnnotatedParallel() const;
  /// hasDedicatedExits - Return true if no exit block for the loop
  /// has a predecessor that is outside the loop.
  bool hasDedicatedExits() const;
--- a/lib/Analysis/LoopInfo.cpp
+++ b/lib/Analysis/LoopInfo.cpp
@@ -24,6 +24,7 @@
 #include "llvm/Assembly/Writer.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/Support/CFG.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
@@ -233,6 +234,55 @@ bool Loop::isSafeToClone() const {
  return true;
 }
 bool Loop::isAnnotatedParallel() const {
  BasicBlock *latch = getLoopLatch();
  if (latch == NULL)
    return false;
  MDNode *desiredLoopIdMetadata =
    latch->getTerminator()->getMetadata("llvm.loop.parallel");
  if (!desiredLoopIdMetadata)
    return false;
  // The loop branch contains the parallel loop metadata. In order to ensure
  // that any parallel-loop-unaware optimization pass hasn't added loop-carried
  // dependencies (thus converting the loop back to a sequential loop), check
  // that all the memory instructions in the loop contain parallelism metadata
  // that points to the same unique "loop id metadata" the loop branch does.
  for (block_iterator BB = block_begin(), BE = block_end(); BB != BE; ++BB) {
    for (BasicBlock::iterator II = (*BB)->begin(), EE = (*BB)->end();
         II != EE; ++II) {
      if (!II->mayReadOrWriteMemory())
        continue;
      // The memory instruction can refer to the loop identifier metadata
      // directly or indirectly through another list metadata (in case of
      // nested parallel loops). The loop identifier metadata refers to
      // itself so we can check both cases with the same routine.
      MDNode *loopIdMD = II->getMetadata("llvm.mem.parallel_loop_access");
      if (!loopIdMD)
        return false;
      bool loopIdMDFound = false;
      for (unsigned i = 0, e = loopIdMD->getNumOperands(); i < e; ++i) {
        if (loopIdMD->getOperand(i) == desiredLoopIdMetadata) {
          loopIdMDFound = true;
          break;
        }
      }
      if (!loopIdMDFound)
        return false;
    }
  }
  return true;
 }
 /// hasDedicatedExits - Return true if no exit block for the loop
 /// has a predecessor that is outside the loop.
 bool Loop::hasDedicatedExits() const {
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -2276,6 +2276,14 @@ void LoopVectorizationLegality::collectLoopUniforms() {
 }
 bool LoopVectorizationLegality::canVectorizeMemory() {
  if (TheLoop->isAnnotatedParallel()) {
    DEBUG(dbgs()
          << "LV: A loop annotated parallel, ignore memory dependency "
          << "checks.\n");
    return true;
  }
  typedef SmallVector<Value*, 16> ValueVector;
  typedef SmallPtrSet<Value*, 16> ValueSet;
  // Holds the Load and Store *instructions*.