mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2024-11-02 07:11:49 +00:00
Metadata for annotating loops as parallel.

The first consumer for this metadata is the loop vectorizer. See the documentation update for more info.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175060 91177308-0d34-0410-b5e6-96231b3b80d8
parent
96848dfc46
commit
5d0ce79e26
111
docs/LangRef.rst
@@ -2522,6 +2522,117 @@ Examples:
|
|||||||
!2 = metadata !{ i8 0, i8 2, i8 3, i8 6 }
|
!2 = metadata !{ i8 0, i8 2, i8 3, i8 6 }
|
||||||
!3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 }
|
!3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 }
|
||||||
|
|
||||||
|
'``llvm.loop``'
|
||||||
|
^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
It is sometimes useful to attach information to loop constructs. Currently,
|
||||||
|
loop metadata is implemented as metadata attached to the branch instruction
|
||||||
|
in the loop latch block. This type of metadata refers to a metadata node that is
|
||||||
|
guaranteed to be separate for each loop. The loop-level metadata is prefixed
|
||||||
|
with ``llvm.loop``.
|
||||||
|
|
||||||
|
The loop identifier metadata is implemented using a metadata that refers to
|
||||||
|
itself as follows:
|
||||||
|
|
||||||
|
.. code-block:: llvm
|
||||||
|
!0 = metadata !{ metadata !0 }
|
||||||
|
|
||||||
|
'``llvm.loop.parallel``' Metadata
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
This loop metadata can be used to communicate that a loop should be considered
|
||||||
|
a parallel loop. The semantics of parallel loops in this case is the one
|
||||||
|
with the strongest cross-iteration instruction ordering freedom: the
|
||||||
|
iterations in the loop can be considered completely independent of each
|
||||||
|
other (also known as embarrassingly parallel loops).
|
||||||
|
|
||||||
|
This metadata can originate from a programming language with parallel loop
|
||||||
|
constructs. In such a case it is completely the programmer's responsibility
|
||||||
|
to ensure the instructions from the different iterations of the loop can be
|
||||||
|
executed in an arbitrary order, in parallel, or intertwined. No loop-carried
|
||||||
|
dependency checking at all should be expected from the compiler.
|
||||||
|
|
||||||
|
In order to fulfill the LLVM requirement for metadata to be safely ignored,
|
||||||
|
it is important to ensure that a parallel loop is converted to
|
||||||
|
a sequential loop in case an optimization (agnostic of the parallel loop
|
||||||
|
semantics) converts the loop back to such. This happens when new memory
|
||||||
|
accesses that do not fulfill the requirement of free ordering across iterations
|
||||||
|
are added to the loop. Therefore, this metadata is required, but not
|
||||||
|
sufficient, to consider the loop at hand a parallel loop. For a loop
|
||||||
|
to be parallel, all its memory accessing instructions need to be
|
||||||
|
marked with the ``llvm.mem.parallel_loop_access`` metadata that refers
|
||||||
|
to the same loop identifier metadata that identifies the loop at hand.
|
||||||
|
|
||||||
|
'``llvm.mem``'
|
||||||
|
^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Metadata types used to annotate memory accesses with information helpful
|
||||||
|
for optimizations are prefixed with ``llvm.mem``.
|
||||||
|
|
||||||
|
'``llvm.mem.parallel_loop_access``' Metadata
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
For a loop to be parallel, in addition to using
|
||||||
|
the ``llvm.loop.parallel`` metadata to mark the loop latch branch instruction,
|
||||||
|
all of the memory-accessing instructions in the loop body must also be
|
||||||
|
marked with the ``llvm.mem.parallel_loop_access`` metadata. If there
|
||||||
|
is at least one memory accessing instruction not marked with the metadata,
|
||||||
|
the loop, despite it possibly using the ``llvm.loop.parallel`` metadata,
|
||||||
|
must be considered a sequential loop. This causes parallel loops to be
|
||||||
|
converted to sequential loops due to optimization passes that are unaware of
|
||||||
|
the parallel semantics and that insert new memory instructions into the loop
|
||||||
|
body.
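
The rule above (the latch annotation is necessary but not sufficient) can be illustrated with a small, self-contained C++ model. This is only a sketch: ``ToyInst``, ``ToyLoop``, ``isParallel`` and ``insertUnannotatedStore`` are hypothetical names invented for illustration and are not part of LLVM. Loop identifiers are modelled as plain integers, with ``0`` meaning "no annotation", and the nested-loop case with lists of identifiers is deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for LLVM's classes; none of these names are LLVM API.
struct ToyInst {
  bool touchesMemory;  // models "may read or write memory"
  int parallelLoopId;  // models !llvm.mem.parallel_loop_access; 0 = no annotation
};

struct ToyLoop {
  int latchLoopId;            // models !llvm.loop.parallel on the latch branch; 0 = none
  std::vector<ToyInst> body;  // instructions in the loop body
};

// The annotation on the latch is necessary but not sufficient: every memory
// access in the body must also name the same loop identifier.
bool isParallel(const ToyLoop &L) {
  if (L.latchLoopId == 0)
    return false;
  for (const ToyInst &I : L.body) {
    if (!I.touchesMemory)
      continue;  // non-memory instructions need no annotation
    if (I.parallelLoopId != L.latchLoopId)
      return false;  // unannotated or foreign access: treat loop as sequential
  }
  return true;
}

// Models a pass that knows nothing about parallel loops and inserts a new,
// unannotated store into the loop body.
void insertUnannotatedStore(ToyLoop &L) {
  L.body.push_back(ToyInst{true, 0});
}
```

In this model, a loop whose memory accesses all carry the latch's identifier is reported as parallel, and calling ``insertUnannotatedStore`` on it makes ``isParallel`` return false. This is the conservative fallback that lets the metadata be safely ignored by passes unaware of it.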
|
||||||
|
|
||||||
|
Example of a loop that is considered parallel due to its correct use of
|
||||||
|
both ``llvm.loop.parallel`` and ``llvm.mem.parallel_loop_access``
|
||||||
|
metadata types that refer to the same loop identifier metadata.
|
||||||
|
|
||||||
|
.. code-block:: llvm
|
||||||
|
|
||||||
|
for.body:
|
||||||
|
...
|
||||||
|
%0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
|
||||||
|
...
|
||||||
|
store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
|
||||||
|
...
|
||||||
|
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0
|
||||||
|
|
||||||
|
for.end:
|
||||||
|
...
|
||||||
|
!0 = metadata !{ metadata !0 }
|
||||||
|
|
||||||
|
It is also possible to have nested parallel loops. In that case the
|
||||||
|
memory accesses refer to a list of loop identifier metadata nodes instead of
|
||||||
|
the loop identifier metadata node directly:
|
||||||
|
|
||||||
|
.. code-block:: llvm
|
||||||
|
|
||||||
|
outer.for.body:
|
||||||
|
...
|
||||||
|
|
||||||
|
inner.for.body:
|
||||||
|
...
|
||||||
|
%0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
|
||||||
|
...
|
||||||
|
store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
|
||||||
|
...
|
||||||
|
br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop.parallel !1
|
||||||
|
|
||||||
|
inner.for.end:
|
||||||
|
...
|
||||||
|
%0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
|
||||||
|
...
|
||||||
|
store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
|
||||||
|
...
|
||||||
|
br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop.parallel !2
|
||||||
|
|
||||||
|
outer.for.end: ; preds = %inner.for.end
|
||||||
|
...
|
||||||
|
!0 = metadata !{ metadata !1, metadata !2 } ; a list of parallel loop identifiers
|
||||||
|
!1 = metadata !{ metadata !1 } ; an identifier for the inner parallel loop
|
||||||
|
!2 = metadata !{ metadata !2 } ; an identifier for the outer parallel loop
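
Because a loop identifier lists itself as its only operand, a single operand scan covers both the direct reference and the reference through a list. The following C++ sketch models this under stated assumptions: ``ToyNode``, ``makeLoopId`` and ``refersToLoop`` are hypothetical names for illustration only, and a metadata node is reduced to a vector of operand pointers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a metadata node: just a list of operand pointers.
struct ToyNode {
  std::vector<const ToyNode *> operands;
};

// A loop identifier is a node whose only operand is itself,
// mirroring `!1 = metadata !{ metadata !1 }`.
void makeLoopId(ToyNode &N) { N.operands.assign(1, &N); }

// True if `access` names `loopId`, either directly (the access node is the
// identifier, which lists itself) or through a list of identifiers.
bool refersToLoop(const ToyNode &access, const ToyNode &loopId) {
  for (const ToyNode *op : access.operands)
    if (op == &loopId)
      return true;
  return false;
}
```

With nodes built like ``!0``, ``!1`` and ``!2`` in the example above, ``refersToLoop`` holds for the list node against both the inner and the outer identifier, and for an identifier against itself, but not for one identifier against the other.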
|
||||||
|
|
||||||
|
|
||||||
Module Flags Metadata
|
Module Flags Metadata
|
||||||
=====================
|
=====================
|
||||||
|
|
||||||
|
@@ -377,6 +377,20 @@ public:
|
|||||||
/// isSafeToClone - Return true if the loop body is safe to clone in practice.
|
/// isSafeToClone - Return true if the loop body is safe to clone in practice.
|
||||||
bool isSafeToClone() const;
|
bool isSafeToClone() const;
|
||||||
|
|
||||||
|
/// Returns true if the loop is annotated parallel.
|
||||||
|
///
|
||||||
|
/// A parallel loop can be assumed to not contain any dependencies between
|
||||||
|
/// iterations by the compiler. That is, any loop-carried dependency checking
|
||||||
|
/// can be skipped completely when parallelizing the loop on the target
|
||||||
|
/// machine. Thus, if the parallel loop information originates from the
|
||||||
|
/// programmer, e.g. via the OpenMP parallel for pragma, it is the
|
||||||
|
/// programmer's responsibility to ensure there are no loop-carried
|
||||||
|
/// dependencies. The final execution order of the instructions across
|
||||||
|
/// iterations is not guaranteed, thus, the end result might or might not
|
||||||
|
/// implement actual concurrent execution of instructions across multiple
|
||||||
|
/// iterations.
|
||||||
|
bool isAnnotatedParallel() const;
|
||||||
|
|
||||||
/// hasDedicatedExits - Return true if no exit block for the loop
|
/// hasDedicatedExits - Return true if no exit block for the loop
|
||||||
/// has a predecessor that is outside the loop.
|
/// has a predecessor that is outside the loop.
|
||||||
bool hasDedicatedExits() const;
|
bool hasDedicatedExits() const;
|
||||||
|
@@ -24,6 +24,7 @@
|
|||||||
#include "llvm/Assembly/Writer.h"
|
#include "llvm/Assembly/Writer.h"
|
||||||
#include "llvm/IR/Constants.h"
|
#include "llvm/IR/Constants.h"
|
||||||
#include "llvm/IR/Instructions.h"
|
#include "llvm/IR/Instructions.h"
|
||||||
|
#include "llvm/IR/Metadata.h"
|
||||||
#include "llvm/Support/CFG.h"
|
#include "llvm/Support/CFG.h"
|
||||||
#include "llvm/Support/CommandLine.h"
|
#include "llvm/Support/CommandLine.h"
|
||||||
#include "llvm/Support/Debug.h"
|
#include "llvm/Support/Debug.h"
|
||||||
@@ -233,6 +234,55 @@ bool Loop::isSafeToClone() const {
|
|||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
bool Loop::isAnnotatedParallel() const {
|
||||||
|
|
||||||
|
BasicBlock *latch = getLoopLatch();
|
||||||
|
if (latch == NULL)
|
||||||
|
return false;
|
||||||
|
|
||||||
|
MDNode *desiredLoopIdMetadata =
|
||||||
|
latch->getTerminator()->getMetadata("llvm.loop.parallel");
|
||||||
|
|
||||||
|
if (!desiredLoopIdMetadata)
|
||||||
|
return false;
|
||||||
|
|
||||||
|
// The loop branch contains the parallel loop metadata. In order to ensure
|
||||||
|
// that any parallel-loop-unaware optimization pass hasn't added loop-carried
|
||||||
|
// dependencies (thus converted the loop back to a sequential loop), check
|
||||||
|
// that all the memory instructions in the loop contain parallelism metadata
|
||||||
|
// that points to the same unique "loop id metadata" the loop branch does.
|
||||||
|
for (block_iterator BB = block_begin(), BE = block_end(); BB != BE; ++BB) {
|
||||||
|
for (BasicBlock::iterator II = (*BB)->begin(), EE = (*BB)->end();
|
||||||
|
II != EE; II++) {
|
||||||
|
|
||||||
|
if (!II->mayReadOrWriteMemory())
|
||||||
|
continue;
|
||||||
|
|
||||||
|
if (!II->getMetadata("llvm.mem.parallel_loop_access"))
|
||||||
|
return false;
|
||||||
|
|
||||||
|
// The memory instruction can refer to the loop identifier metadata
|
||||||
|
// directly or indirectly through another list metadata (in case of
|
||||||
|
// nested parallel loops). The loop identifier metadata refers to
|
||||||
|
// itself so we can check both cases with the same routine.
|
||||||
|
MDNode *loopIdMD =
|
||||||
|
dyn_cast<MDNode>(II->getMetadata("llvm.mem.parallel_loop_access"));
|
||||||
|
bool loopIdMDFound = false;
|
||||||
|
for (unsigned i = 0, e = loopIdMD->getNumOperands(); i < e; ++i) {
|
||||||
|
if (loopIdMD->getOperand(i) == desiredLoopIdMetadata) {
|
||||||
|
loopIdMDFound = true;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!loopIdMDFound)
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/// hasDedicatedExits - Return true if no exit block for the loop
|
/// hasDedicatedExits - Return true if no exit block for the loop
|
||||||
/// has a predecessor that is outside the loop.
|
/// has a predecessor that is outside the loop.
|
||||||
bool Loop::hasDedicatedExits() const {
|
bool Loop::hasDedicatedExits() const {
|
||||||
|
@@ -2276,6 +2276,14 @@ void LoopVectorizationLegality::collectLoopUniforms() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
bool LoopVectorizationLegality::canVectorizeMemory() {
|
bool LoopVectorizationLegality::canVectorizeMemory() {
|
||||||
|
|
||||||
|
if (TheLoop->isAnnotatedParallel()) {
|
||||||
|
DEBUG(dbgs()
|
||||||
|
<< "LV: A loop annotated parallel, ignore memory dependency "
|
||||||
|
<< "checks.\n");
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
typedef SmallVector<Value*, 16> ValueVector;
|
typedef SmallVector<Value*, 16> ValueVector;
|
||||||
typedef SmallPtrSet<Value*, 16> ValueSet;
|
typedef SmallPtrSet<Value*, 16> ValueSet;
|
||||||
// Holds the Load and Store *instructions*.
|
// Holds the Load and Store *instructions*.
|
||||||
|