mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-11-01 00:17:01 +00:00
Lower idempotent RMWs to fence+load
Summary:
I originally tried doing this specifically for X86 in the backend in D5091,
but it was rather brittle and generally running too late to be general.
Furthermore, other targets may want to implement similar optimizations.
So I reimplemented it at the IR-level, fitting it into AtomicExpandPass
as it interacts with that pass (which could not be cleanly done before
at the backend level).

This optimization relies on a new target hook, which is only used by X86
for now, as the correctness of the optimization on other targets remains
an open question. If it is found correct on other targets, it should be
trivial to enable for them.

Details of the optimization are discussed in D5091.

Test Plan: make check-all + a new test

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218455 91177308-0d34-0410-b5e6-96231b3b80d8
@@ -1008,6 +1008,20 @@ public:
     return false;
   }
 
+  /// On some platforms, an AtomicRMW that never actually modifies the value
+  /// (such as fetch_add of 0) can be turned into a fence followed by an
+  /// atomic load. This may sound useless, but it makes it possible for the
+  /// processor to keep the cacheline shared, dramatically improving
+  /// performance. And such idempotent RMWs are useful for implementing some
+  /// kinds of locks, see for example (justification + benchmarks):
+  /// http://www.hpl.hp.com/techreports/2012/HPL-2012-68.pdf
+  /// This method tries doing that transformation, returning the atomic load if
+  /// it succeeds, and nullptr otherwise.
+  /// If shouldExpandAtomicLoadInIR returns true on that load, it will undergo
+  /// another round of expansion.
+  virtual LoadInst *lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *RMWI) const {
+    return nullptr;
+  }
   //===--------------------------------------------------------------------===//
   // TargetLowering Configuration Methods - These methods should be invoked by
   // the derived class constructor to configure this object for the target.
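For readers unfamiliar with the pattern, here is a minimal source-level sketch of the two forms the doc comment contrasts: an RMW that never changes the value (fetch_add of 0) and a fence followed by an atomic load. The helper names read_via_idempotent_rmw and read_via_fenced_load are hypothetical and do not come from the patch. This is not how the hook is implemented, since the hook operates on LLVM IR instructions rather than on C++ source. Whether the two forms are interchangeable depends on the target: as the commit message notes, only X86 uses the hook for now.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: reads the value through an idempotent RMW.
// fetch_add of 0 returns the current value and leaves it unchanged.
inline std::uint32_t read_via_idempotent_rmw(std::atomic<std::uint32_t> &word) {
  return word.fetch_add(0, std::memory_order_seq_cst);
}

// Hypothetical helper: the shape the doc comment describes, a fence
// followed by an atomic load. Assumption: sequentially consistent
// ordering is used for both the fence and the load.
inline std::uint32_t read_via_fenced_load(std::atomic<std::uint32_t> &word) {
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return word.load(std::memory_order_seq_cst);
}
```

Both helpers return the stored value without changing it. The difference the doc comment cares about is that the second form performs no store, which is what allows the processor to keep the cacheline shared.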