[DAGCombiner] Slice a big load in two loads when the element are next to each

other in memory and the target has paired load and performs post-isel loads
combining.

E.g., this optimization will transform something like this:
 a = load i64* addr
 b = trunc i64 a to i32
 c = lshr i64 a, 32
 d = trunc i64 c to i32

into:
 b = load i32* addr1
 d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.

One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.

<rdar://problem/14477220>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192471 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Quentin Colombet
2013-10-11 18:01:14 +00:00
parent 563c182839
commit c34693f6ef
3 changed files with 743 additions and 2 deletions

View File

@@ -1183,6 +1183,35 @@ public:
return false;
}
/// Return true if the target supplies and combines to a paired load
/// two loaded values of type LoadedType next to each other in memory.
/// RequiredAlignment gives the minimal alignment constraints that must be met to
/// be able to select this paired load.
///
/// This information is *not* used to generate actual paired loads, but it is used
/// to generate a sequence of loads that is easier to combine into a paired load.
/// For instance, something like this:
/// a = load i64* addr
/// b = trunc i64 a to i32
/// c = lshr i64 a, 32
/// d = trunc i64 c to i32
/// will be optimized into:
/// b = load i32* addr1
/// d = load i32* addr2
/// Where addr1 = addr2 +/- sizeof(i32).
///
/// In other words, unless the target performs a post-isel load combining, this
/// information should not be provided because it will generate more loads.
virtual bool hasPairedLoad(Type * /*LoadedType*/,
unsigned & /*RequiredAligment*/) const {
return false;
}
virtual bool hasPairedLoad(EVT /*LoadedType*/,
unsigned & /*RequiredAligment*/) const {
return false;
}
/// Return true if zero-extending the specific node Val to type VT2 is free
/// (either because it's implicitly zero-extended such as ARM ldrb / ldrh or
/// because it's folded such as X86 zero-extending loads).