Some enhancements for memcpy / memset inline expansion.

1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. on x86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy copies from a constant string, do *not* replace the load with a
   constant if it's not possible to materialize an integer immediate with a
   single instruction (this required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned loads / stores more aggressively if target hooks indicate they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).
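
As an illustration of item 1, the overlapping trick can be sketched in plain C++. This is only a sketch under stated assumptions, not the code the expansion emits: copy16to32 is a hypothetical helper name, and each 16-byte std::memcpy stands in for one unaligned 16-byte load or store. The second pair of moves is anchored at the end of the buffer, so it re-copies some bytes already covered by the first pair instead of falling back to a byte-by-byte tail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not LLVM code): copies n bytes, 16 <= n <= 32,
// with two 16-byte loads and two 16-byte stores. The second load/store
// pair starts at offset n - 16, so its range may overlap the first pair.
inline void copy16to32(unsigned char *dst, const unsigned char *src,
                       std::size_t n) {
    assert(n >= 16 && n <= 32);
    unsigned char head[16];
    unsigned char tail[16];
    std::memcpy(head, src, 16);           // load the first 16 bytes
    std::memcpy(tail, src + n - 16, 16);  // load the last 16 bytes
    std::memcpy(dst, head, 16);           // store the first 16 bytes
    std::memcpy(dst + n - 16, tail, 16);  // store the last 16 bytes
}
```

For n = 20, for example, the second pair covers bytes 4 through 19, so bytes 4 through 15 are written twice with the same values. Both loads happen before either store, so the result matches a plain copy for non-overlapping source and destination buffers.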

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
Evan Cheng
2012-12-10 23:21:26 +00:00
parent 2b475922e6
commit 376642ed62
15 changed files with 299 additions and 85 deletions


@@ -371,6 +371,16 @@ public:
     return false;
   }
+  /// isIntImmLegal - Returns true if the target can instruction select the
+  /// specified integer immediate natively (that is, it's materialized with one
+  /// instruction). The current *assumption* in isel is all of integer
+  /// immediates are "legal" and only the memcpy / memset expansion code is
+  /// making use of this. The rest of isel doesn't have proper cost model for
+  /// immediate materialization.
+  virtual bool isIntImmLegal(const APInt &/*Imm*/, EVT /*VT*/) const {
+    return true;
+  }
   /// isShuffleMaskLegal - Targets can use this to indicate that they only
   /// support *some* VECTOR_SHUFFLE operations, those with specific masks.
   /// By default, if a target supports the VECTOR_SHUFFLE node, all mask values
@@ -678,12 +688,14 @@ public:
   }
   /// This function returns true if the target allows unaligned memory accesses.
-  /// of the specified type. This is used, for example, in situations where an
-  /// array copy/move/set is converted to a sequence of store operations. It's
-  /// use helps to ensure that such replacements don't generate code that causes
-  /// an alignment error (trap) on the target machine.
+  /// of the specified type. If true, it also returns whether the unaligned
+  /// memory access is "fast" in the second argument by reference. This is used,
+  /// for example, in situations where an array copy/move/set is converted to a
+  /// sequence of store operations. It's use helps to ensure that such
+  /// replacements don't generate code that causes an alignment error (trap) on
+  /// the target machine.
   /// @brief Determine if the target supports unaligned memory accesses.
-  virtual bool allowsUnalignedMemoryAccesses(EVT) const {
+  virtual bool allowsUnalignedMemoryAccesses(EVT, bool *Fast = 0) const {
     return false;
   }
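
To make the two hooks in the hunks above concrete, here is a minimal, self-contained sketch of how a target might override them and how a caller might read the new "fast" out-parameter. Everything in it is an assumption for illustration: MockTargetLowering, ToyTarget and hasFastUnalignedAccess are hypothetical names, plain integers stand in for APInt and EVT, and the thresholds are invented rather than taken from any real target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the lowering interface. Only the two hooks
// touched by this change are modeled; bit widths replace EVT and a
// 64-bit integer replaces APInt.
struct MockTargetLowering {
    virtual ~MockTargetLowering() = default;

    // Default mirrors the patch: every integer immediate is "legal".
    virtual bool isIntImmLegal(std::uint64_t /*Imm*/, unsigned /*Bits*/) const {
        return true;
    }

    // Default mirrors the patch: unaligned accesses are not allowed,
    // and *Fast is left untouched.
    virtual bool allowsUnalignedMemoryAccesses(unsigned /*Bits*/,
                                               bool *Fast = nullptr) const {
        (void)Fast;
        return false;
    }
};

// Invented target: only 8-bit immediates take a single instruction;
// unaligned accesses are allowed up to 64 bits but fast only up to 32.
struct ToyTarget : MockTargetLowering {
    bool isIntImmLegal(std::uint64_t Imm, unsigned /*Bits*/) const override {
        return Imm <= 0xFF;
    }

    bool allowsUnalignedMemoryAccesses(unsigned Bits,
                                       bool *Fast = nullptr) const override {
        if (Bits > 64)
            return false;
        if (Fast)               // callers may pass no pointer at all
            *Fast = Bits <= 32;
        return true;
    }
};

// Caller-side pattern: treat an access as fast only when the hook both
// allows it and sets the out-parameter.
inline bool hasFastUnalignedAccess(const MockTargetLowering &TLI,
                                   unsigned Bits) {
    bool Fast = false;
    return TLI.allowsUnalignedMemoryAccesses(Bits, &Fast) && Fast;
}
```

Because the pointer parameter has a default value, call sites that only ask whether an unaligned access is allowed keep compiling unchanged, while the memcpy / memset expansion can pass a pointer to learn whether the access is also fast.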