Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the
X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

This is trying to compute the index of the most significant set bit of
'x'. That's actually the value computed directly by the 'bsr'
instruction, but if we match it too late, we'll get completely redundant
xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.
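
As a rough illustration of the two source-level idioms discussed above,
the sketch below compares the xor form and the subtracting form against
a plain shift loop. The helper names are hypothetical and not part of
this change; __builtin_clz is the GCC/Clang builtin, whose result is
undefined for zero, and a 32-bit 'unsigned int' is assumed.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Reference: index of the highest set bit, found by shifting.
// Hypothetical helper for illustration; requires x != 0.
inline unsigned msb_index_reference(std::uint32_t x) {
  unsigned idx = 0;
  while (x >>= 1)
    ++idx;
  return idx;
}

// The xor idiom from the commit message: (sizeof(x)*8 - 1) ^ clz(x).
// Requires x != 0 because __builtin_clz(0) is undefined.
inline unsigned msb_index_xor(std::uint32_t x) {
  const unsigned top = static_cast<unsigned>(sizeof(x) * CHAR_BIT - 1);
  return top ^ static_cast<unsigned>(__builtin_clz(x));
}

// The more naive subtracting idiom. For a 32-bit value clz(x) is in
// [0, 31] and 31 has all five low bits set, so 31 - clz(x) never
// borrows and equals 31 ^ clz(x).
inline unsigned msb_index_sub(std::uint32_t x) {
  return 31u - static_cast<unsigned>(__builtin_clz(x));
}
```

Both forms agree with the reference for every nonzero input, which is
why either one could in principle collapse to a single 'bsr'.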

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.
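
To make the zero-undef point concrete, here is a small software model,
written under the assumption of 32-bit operands. The function names are
made up for this sketch and do not correspond to LLVM APIs. The first
function models the fully defined count (what 'lzcnt' computes,
including the operand width for a zero input); the second models the
bsr-plus-xor lowering, which is only meaningful for nonzero inputs.

```cpp
#include <cassert>
#include <cstdint>

// Model of the fully defined leading-zero count. Returns 32 for a zero
// input, matching what 'lzcnt' produces for a 32-bit operand.
inline unsigned ctlz_model(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Model of the zero-undef variant lowered as xor(bsr(x), 31). The
// result for x == 0 is undefined, so this sketch simply asserts.
inline unsigned ctlz_zero_undef_via_bsr_model(std::uint32_t x) {
  assert(x != 0 && "result is undefined for a zero input");
  unsigned idx = 31;
  while ((x & (1u << idx)) == 0)
    --idx;           // idx is now what 'bsr' would return
  return idx ^ 31u;  // convert bit index to leading-zero count
}
```

Because the two models agree on every nonzero input, the defined form
is always a valid implementation of the zero-undef form, which is why
expanding to the normal variant is safe when 'lzcnt' is available.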

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improved on in this
type of code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147244 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth
2011-12-24 10:55:54 +00:00
parent c08e57c7c9
commit acc068e873
5 changed files with 93 additions and 12 deletions


@@ -1761,12 +1761,3 @@ def : Pat<(cttz_zero_undef GR64:$src), (BSF64rr GR64:$src)>;
 def : Pat<(cttz_zero_undef (loadi16 addr:$src)), (BSF16rm addr:$src)>;
 def : Pat<(cttz_zero_undef (loadi32 addr:$src)), (BSF32rm addr:$src)>;
 def : Pat<(cttz_zero_undef (loadi64 addr:$src)), (BSF64rm addr:$src)>;
-def : Pat<(ctlz_zero_undef GR16:$src), (XOR16ri (BSR16rr GR16:$src), 15)>;
-def : Pat<(ctlz_zero_undef GR32:$src), (XOR32ri (BSR32rr GR32:$src), 31)>;
-def : Pat<(ctlz_zero_undef GR64:$src), (XOR64ri8 (BSR64rr GR64:$src), 63)>;
-def : Pat<(ctlz_zero_undef (loadi16 addr:$src)),
-(XOR16ri (BSR16rm addr:$src), 15)>;
-def : Pat<(ctlz_zero_undef (loadi32 addr:$src)),
-(XOR32ri (BSR32rm addr:$src), 31)>;
-def : Pat<(ctlz_zero_undef (loadi64 addr:$src)),
-(XOR64ri8 (BSR64rm addr:$src), 63)>;