Chandler Carruth 9b2d091a9c [InstCombine] Do an about-face on how LLVM canonicalizes (cast (load
...)) and (load (cast ...)): canonicalize toward the former.

Historically, we've tried to load using the type of the *pointer*, and
tried to match that type as closely as possible, removing as many pointer
casts as we could and trading them for bitcasts of the loaded value.
This is deeply and fundamentally wrong.

Repeat after me: memory does not have a type! This was a hard lesson for
me to learn working on SROA.

There is only one thing that should actually drive the type used for
a pointer, and that is the type which we need to use to load from that
pointer. Matching up pointer types to the loaded value types is very
useful because it minimizes the physical size of the IR required for
no-op casts. Similarly, the only thing that should drive the type used
for a loaded value is *how that value is used*! Again, this minimizes
casts. And in fact, the *only* thing motivating types in any part of
LLVM's IR is the set of types used by the operations in the IR. We should
match them as closely as possible.
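
As a rough illustration of the two shapes named in the subject line, here is
a minimal sketch in the same older IR syntax as the test in this file. The
function names and the i32/float pairing are hypothetical and chosen only for
illustration; the sketch shows the two forms side by side and is not output
captured from the pass.

```llvm
; Hypothetical example: both functions read the same four bytes at %p.

; (cast (load ...)): the load uses the pointer's pointee type and the
; loaded value is bitcast at its use.
define float @load_then_cast(i32* %p) {
entry:
  %i = load i32* %p, align 4
  %f = bitcast i32 %i to float
  ret float %f
}

; (load (cast ...)): the pointer is bitcast first and the load uses the
; type its user wants.
define float @cast_then_load(i32* %p) {
entry:
  %fp = bitcast i32* %p to float*
  %f = load float* %fp, align 4
  ret float %f
}
```

Both functions return the same bits, which is the sense in which memory has
no type: the only difference is where the no-op cast sits.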

I've ended up removing some tests here as they were testing bugs or
behavior that is no longer present. Mostly though, this is just cleanup
to let the tests continue to function as intended.

The only fallout I've found so far from this change was SROA and I have
fixed it to not be impeded by the different type of load. If you find
more places where this change causes optimizations not to fire, those
too are likely bugs where we are assuming that the type of pointers is
"significant" for optimization purposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 06:36:22 +00:00

; RUN: opt < %s -instcombine -S | FileCheck %s

target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"

define internal i8* @descale_zero() {
entry:
; CHECK: load i8** inttoptr (i64 48 to i8**), align 16
; CHECK-NEXT: ret i8*
  %i16_ptr = load i16** inttoptr (i64 48 to i16**), align 16
  %num = load i64* inttoptr (i64 64 to i64*), align 64
  %num_times_2 = shl i64 %num, 1
  %num_times_2_plus_4 = add i64 %num_times_2, 4
  %i8_ptr = bitcast i16* %i16_ptr to i8*
  %i8_ptr_num_times_2_plus_4 = getelementptr i8* %i8_ptr, i64 %num_times_2_plus_4
  %num_times_neg2 = mul i64 %num, -2
  %num_times_neg2_minus_4 = add i64 %num_times_neg2, -4
  %addr = getelementptr i8* %i8_ptr_num_times_2_plus_4, i64 %num_times_neg2_minus_4
  ret i8* %addr
}