llvm-6502/test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll

23 lines
1.0 KiB
LLVM
Raw Normal View History

; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 | grep movsd | count 5
; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 | grep movl | count 2
Disable some DAG combiner optimizations that may be wrong for volatile loads and stores. In fact this is almost all of them! There are three types of problems: (1) it is wrong to change the width of a volatile memory access. These may be used to do memory mapped i/o, in which case a load can have an effect even if the result is not used. Consider loading an i32 but only using the lower 8 bits. It is wrong to change this into a load of an i8, because you are no longer tickling the other three bytes. It is also unwise to make a load/store wider. For example, changing an i16 load into an i32 load is wrong no matter how aligned things are, since the fact of loading an additional 2 bytes can have i/o side-effects. (2) it is wrong to change the number of volatile load/stores: they may be counted by the hardware. (3) it is wrong to change a volatile load/store that requires one memory access into one that requires several. For example on x86-32, you can store a double in one processor operation, but to store an i64 requires two (two i32 stores). In a multi-threaded program you may want to bitcast an i64 to a double and store as a double because that will occur atomically, and be indivisible to other threads. So it would be wrong to convert the store-of-double into a store of an i64, because this will become two i32 stores - no longer atomic. My policy here is to say that the number of processor operations for an illegal operation is undefined. So it is alright to change a store of an i64 (requires at least two stores; but could be validly lowered to memcpy for example) into a store of double (one processor op). In short, if the new store is legal and has the same size then I say that the transform is ok. It would also be possible to say that transforms are always ok if before they were illegal, whether after they are illegal or not, but that's more awkward to do and I doubt it buys us anything much. However this exposed an interesting thing - on x86-32 a store of i64 is considered legal! That is because operations are marked legal by default, regardless of whether the type is legal or not. In some ways this is clever: before type legalization this means that operations on illegal types are considered legal; after type legalization there are no illegal types so now operations are only legal if they really are. But I consider this to be too cunning for mere mortals. Better to do things explicitly by testing AfterLegalize. So I have changed things so that operations with illegal types are considered illegal - indeed they can never map to a machine operation. However this means that the DAG combiner is more conservative because before it was "accidentally" performing transforms where the type was illegal because the operation was nonetheless marked legal. So in a few such places I added a check on AfterLegalize, which I suppose was actually just forgotten before. This causes the DAG combiner to do slightly more than it used to, which resulted in the X86 backend blowing up because it got a slightly surprising node it wasn't expecting, so I tweaked it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52254 91177308-0d34-0410-b5e6-96231b3b80d8
2008-06-13 19:07:40 +00:00
@atomic = global double 0.000000e+00 ; <double*> [#uses=1]
@atomic2 = global double 0.000000e+00 ; <double*> [#uses=1]
@anything = global i64 0 ; <i64*> [#uses=1]
@ioport = global i32 0 ; <i32*> [#uses=2]
define i16 @f(i64 %x, double %y) {
%b = bitcast i64 %x to double ; <double> [#uses=1]
volatile store double %b, double* @atomic ; one processor operation only
volatile store double 0.000000e+00, double* @atomic2 ; one processor operation only
%b2 = bitcast double %y to i64 ; <i64> [#uses=1]
volatile store i64 %b2, i64* @anything ; may transform to store of double
%l = volatile load i32* @ioport ; must not narrow
%t = trunc i32 %l to i16 ; <i16> [#uses=1]
%l2 = volatile load i32* @ioport ; must not narrow
%tmp = lshr i32 %l2, 16 ; <i32> [#uses=1]
%t2 = trunc i32 %tmp to i16 ; <i16> [#uses=1]
%f = add i16 %t, %t2 ; <i16> [#uses=1]
ret i16 %f
}