llvm-6502/test/CodeGen/X86/fold-vex.ll

; Use CPU parameters to ensure that a CPU-specific attribute is not overriding the AVX definition.

; RUN: llc < %s -mtriple=x86_64-unknown-unknown                  -mattr=+avx | FileCheck %s
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=corei7-avx             | FileCheck %s
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=btver2                 | FileCheck %s
; RUN: llc < %s -mtriple=x86_64-unknown-unknown                  -mattr=-avx | FileCheck %s --check-prefix=SSE
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=corei7-avx -mattr=-avx | FileCheck %s --check-prefix=SSE
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=btver2     -mattr=-avx | FileCheck %s --check-prefix=SSE

; No need to load unaligned operand from memory using an explicit instruction with AVX.
; The operand should be folded into the AND instr.

; With SSE, folding memory operands into math/logic ops requires 16-byte alignment
; unless specially configured on some CPUs such as AMD Family 10H.

define <4 x i32> @test1(<4 x i32>* %p0, <4 x i32> %in1) nounwind {
  %in0 = load <4 x i32>* %p0, align 2
  %a = and <4 x i32> %in0, %in1
  ret <4 x i32> %a

; CHECK-LABEL: @test1
; CHECK-NOT:   vmovups
; CHECK:       vandps (%rdi), %xmm0, %xmm0
; CHECK-NEXT:  ret

; SSE-LABEL: @test1
; SSE:       movups (%rdi), %xmm1
; SSE-NEXT:  andps %xmm1, %xmm0
; SSE-NEXT:  ret
}
Improve test to actually check for a folded load. This test was checking for lack of a "movaps" (an aligned load) rather than a "movups" (an unaligned load). It also included a store which complicated the checking. Add specific CPU runs to prevent subtarget feature flag overrides from inhibiting this optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227972 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 15:37:18 +00:00			`; Use CPU parameters to ensure that a CPU-specific attribute is not overriding the AVX definition.`

Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371). r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit. The commit log says that change did not affect anything, but that's not correct. That change allowed SSE instructions to have unaligned mem operands folded into math ops, and that's not allowed in the default specification for any SSE variant. The bug is exposed when compiling for an AVX-capable CPU that had this feature flag but without enabling AVX codegen. Another mistake in r224330 was not adding the feature flag to all AVX CPUs; the AMD chips were excluded. This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ). This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem". Changed the existing test case for the feature bit to reflect the new name and renamed the test file itself to better reflect the feature. Added runs to fold-vex.ll to check for the failing codegen. Note that the feature bit is not set by default on any CPU because it may require a configuration register setting to enable the enhanced unaligned behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227983 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 17:13:04 +00:00			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx \| FileCheck %s`
			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=corei7-avx \| FileCheck %s`
			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=btver2 \| FileCheck %s`
			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx \| FileCheck %s --check-prefix=SSE`
			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=corei7-avx -mattr=-avx \| FileCheck %s --check-prefix=SSE`
			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=btver2 -mattr=-avx \| FileCheck %s --check-prefix=SSE`
Improve test to actually check for a folded load. This test was checking for lack of a "movaps" (an aligned load) rather than a "movups" (an unaligned load). It also included a store which complicated the checking. Add specific CPU runs to prevent subtarget feature flag overrides from inhibiting this optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227972 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 15:37:18 +00:00
			`; No need to load unaligned operand from memory using an explicit instruction with AVX.`
			`; The operand should be folded into the AND instr.`
Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8 2012-12-24 09:40:33 +00:00
Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371). r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit. The commit log says that change did not affect anything, but that's not correct. That change allowed SSE instructions to have unaligned mem operands folded into math ops, and that's not allowed in the default specification for any SSE variant. The bug is exposed when compiling for an AVX-capable CPU that had this feature flag but without enabling AVX codegen. Another mistake in r224330 was not adding the feature flag to all AVX CPUs; the AMD chips were excluded. This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ). This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem". Changed the existing test case for the feature bit to reflect the new name and renamed the test file itself to better reflect the feature. Added runs to fold-vex.ll to check for the failing codegen. Note that the feature bit is not set by default on any CPU because it may require a configuration register setting to enable the enhanced unaligned behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227983 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 17:13:04 +00:00			`; With SSE, folding memory operands into math/logic ops requires 16-byte alignment`
			`; unless specially configured on some CPUs such as AMD Family 10H.`

Improve test to actually check for a folded load. This test was checking for lack of a "movaps" (an aligned load) rather than a "movups" (an unaligned load). It also included a store which complicated the checking. Add specific CPU runs to prevent subtarget feature flag overrides from inhibiting this optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227972 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 15:37:18 +00:00			`define <4 x i32> @test1(<4 x i32>* %p0, <4 x i32> %in1) nounwind {`
			`%in0 = load <4 x i32>* %p0, align 2`
			`%a = and <4 x i32> %in0, %in1`
			`ret <4 x i32> %a`
Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8 2012-12-24 09:40:33 +00:00
Improve test to actually check for a folded load. This test was checking for lack of a "movaps" (an aligned load) rather than a "movups" (an unaligned load). It also included a store which complicated the checking. Add specific CPU runs to prevent subtarget feature flag overrides from inhibiting this optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227972 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 15:37:18 +00:00			`; CHECK-LABEL: @test1`
			`; CHECK-NOT: vmovups`
			`; CHECK: vandps (%rdi), %xmm0, %xmm0`
			`; CHECK-NEXT: ret`
Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371). r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit. The commit log says that change did not affect anything, but that's not correct. That change allowed SSE instructions to have unaligned mem operands folded into math ops, and that's not allowed in the default specification for any SSE variant. The bug is exposed when compiling for an AVX-capable CPU that had this feature flag but without enabling AVX codegen. Another mistake in r224330 was not adding the feature flag to all AVX CPUs; the AMD chips were excluded. This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ). This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem". Changed the existing test case for the feature bit to reflect the new name and renamed the test file itself to better reflect the feature. Added runs to fold-vex.ll to check for the failing codegen. Note that the feature bit is not set by default on any CPU because it may require a configuration register setting to enable the enhanced unaligned behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227983 91177308-0d34-0410-b5e6-96231b3b80d8 2015-02-03 17:13:04 +00:00
			`; SSE-LABEL: @test1`
			`; SSE: movups (%rdi), %xmm1`
			`; SSE-NEXT: andps %xmm1, %xmm0`
			`; SSE-NEXT: ret`
Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8 2012-12-24 09:40:33 +00:00			`}`