mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-01-19 04:32:19 +00:00
6a8c7bf8e7
on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post-RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc().

It adds an assertion in the Post-RA scheduler to make sure that post-RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post-RA scheduler.

Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155395 91177308-0d34-0410-b5e6-96231b3b80d8
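The failure mode described in the commit message can be illustrated with a small, self-contained C++ model. This is only a sketch and not LLVM code: `ToyRegisterInfo`, `ToyBlock`, `makeMergedTail`, `mayRename`, and `breakAntiDependency` are hypothetical names invented for illustration, and the register name `eax` is just an example of a return-value register. Only the hook names `requiresRegisterScavenging()` and `trackLivenessAfterRegAlloc()` come from the message itself.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a target's register-info hooks (not LLVM's class).
struct ToyRegisterInfo {
  bool postRASchedulingEnabled = false;

  // Older hook: does the target need a register scavenger at all?
  bool requiresRegisterScavenging() const { return false; }

  // Hook named in the commit message: must live-in sets stay accurate
  // after register allocation? In this toy model it is tied to post-RA
  // scheduling being enabled.
  bool trackLivenessAfterRegAlloc() const { return postRASchedulingEnabled; }
};

// Hypothetical basic block that records only its live-in registers.
struct ToyBlock {
  std::set<std::string> liveIns;
};

// Models tail merging: a new block replaces the original tail. Live-ins
// are carried over only when the target asks for liveness tracking.
ToyBlock makeMergedTail(const ToyBlock &original, const ToyRegisterInfo &tri) {
  ToyBlock tail;
  if (tri.trackLivenessAfterRegAlloc())
    tail.liveIns = original.liveIns;
  return tail;
}

// Models the anti-dependency breaker's decision: a register may be renamed
// only if the successor block does not read it on entry.
bool mayRename(const std::string &reg, const ToyBlock &successor) {
  return successor.liveIns.count(reg) == 0;
}

// Models the added scheduler assertion: refuse to run without liveness data.
bool breakAntiDependency(const std::string &reg, const ToyBlock &successor,
                         const ToyRegisterInfo &tri) {
  assert(tri.trackLivenessAfterRegAlloc() &&
         "post-RA liveness information is required");
  return mayRename(reg, successor);
}
```

In this model, a target that does not track liveness produces a merged tail with an empty live-in set, so `mayRename("eax", tail)` wrongly reports that the return-value register is free to rename. With tracking enabled, the live-in survives tail merging and the rename is rejected.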
37 lines
1.2 KiB
LLVM
; Without list-burr scheduling we may not see the difference in codegen here.
; Use a subtarget that has post-RA scheduling enabled because the anti-dependency
; breaker requires liveness information to be kept.
; RUN: llc < %s -march=x86-64 -mcpu=atom -post-RA-scheduler -pre-RA-sched=list-burr -break-anti-dependencies=none > %t
; RUN: grep {%xmm0} %t | count 14
; RUN: not grep {%xmm1} %t
; RUN: llc < %s -march=x86-64 -mcpu=atom -post-RA-scheduler -break-anti-dependencies=critical > %t
; RUN: grep {%xmm0} %t | count 7
; RUN: grep {%xmm1} %t | count 7

define void @goo(double* %r, double* %p, double* %q) nounwind {
entry:
  %0 = load double* %p, align 8
  %1 = fadd double %0, 1.100000e+00
  %2 = fmul double %1, 1.200000e+00
  %3 = fadd double %2, 1.300000e+00
  %4 = fmul double %3, 1.400000e+00
  %5 = fadd double %4, 1.500000e+00
  %6 = fptosi double %5 to i32
  %7 = load double* %r, align 8
  %8 = fadd double %7, 7.100000e+00
  %9 = fmul double %8, 7.200000e+00
  %10 = fadd double %9, 7.300000e+00
  %11 = fmul double %10, 7.400000e+00
  %12 = fadd double %11, 7.500000e+00
  %13 = fptosi double %12 to i32
  %14 = icmp slt i32 %6, %13
  br i1 %14, label %bb, label %return

bb:
  store double 9.300000e+00, double* %q, align 8
  ret void

return:
  ret void
}