This will extend the ranges of debug info variables in registers until they are
clobbered.
Fix 1: Don't mistake DBG_VALUE instructions referring to incoming arguments on
the stack with DBG_VALUE instructions referring to variables in the frame
pointer. This fixes the gdb test-suite failure.
Fix 2: Don't trace through copies to physical registers setting up call
arguments. These registers are call clobbered, and the source register is more
likely to be a callee-saved register that can be extended through the call
instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128114 91177308-0d34-0410-b5e6-96231b3b80d8
VFP Load/Store Multiple Instructions used to embed the IA/DB addressing mode within the
MC instruction; that has been changed so that now, for example, VSTMDDB_UPD and VSTMDIA_UPD
are two instructions. Update the ARMDisassemblerCore.cpp's DisassembleVFPLdStMulFrm()
to reflect the change.
Also add a test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128103 91177308-0d34-0410-b5e6-96231b3b80d8
Temporarily reverting these to see if we can get llvm-objdump to link. Hopefully this is not the problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128097 91177308-0d34-0410-b5e6-96231b3b80d8
These ranges get completely jumbled by the post-ra scheduler, and it is not
really reasonable to expect it to make sense of them.
Instead, teach DwarfDebug to notice when user variables in registers are
clobbered, and terminate the ranges there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128045 91177308-0d34-0410-b5e6-96231b3b80d8
gun as does. This makes it a lot easier to compare the output of both
as the addresses are now a lot closer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127972 91177308-0d34-0410-b5e6-96231b3b80d8
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
switch(x) {
case 1: return f1();
case 2: return f2();
case 3: return f3();
case 4: return f4();
case 5: return f5();
case 6: return f6();
}
}
=>
LBB0_2: ## %sw.bb
callq _f1
popq %rbp
ret
LBB0_3: ## %sw.bb1
callq _f2
popq %rbp
ret
LBB0_4: ## %sw.bb3
callq _f3
popq %rbp
ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
br label %return
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
br label %return
return:
%retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
ret i32 %retval.0
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
jmp _f3 ## TAILCALL
rdar://9147433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127951 91177308-0d34-0410-b5e6-96231b3b80d8
The relevant instruction table entries were changed sometime ago to no longer take
<Rt2> as an operand. Modify ARMDisassemblerCore.cpp to accomodate the change and
add a test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127935 91177308-0d34-0410-b5e6-96231b3b80d8
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
- (most PTX instructions cannot take global variable immediates)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127895 91177308-0d34-0410-b5e6-96231b3b80d8
For example, on 32-bit architecture, don't promote all uses of the IV
to 64-bits just because one use is a 64-bit cast.
Alternate implementation of the patch by Arnaud de Grandmaison.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127884 91177308-0d34-0410-b5e6-96231b3b80d8
comparisons on x86. Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.
This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127852 91177308-0d34-0410-b5e6-96231b3b80d8
o A8.6.195 STR (register) -- Encoding T1
o A8.6.193 STR (immediate, Thumb) -- Encoding T1
It has been changed so that now they use different addressing modes
and thus different MC representation (Operand Infos). Modify the
disassembler to reflect the change, and add relevant tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127833 91177308-0d34-0410-b5e6-96231b3b80d8
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.
This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
plus the test where it used to break.", which broke Clang self-host of a
Debug+Asserts compiler, on OS X.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127763 91177308-0d34-0410-b5e6-96231b3b80d8
conforms to the ABI, but DAGCombine could in theory recognize the sequence of
zext asserts and truncates and generate incorrect code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127754 91177308-0d34-0410-b5e6-96231b3b80d8
chose is having a non-memcpy/memset use and being larger than any native integer
type. Originally I chose having an access of a size smaller than the total size
of the alloca, but this caused some minor issues on the spirit benchmark where
SRoA runs again after some inlining.
This fixes <rdar://problem/8613163>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127718 91177308-0d34-0410-b5e6-96231b3b80d8
1. The ARM Darwin *r9 call instructions were pseudo-ized recently.
Modify the ARMDisassemblerCore.cpp file to accomodate the change.
2. The disassembler was unnecessarily adding 8 to the sign-extended imm24:
imm32 = SignExtend(imm24:'00', 32); // A8.6.23 BL, BLX (immediate)
// Encoding A1
It has no business doing such. Removed the offending logic.
Add test cases to arm-tests.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127707 91177308-0d34-0410-b5e6-96231b3b80d8
v2 = bitcast v1
...
v3 = bitcast v2
...
= v3
=>
v2 = bitcast v1
...
= v1
if v1 and v3 are of in the same register class.
bitcast between i32 and fp (and others) are often not nops since they
are in different register classes. These bitcast instructions are often
left because they are in different basic blocks and cannot be
eliminated by dag combine.
rdar://9104514
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127668 91177308-0d34-0410-b5e6-96231b3b80d8
VEX prefixes are working for triadic AVX
instructions. This concludes the patch set to
enable AVX support for the X86 disassebler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127647 91177308-0d34-0410-b5e6-96231b3b80d8
register operand was erroneously added. Remove an incorrect assert which triggers the bug.
rdar://problem/9131529
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127642 91177308-0d34-0410-b5e6-96231b3b80d8
Also more cleanly separate the ARM vs. Thumb functionality. Previously, the
encoding would be incorrect for some Thumb instructions (the indirect calls).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127637 91177308-0d34-0410-b5e6-96231b3b80d8
Go ahead and add them on when we might want to use them and let
later passes remove them.
Fixes rdar://9118569
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127518 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.
This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127498 91177308-0d34-0410-b5e6-96231b3b80d8
protector insertion not working correctly with unreachable code. Since that
revision was rolled out, this test doesn't actual fail before this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127497 91177308-0d34-0410-b5e6-96231b3b80d8
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.
This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127459 91177308-0d34-0410-b5e6-96231b3b80d8
corresponding testcases back to the previous versions.
Fixes some performance regressions only seen on 32-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127441 91177308-0d34-0410-b5e6-96231b3b80d8
after it has finished all of its reassociations, because its
habit of unlinking operands and holding them in a datastructure
while working means that it's not easy to determine when an
instruction is really dead until after all its regular work is
done. rdar://9096268.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127424 91177308-0d34-0410-b5e6-96231b3b80d8
This happens a lot in clang-compiled C++ code because it adds overflow checks to operator new[]:
unsigned *foo(unsigned n) { return new unsigned[n]; }
We can optimize away the overflow check on 64 bit targets because (uint64_t)n*4 cannot overflow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127418 91177308-0d34-0410-b5e6-96231b3b80d8
The insufficient encoding information of the combined instruction confuses the decoder wrt
UQADD16. Add extra logic to recover from that.
Fixed an assert reported by Sean Callanan
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127354 91177308-0d34-0410-b5e6-96231b3b80d8
The damage done by physreg coalescing only depends on the number of instructions
the extended physreg live range covers. This fixes PR9438.
The heuristic is still luck-based, and physreg coalescing really should be
disabled completely. We need a register allocator with better hinting support
before that is possible.
Convert a test to FileCheck and force spilling by inserting an extra call. The
previous spilling behavior was dependent on misguided physreg coalescing
decisions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127351 91177308-0d34-0410-b5e6-96231b3b80d8
The test is derived from an old miscompilation of
MultiSource/Benchmarks/VersaBench/8b10b which is run regularly, so we are not
losing coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127350 91177308-0d34-0410-b5e6-96231b3b80d8
When ExactBECount is a constant, use it for MaxBECount.
When MaxBECount cannot be computed, replace it with ExactBECount.
Fixes PR9424.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127342 91177308-0d34-0410-b5e6-96231b3b80d8
gave up when I realized I couldn't come up with a good name for what the
refactored function would be, to describe what it does.
This is PR9343 test12, which is test3 with arguments reordered. Whoops!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127318 91177308-0d34-0410-b5e6-96231b3b80d8
a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the
use of vector intrinsics, especially in NEON when programmers know the layout of
the register file. This enables codegen to eliminate a lot of the subregister
traffic it would otherwise generate.
This commit only enables this for a small number of floating-point cases, but a
lot more integer cases. I assume this is okay for all ports, but I did not do
extensive testing of the quality of code involving i512 vectors and the like. If
there is a use case where this generates worse code than before, let me know and
we can scale it back.
This fixes <rdar://problem/9036264>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127317 91177308-0d34-0410-b5e6-96231b3b80d8
reachable uses, but there still might be uses in dead blocks. Use the
standard solution of replacing all the uses with undef. This is
a rare case because it's very sensitive to phase ordering in SimplifyCFG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127299 91177308-0d34-0410-b5e6-96231b3b80d8
testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.
Performance results:
roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated
john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES
Small compile time impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127208 91177308-0d34-0410-b5e6-96231b3b80d8
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127116 91177308-0d34-0410-b5e6-96231b3b80d8
It broke the llvm-gcc-native-mingw32 buildbot, and we need all of them to be green for the 2.9 branch.
Takumi, please reapply after we branch, preferably with a fix ;-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127107 91177308-0d34-0410-b5e6-96231b3b80d8
possible. This goes into instcombine and instsimplify because instsimplify
doesn't need to check hasOneUse since it returns (almost exclusively) constants.
This fixes PR9343 #4#5 and #8!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127064 91177308-0d34-0410-b5e6-96231b3b80d8