The previous branch analysis did not handle indirect branches correctly. Under
some circumstances, this led to the deletion of basic blocks that were the
destination of indirect branches, leaving indirect branches to nowhere in the code.
This patch replaces, and is more general than, the two previous fixes for
indirect-branch-analysis issues, r181161 and r186461.
For other branches (not indirect) this refactor should have *almost* identical
behavior to the previous version. There are some corner cases where this
refactor is able to analyze blocks that the previous version could not (e.g.
this necessitated the update to thumb2-ifcvt2.ll).
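For reference, a typical source construct that produces indirect branches (an assumed illustration, not one of the affected tests) is GNU C/C++ computed goto:

  #include <cstdio>

  // Dispatch via a label address: "goto *target" becomes an indirect branch
  // (indirectbr in IR), so its destination blocks must never be deleted by
  // branch analysis.
  int dispatch(int op) {
    static void *table[] = { &&op_add, &&op_sub };
    int result = 0;
    void *target = table[op & 1];
    goto *target;              // indirect branch
  op_add:
    result = 1;
    goto done;
  op_sub:
    result = -1;
  done:
    std::printf("%d\n", result);
    return result;
  }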
<rdar://problem/14464830>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186735 91177308-0d34-0410-b5e6-96231b3b80d8
All changes were made by the following bash script:
find test/CodeGen -name "*.ll" | \
while read NAME; do
  echo "$NAME"
  grep -q "^; *RUN: *llc.*debug" $NAME && continue
  grep -q "^; *RUN:.*llvm-objdump" $NAME && continue
  grep -q "^; *RUN: *opt.*" $NAME && continue
  TEMP=`mktemp -t temp`
  cp $NAME $TEMP
  sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
  while read FUNC; do
    sed -i '' "s/;\([A-Za-z0-9_-]*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC[:]* *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
  done
  sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
  sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
  sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
  sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
  mv $TEMP $NAME
done
This script catches a superset of the cases caught by the script associated with commit r186280. It initially found some false positives due to unusual constructs in a minority of tests; all such cases were disambiguated first in commit r186621.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186624 91177308-0d34-0410-b5e6-96231b3b80d8
Intrinsics already existed for the 64-bit variants, so these support operations
of at most 32 bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186392 91177308-0d34-0410-b5e6-96231b3b80d8
This patch enables calls to __aeabi_idivmod when in EABI mode,
by using the remainder value returned in a register (R1),
enabled by the ARM triple "none-eabi". Note that the Darwin and
GNUEABI triples will continue to lower in the GNU style, that is,
using the stack for the remainder.
Support for 64-bit SREM/UREM lowering still needs to be added.
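As a rough illustration (an assumed example, not the patch's test case), both results of the division below can come from a single __aeabi_idivmod call when targeting a bare EABI triple such as arm-none-eabi, with the quotient in R0 and the remainder in R1:

  // Assumed example: in EABI mode the division and the remainder can be
  // folded into one __aeabi_idivmod libcall (quotient in R0, remainder in R1).
  int quot_rem(int a, int b, int *rem) {
    *rem = a % b;   // remainder half of the libcall result
    return a / b;   // quotient half of the libcall result
  }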
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186390 91177308-0d34-0410-b5e6-96231b3b80d8
We can have a FrameSetup in one basic block and the matching FrameDestroy
in a different basic block when we have struct byval. In that case, SPAdj
is not zero at the beginning of the basic block.
Modify PEI to correctly set SPAdj at the beginning of each basic block using
a DFS traversal. We used to assume SPAdj was 0 at the beginning of each basic block.
PEI had an assert: SPAdjCount || SPAdj == 0.
If we have a Destroy <n> followed by a Setup <m>, that assert fails.
We could add an extra condition to make sure the pairs are matched (each pair
starts with a FrameSetup), but since the verifier now does a much better job,
this patch removes the check from PEI.
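A conceptual sketch of the traversal (hypothetical types and names, not the actual PEI code): each block's incoming SP adjustment is its predecessor's incoming value plus that predecessor's net FrameSetup/FrameDestroy effect, propagated with a DFS from the entry block instead of being assumed to be zero.

  #include <stack>
  #include <vector>

  struct Block {
    int SPDelta = 0;           // net effect of FrameSetup/FrameDestroy in this block
    std::vector<int> Succs;    // indices of successor blocks
  };

  // Returns the SP adjustment on entry to every block; block 0 is the entry.
  std::vector<int> incomingSPAdj(const std::vector<Block> &CFG) {
    std::vector<int> InAdj(CFG.size(), 0);
    std::vector<bool> Visited(CFG.size(), false);
    std::stack<int> Work;
    Work.push(0);
    Visited[0] = true;         // the entry block starts with SPAdj == 0
    while (!Work.empty()) {
      int B = Work.top();
      Work.pop();
      int Out = InAdj[B] + CFG[B].SPDelta;
      for (int S : CFG[B].Succs) {
        if (!Visited[S]) {
          Visited[S] = true;
          InAdj[S] = Out;      // a successor inherits the outgoing adjustment
          Work.push(S);
        }
      }
    }
    return InAdj;
  }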
PR16393
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186364 91177308-0d34-0410-b5e6-96231b3b80d8
This update was done with the following bash script:
find test/CodeGen -name "*.ll" | \
while read NAME; do
  echo "$NAME"
  if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
    TEMP=`mktemp -t temp`
    cp $NAME $TEMP
    sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
    while read FUNC; do
      sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
    done
    sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
    sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
    sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
    sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
    mv $TEMP $NAME
  fi
done
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186280 91177308-0d34-0410-b5e6-96231b3b80d8
This was done with the following sed invocation to catch label lines demarcating function boundaries:
sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll
which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186258 91177308-0d34-0410-b5e6-96231b3b80d8
ARM paired GPR COPY was being lowered to two MOVr without CC. This
patch puts the CC back.
My test is a reduction of the case where I encountered the issue:
64-bit atomics use paired GPRs.
The issue only occurs with SelectionDAG; FastISel doesn't encounter it,
so the test doesn't exercise it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186226 91177308-0d34-0410-b5e6-96231b3b80d8
This prevents the emission of DAG-generated vreg definitions after a
tail call by dropping them entirely (on the grounds that nothing could
use them anyway, and they interfere with O0 CodeGen).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185754 91177308-0d34-0410-b5e6-96231b3b80d8
A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them
in the bottom half of "x". An arithmetic and logic shift are only equivalent in
this context if the shift amount is 16. We would be shifting in ones into the
bottom 16bits instead of zeros if "y" is negative.
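A quick sanity check of that reasoning (plain C++, not compiler code): for a negative y, the bottom 16 bits of an arithmetic and a logical right shift agree at a shift of 16 but diverge for larger shifts, because the arithmetic shift drags sign bits into the bottom half.

  #include <cstdint>
  #include <cstdio>

  int main() {
    int32_t y = -0x12345678;
    for (int sh : {16, 24}) {
      uint16_t asr = static_cast<uint16_t>(y >> sh);                         // arithmetic shift
      uint16_t lsr = static_cast<uint16_t>(static_cast<uint32_t>(y) >> sh);  // logical shift
      std::printf("shift %d: asr low16 = 0x%04x, lsr low16 = 0x%04x\n", sh, asr, lsr);
    }
    return 0;
  }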
radar://14338767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185712 91177308-0d34-0410-b5e6-96231b3b80d8
In the SelectionDAG, immediate operands to inline asm are constructed as
two separate operands. The first is a constant of value InlineAsm::Kind_Imm
and the second is a constant with the value of the immediate.
In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm, we
should skip over the next operand too.
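A schematic walk over such an operand list (hypothetical encoding and names, not the actual SelectInlineAsm code): when the kind word says the group is an immediate, the constant sitting in the next slot belongs to that group, so the cursor must step over it as well.

  #include <cstddef>
  #include <cstdio>
  #include <vector>

  enum OperandKind { Kind_RegUse, Kind_Imm };

  // Each group starts with a kind word; an immediate group carries its value
  // in the following word. Assumes a well-formed list.
  void walkInlineAsmOperands(const std::vector<int> &Ops) {
    for (std::size_t i = 0; i < Ops.size(); ++i) {
      if (Ops[i] == Kind_Imm) {
        std::printf("immediate operand, value %d\n", Ops[i + 1]);
        ++i;                                    // skip the value word too
      } else {
        std::printf("register operand group\n");
      }
    }
  }

  int main() {
    walkInlineAsmOperands({Kind_RegUse, Kind_Imm, 42, Kind_RegUse});
    return 0;
  }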
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185688 91177308-0d34-0410-b5e6-96231b3b80d8
In the ARM back-end, build_vector nodes are lowered to a target-specific
build_vector that uses a floating-point type.
This works well unless the inserted bitcasts survive until instruction
selection. In that case, they incur moves between the integer unit and the
floating-point unit that may result in inefficient code.
In other words, this conversion may introduce artificial dependencies when the
code leading to the build_vector cannot be completed with a floating-point type.
In particular, this happens when loads are not aligned.
Before this patch, in that case, the compiler generated general-purpose loads
and created the floating-point vector from them, instead of using the vector
unit directly.
The patch uses a vector-friendly sequence of code when the inserted bitcasts to
floating point survive DAGCombine.
This is done by a target-specific DAGCombine that changes the target-specific
build_vector into a sequence of insert_vector_elt operations, which gets rid of the bitcasts.
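A source-level illustration of the kind of code involved (an assumed example, not the patch's test case): gathering integer lanes from a buffer that may be unaligned, where the lane inserts feed a build_vector.

  #include <cstdint>
  #include <cstring>

  typedef int32_t v4i32 __attribute__((vector_size(16)));

  // Four lanes loaded from a possibly unaligned buffer; the lane writes feed a
  // build_vector, which the ARM backend lowers through its floating-point
  // flavored target node.
  v4i32 gather4(const unsigned char *p) {
    v4i32 v = {0, 0, 0, 0};
    for (int i = 0; i < 4; ++i) {
      int32_t lane;
      std::memcpy(&lane, p + 4 * i, sizeof lane);   // unaligned element load
      v[i] = lane;
    }
    return v;
  }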
<rdar://problem/14170854>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185587 91177308-0d34-0410-b5e6-96231b3b80d8
Swift cores implement store barriers that are stronger than the ARM
specification but weaker than general barriers. They are, in fact, just about
enough to provide the ordering needed for atomic operations with release
semantics.
This patch makes use of that quirk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185527 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out I'd misread the architecture reference manual and thought
that was a load/store-store barrier, when it's not.
Thanks for pointing it out, Eli!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185356 91177308-0d34-0410-b5e6-96231b3b80d8
I believe the full "dmb ish" barrier is not required to guarantee release
semantics for atomic operations. The weaker "dmb ishst" prevents previous
operations from being reordered with a store executed afterwards, which is enough.
A key point to note (fortunately already correct) is that this barrier alone is
*insufficient* for sequential consistency, no matter how liberally placed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185339 91177308-0d34-0410-b5e6-96231b3b80d8
should expand ATOMIC_CMP_SWAP nodes the same way that it does for ATOMIC_SWAP.
Since ATOMIC_LOADs on some targets (e.g. older ARM variants) get legalized to
ATOMIC_CMP_SWAPs, the missing case had been causing i64 atomic loads to crash
during isel.
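A reproducer-shaped example (assumed, not the original test): a 64-bit atomic load, which on older ARM variants is legalized to a compare-and-swap and therefore hit the missing ATOMIC_CMP_SWAP case.

  #include <atomic>
  #include <cstdint>

  std::atomic<uint64_t> counter{0};

  // On targets without a native 64-bit atomic load, this load is implemented
  // via a compare-and-swap loop, the path that used to crash during isel.
  uint64_t readCounter() {
    return counter.load();
  }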
<rdar://problem/14074644>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185186 91177308-0d34-0410-b5e6-96231b3b80d8
This patch assigns paired GPRs for inline asm with
64-bit data on ARM. It's enabled for both ARM and Thumb to support modifiers
like %H, %Q, %R.
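A usage-style sketch (assumed example, relying on the GNU inline-asm modifiers named above): on little-endian ARM, %Q picks the low register of the pair holding a 64-bit operand and %R the high one.

  #include <cstdint>

  // Copy the two halves of a 64-bit value out of its GPR pair.
  void splitHalves(uint64_t v, uint32_t *lo, uint32_t *hi) {
    uint32_t l, h;
    asm("mov %0, %Q2\n\t"
        "mov %1, %R2"
        : "=&r"(l), "=&r"(h)
        : "r"(v));
    *lo = l;
    *hi = h;
  }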
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185169 91177308-0d34-0410-b5e6-96231b3b80d8
We were generating intrinsics for NEON fixed-point conversions that didn't
exist (e.g. float -> i16). There are two cases to consider:
+ iN is smaller than float. In this case we can do the conversion but need an
extend or truncate as well.
+ iN is larger than float. In this case using the NEON conversion would be
incorrect so we don't perform any combining.
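A source-level sketch of the first case (an assumed scalar example just to show the shape; the combine itself runs on the NEON conversion nodes): a float scaled by a power of two and converted to an integer narrower than float needs the conversion plus a narrowing step.

  #include <cstdint>

  // Six fractional bits: the multiply by 2^6 followed by a float-to-int
  // conversion is the shape that maps onto a fixed-point convert; the extra
  // narrowing to i16 is the "iN smaller than float" case above.
  int16_t toFixedQ6(float x) {
    return static_cast<int16_t>(x * 64.0f);
  }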
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185158 91177308-0d34-0410-b5e6-96231b3b80d8
No functionality change.
It should suffice to check the type of a debug info metadata node instead of
calling Verify. For cases where we know the type of a DI metadata node, use
an assert.
Also update test cases to make them conform to the format of the DI classes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185135 91177308-0d34-0410-b5e6-96231b3b80d8
A FastISel optimization was causing us to emit no information for such
parameters & when they go missing we end up emitting a different
function type. By avoiding that shortcut we not only get types correct
(very important) but also location information (handy) - even if it's
only live at the start of a function & may be clobbered later.
Reviewed/discussion by Evan Cheng & Dan Gohman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184604 91177308-0d34-0410-b5e6-96231b3b80d8
it at the moment.
This allows more paired loads to be formed even when the stack coloring pass
destroys the memory operand's value.
<rdar://problem/13978317>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184492 91177308-0d34-0410-b5e6-96231b3b80d8
value is zero.
This allows optimizations to kick in more easily.
Fix some test cases so that they remain meaningful (i.e., not completely dead
code) when optimizations apply.
<rdar://problem/14096009> superfluous multiply by high part of zero-extended
value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184222 91177308-0d34-0410-b5e6-96231b3b80d8
The main advantage here is much better heuristics, which take into account not
just loop depth but also __builtin_expect and other static hints, and will
eventually learn how to use profile info. Most of the work in this patch is
pushing the MachineBlockFrequencyInfo analysis into the right places.
This is good for a 5% speedup on zlib's deflate (x86_64); there were some very
unfortunate spilling decisions in its hottest loop, longest_match(). Other
benchmarks I tried were mostly neutral.
This changes register allocation in subtle ways; the tests are updated for it.
2012-02-20-MachineCPBug.ll was deleted as it's very fragile and the instruction
it looked for was gone already (but the FileCheck pattern picked up unrelated
stuff).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184105 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than using the full power of target-specific addressing modes in
DBG_VALUEs with Frame Indices, simply use Frame Index + Offset. This
reduces the complexity of debug info handling down to two
representations of values (reg+offset and frame index+offset) rather
than three or four.
Ideally we could ensure that frame indices had been eliminated by the
time we reach assembly or DWARF generation, but I haven't spent the
time to figure out where the FIs are leaking through into that & whether
there's a good place to convert them. Some FI+offset=>reg+offset
conversion is done (see PrologEpilogInserter, for example) which is
necessary for some SelectionDAG assumptions about registers, I believe,
but it might be possible to make this a more thorough conversion &
ensure there are no remaining FIs no matter how instruction selection
is performed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184066 91177308-0d34-0410-b5e6-96231b3b80d8
in functions which call __builtin_unwind_init()
__builtin_unwind_init() is an undocumented gcc intrinsic which has this effect,
and is used in libgcc_eh.
Goes part of the way toward fixing PR8541.
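For reference, a minimal use of the builtin (behavior as described above; the builtin itself is undocumented):

  // Calling __builtin_unwind_init forces the callee-saved registers to be
  // spilled in this function's frame, which libgcc_eh relies on.
  void captureContext() {
    __builtin_unwind_init();
  }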
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183984 91177308-0d34-0410-b5e6-96231b3b80d8
This is a resubmit of r182877, which was reverted because it broke
MCJIT tests on ARM. The patch leaves MCJIT on ARM as it was before: only
enabled for iOS. I've CC'ed people from the original review and revert.
FastISel was previously only enabled for iOS ARM and Thumb2; this patch enables
it for ARM (not Thumb2) on Linux and NaCl, but not for MCJIT.
Thumb2 support needs a bit more work, mainly around register class
restrictions.
The patch punts to SelectionDAG when doing TLS relocation on non-Darwin
targets. I will fix this and other FastISel-to-SelectionDAG failures in
a separate patch.
The patch also forces FastISel to retain frame pointers: iOS always
keeps them for backtracking (so emitted code won't change because of
this), but Linux was getting much worse code that was incorrect when
using big frames (such as test-suite's lencod). I'll also fix this in a
later patch; it will probably require a peephole so that FastISel
doesn't rematerialize frame pointers back-to-back.
The test changes are straightforward, similar to:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html
They also add a vararg test that got dropped in that change.
I ran all of lnt test-suite on A15 hardware with --optimize-option=-O0
and all the tests pass. All the tests also pass on x86 with make check-all. I
also re-ran the check-all tests that failed on ARM, and they all seem to
pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183966 91177308-0d34-0410-b5e6-96231b3b80d8
Since we have an ARM unwind directive parser and assembler, we
can check correctness in two stages:
1. From LLVM assembly (.ll) to ARM assembly (.s)
2. From ARM assembly (.s) to ELF object file (.o)
We already have several "*.s to *.o" test cases. This CL adds
some "*.ll to *.s" test cases and removes the redundant "*.ll to *.o"
test cases.
New test cases to check "*.ll to *.s" code generator:
- ehabi.ll: Check the correctness of the generated unwind directives.
- section-name.ll: Check the section name of functions.
Removed test cases:
- ehabi-mc-cantunwind.ll
(Covered by ehabi-cantunwind.ll, and eh-directive-cantunwind.s)
- ehabi-mc-compact-pr0.ll
(Covered by ehabi.ll, eh-compact-pr0.s, eh-directive-save.s, and
eh-directive-setfp.s)
- ehabi-mc-compact-pr1.ll
(Covered by ehabi.ll, eh-compact-pr1.s, eh-directive-save.s, and
eh-directive-setfp.s)
- ehabi-mc.ll
(Covered by ehabi.ll, and eh-directive-integrated-test.s)
- ehabi-mc-section-group.ll
(Covered by section-name.ll, and eh-directive-section-comdat.s)
- ehabi-mc-section.ll
(Covered by section-name.ll, and eh-directive-section.s)
- ehabi-mc-sh_link.ll
(Covered by eh-directive-text-section.s, and eh-directive-section.s)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183628 91177308-0d34-0410-b5e6-96231b3b80d8
Changes to the ARM unwind opcode assembler:
* Fix the handling of multiple .save or .vsave directives; in addition,
  their order is now preserved.
* For directives which generate multiple opcodes, such as
  ".save {r0-r11}", the order of the unwind opcodes is now fixed,
  i.e. the registers with lower encoding values are popped first.
* Fix the $sp offset calculation. Now, we can use the
  .setfp, .pad, .save, and .vsave directives in any order.
Changes to test cases:
* Add test cases to check the order of multiple opcodes
  for the .save directive.
* Fix the incorrect $sp offset in the test cases; the stack pointer
  offset specified there was wrong. (Changed test cases:
  ehabi-mc-section.ll and ehabi-mc.ll)
* The opcodes to restore $sp are slightly reordered. The
  behavior is not changed, and the new output is the same
  as the output of GNU as. (Changed test cases:
  eh-directive-pad.s and eh-directive-setfp.s)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183627 91177308-0d34-0410-b5e6-96231b3b80d8
instantiation issue with non-standard type.
Add a backend option to warn when a given stack size limit is exceeded.
Option: -mllvm -warn-stack-size=<limit>
Output (if limit is exceeded):
warning: Stack size limit exceeded (<actual size>) in <functionName>.
The longer term plan is to hook that to a clang warning.
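A trivial trigger (assumed example): with -mllvm -warn-stack-size=512, a frame like the one below exceeds the limit and produces the warning shown above.

  // The volatile array keeps the 4 KiB buffer on the stack so the frame size
  // exceeds the 512-byte limit.
  int bigFrame() {
    volatile char buf[4096];
    buf[0] = 1;
    return buf[0];
  }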
PR:4072
<rdar://problem/13987214>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183595 91177308-0d34-0410-b5e6-96231b3b80d8
Option: -mllvm -warn-stack-size=<limit>
Output (if limit is exceeded):
warning: Stack size limit exceeded (<actual size>) in <functionName>.
The longer term plan is to hook that to a clang warning.
PR:4072
<rdar://problem/13987214>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183552 91177308-0d34-0410-b5e6-96231b3b80d8
My recent ARM FastISel patch exposed this bug:
http://llvm.org/bugs/show_bug.cgi?id=16178
The root cause is that it can't select integer sext/zext pre-ARMv6 and
asserts out.
The current integer sext/zext code doesn't handle other cases gracefully
either, so this patch makes it handle all sext and zext from i1/i8/i16
to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This
should fix the bug as well as make FastISel faster because it bails to
SelectionDAG less often. See fastisel-ext.patch for this.
fastisel-ext-tests.patch changes the current tests to always use reg-imm AND
for 8-bit zext instead of UXTB. This simplifies the code, since the AND form is
supported on ARMv4T and later, and at least on A15 both should perform
exactly the same (both have exec 1 uop 1, type I).
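A tiny illustration of the extensions involved (assumed example, not one of the patch's tests); the i8 zero-extension is the one that can be emitted either as a reg-imm AND with 255 or as uxtb:

  #include <cstdint>

  uint32_t zext8(uint8_t x)  { return x; }   // i8 -> i32 zero extension (and #255 / uxtb)
  int32_t  sext16(int16_t x) { return x; }   // i16 -> i32 sign extension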
2013-05-31-char-shift-crash.ll is a bitcode version of the above bug
16178 repro.
fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel
should now handle.
Note that my ARM FastISel enabling patch was reverted due to a separate
failure when dealing with MCJIT; I'll fix this second failure and then
turn FastISel on again for non-iOS ARM targets.
I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15
hardware.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183551 91177308-0d34-0410-b5e6-96231b3b80d8