Because the builtin longjmp implementation uses a CTR-based indirect jump, when
the control flow arrives at the builtin setjmp call, the CTR register has
necessarily been clobbered. Correspondingly, this adds CTR to the list of
implicit definitions of the builtin setjmp pseudo instruction.
We don't need to add CTR to the implicit definitions of builtin longjmp
because, even though it does clobber the CTR register, the control flow cannot
return to inside the loop unless there is also a builtin setjmp call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186488 91177308-0d34-0410-b5e6-96231b3b80d8
This builds on some frame-lowering code that has existed since 2005 (r24224)
but was disabled in 2008 (r48188) because it needed base pointer support to
function correctly. This implementation follows the strategy suggested by Dale
Johannesen in r48188 where the following comment was added:
This does not currently work, because the delta between old and new stack
pointers is added to offsets that reference incoming parameters after the
prolog is generated, and the code that does that doesn't handle a variable
delta. You don't want to do that anyway; a better approach is to reserve
another register that retains to the incoming stack pointer, and reference
parameters relative to that.
And now we do exactly that. If we don't need a frame pointer, then we use r31
as a base pointer. If we do need a frame pointer, then we use r30 as a base
pointer. The base pointer retains the value of the stack pointer before it was
decremented in the prologue. We then use the base pointer to resolve all
negative frame indicies. The basic scheme follows that for base pointers in the
X86 backend.
We use a base pointer when we need to dynamically realign the incoming stack
pointer. This currently applies only to static objects (dynamic allocas with
large alignments, and base-pointer support in SjLj lowering will come in future
commits).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186478 91177308-0d34-0410-b5e6-96231b3b80d8
Use PMIN/PMAX for UGE/ULE vector comparions to reduce the number of required
instructions. This trick also works for UGT/ULT, but there is no advantage in
doing so. It wouldn't reduce the number of instructions and it would actually
reduce performance.
Reviewer: Ben
radar:5972691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186432 91177308-0d34-0410-b5e6-96231b3b80d8
When truncating to a format with fewer mantissa bits, APFloat::convert
will perform a right shift of the mantissa by the difference of the
precision of the two formats. Usually, this will result in just the
mantissa bits needed for the target format.
One special situation is if the input number is denormal. In this case,
the right shift may discard significant bits. This is usually not a
problem, since truncating a denormal usually results in zero (underflow)
after normalization anyway, since the result format's exponent range is
usually smaller than the target format's.
However, there is one case where the latter property does not hold:
when truncating from ppc_fp128 to double. In particular, truncating
a ppc_fp128 whose first double of the pair is denormal should result
in just that first double, not zero. The current code however
performs an excessive right shift, resulting in lost result bits.
This is then caught in the APFloat::normalize call performed by
APFloat::convert and causes an assertion failure.
This patch checks for the scenario of truncating a denormal, and
attempts to (possibly partially) replace the initial mantissa
right shift by decrementing the exponent, if doing so will still
result in a valid *target format* exponent.
Index: test/CodeGen/PowerPC/pr16573.ll
===================================================================
--- test/CodeGen/PowerPC/pr16573.ll (revision 0)
+++ test/CodeGen/PowerPC/pr16573.ll (revision 0)
@@ -0,0 +1,11 @@
+; RUN: llc < %s | FileCheck %s
+
+target triple = "powerpc64-unknown-linux-gnu"
+
+define double @test() {
+ %1 = fptrunc ppc_fp128 0xM818F2887B9295809800000000032D000 to double
+ ret double %1
+}
+
+; CHECK: .quad -9111018957755033591
+
Index: lib/Support/APFloat.cpp
===================================================================
--- lib/Support/APFloat.cpp (revision 185817)
+++ lib/Support/APFloat.cpp (working copy)
@@ -1956,6 +1956,23 @@
X86SpecialNan = true;
}
+ // If this is a truncation of a denormal number, and the target semantics
+ // has larger exponent range than the source semantics (this can happen
+ // when truncating from PowerPC double-double to double format), the
+ // right shift could lose result mantissa bits. Adjust exponent instead
+ // of performing excessive shift.
+ if (shift < 0 && isFiniteNonZero()) {
+ int exponentChange = significandMSB() + 1 - fromSemantics.precision;
+ if (exponent + exponentChange < toSemantics.minExponent)
+ exponentChange = toSemantics.minExponent - exponent;
+ if (exponentChange < shift)
+ exponentChange = shift;
+ if (exponentChange < 0) {
+ shift -= exponentChange;
+ exponent += exponentChange;
+ }
+ }
+
// If this is a truncation, perform the shift before we narrow the storage.
if (shift < 0 && (isFiniteNonZero() || category==fcNaN))
lostFraction = shiftRight(significandParts(), oldPartCount, -shift);
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186409 91177308-0d34-0410-b5e6-96231b3b80d8
Previously an asm operand with no operand modifier would give the error
"invalid operand in inline asm".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186407 91177308-0d34-0410-b5e6-96231b3b80d8
Another patch in the series to make more use of R.SBG. This one extends
r186072 and r186073 to handle cases where the AND is inside the shift.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186399 91177308-0d34-0410-b5e6-96231b3b80d8
Intrinsics already existed for the 64-bit variants, so these support operations
of size at most 32-bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186392 91177308-0d34-0410-b5e6-96231b3b80d8
This patch enables calls to __aeabi_idivmod when in EABI mode,
by using the remainder value returned on registers (R1),
enabled by the ARM triple "none-eabi". Note that Darwin and
GNUEABI triples will continue lowering on GNU style, that is,
using the stack for the remainder.
Still need to add SREM/UREM support fix for 64-bit lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186390 91177308-0d34-0410-b5e6-96231b3b80d8
We can have a FrameSetup in one basic block and the matching FrameDestroy
in a different basic block when we have struct byval. In that case, SPAdj
is not zero at beginning of the basic block.
Modify PEI to correctly set SPAdj at beginning of each basic block using
DFS traversal. We used to assume SPAdj is 0 at beginning of each basic block.
PEI had an assert SPAdjCount || SPAdj == 0.
If we have a Destroy <n> followed by a Setup <m>, PEI will assert failure.
We can add an extra condition to make sure the pairs are matched:
The pairs start with a FrameSetup.
But since we are doing a much better job in the verifier, this patch removes
the check in PEI.
PR16393
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186364 91177308-0d34-0410-b5e6-96231b3b80d8
PPCInstrInfo::insertSelect and PPCInstrInfo::canInsertSelect were computing the
common subclass of the true and false inputs, and then selecting either the
32-bit or the 64-bit isel variant based on the result of calling
PPC::GPRCRegClass.hasSubClassEq(RC) and PPC::G8RCRegClass.hasSubClassEq(RC)
(where RC is the common subclass). Unfortunately, this is not quite right: if
we have something like this:
%vreg8<def> = SELECT_CC_I8 %vreg4<kill>, %vreg7<kill>, %vreg6<kill>, 76;
G8RC_and_G8RC_NOX0:%vreg8 CRRC:%vreg4 G8RC_NOX0:%vreg7,%vreg6
then the common subclass of G8RC_and_G8RC_NOX0 and G8RC_NOX0 is G8RC_NOX0, and
G8RC_NOX0 is not a subclass of G8RC (because it also contains the ZERO8
pseudo-register). As a result, we also need to check the common subclass
against GPRC_NOR0 and G8RC_NOX0 explicitly.
This had not been a problem for clients of insertSelect that called
canInsertSelect first (because it had a compensating mistake), but insertSelect
is also used by the PPC pseudo-instruction expander, and this error was causing
a problem in that context.
This problem was found by csmith.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186343 91177308-0d34-0410-b5e6-96231b3b80d8
There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks
which, in part, says:
// Note that these invariants may not hold momentarily when processing a node:
// the node being processed may be put in a map before being marked Processed.
Unfortunately, this assert would be valid only if the above-mentioned invariant
held unconditionally. This was causing llc to assert when, in fact,
everything was fine.
Thanks to Richard Sandiford for investigating this issue!
Fixes PR16562.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186338 91177308-0d34-0410-b5e6-96231b3b80d8
This update was done with the following bash script:
find test/CodeGen -name "*.ll" | \
while read NAME; do
echo "$NAME"
if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
TEMP=`mktemp -t temp`
cp $NAME $TEMP
sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
while read FUNC; do
sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
done
sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
mv $TEMP $NAME
fi
done
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186280 91177308-0d34-0410-b5e6-96231b3b80d8
This was done with the following sed invocation to catch label lines demarking function boundaries:
sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll
which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186258 91177308-0d34-0410-b5e6-96231b3b80d8
ARM paired GPR COPY was being lowered to two MOVr without CC. This
patch puts the CC back.
My test is a reduction of the case where I encountered the issue,
64-bit atomics use paired GPRs.
The issue only occurs with selectionDAG, FastISel doesn't encounter it
so I didn't bother calling it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186226 91177308-0d34-0410-b5e6-96231b3b80d8
I'm guessing the failure had something to do with the double precision
floating point constant used in the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186191 91177308-0d34-0410-b5e6-96231b3b80d8
In particular:
movsbw %al, %ax --> cbtw
movswl %ax, %eax --> cwtl
movslq %eax, %rax --> cltq
According to Intel's manual those have the same performance characteristics but
come with a smaller encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186174 91177308-0d34-0410-b5e6-96231b3b80d8
Normal (sext (setcc ...)) sequences are optimised into
(select_cc ..., -1, 0) by DAGCombiner::visitSIGN_EXTEND.
However, this is deliberately not done for vectors, and after
vector type legalization we have (sext_inreg (setcc ...)) instead.
I wondered about trying to extend DAGCombiner to handle this case too,
but it seemed to be a loss on some other targets I tried, even those for
which SETCC isn't "legal" and SELECT_CC is.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186149 91177308-0d34-0410-b5e6-96231b3b80d8
If the source of these instructions is spilled we should load the destination.
If the destination is spilled we should store the source.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186147 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch adds explicit calling convention types for the Win64 and
System V/x86-64 ABIs. This allows code to override the default, and use
the Win64 convention on a target that wants to use SysV (and
vice-versa). This is needed to implement the `ms_abi` and `sysv_abi` GNU
attributes.
Reviewers:
CC:
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186144 91177308-0d34-0410-b5e6-96231b3b80d8
We had patterns to match v4i32 immAllZerosV -> V_SET0, but not patterns for
v8i16 (which occurs in the test case) or v16i8. The same was true for
V_SETALLONES (so I added the associated patterns for those as well).
Another bug found by llvm-stress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186108 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a bug (found by csmith) at -O0 where we attempt to create a RLWIMI
with an out-of-range operand. Most uses of the isRunOfOnes function are guarded
by a condition that the value is not zero. This was not true in two places, and
in both places a zero input would result in an out-of-rage MB value (= 32).
To fix this, isRunOfOnes returns false on a zero input (and I've remove one
now-redundant guard).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186101 91177308-0d34-0410-b5e6-96231b3b80d8
RISBG can handle some ANDs for which no AND IMMEDIATE exists.
It also acts as a three-operand AND for some cases where an
AND IMMEDIATE could be used instead.
It might be worth adding a pass to replace RISBG with AND IMMEDIATE
in cases where the register operands end up being the same and where
AND IMMEDIATE is smaller.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186072 91177308-0d34-0410-b5e6-96231b3b80d8
When computing currently-live registers, the register scavenger excludes undef
uses. As a result, undef uses are ignored when computing the restore points of
registers spilled into the emergency slots. While the register scavenger
normally excludes from consideration, when scavenging, registers used by the
current instruction, we need to not exclude undef uses. Otherwise, we might end
up requiring more emergency spill slots than we have (in the case where the
undef use *is* the currently-spilled register).
Another bug found by llvm-stress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186067 91177308-0d34-0410-b5e6-96231b3b80d8
I had thought that these tests could be target-neutral, but in practice this is
not the case (on some targets, like Hexagon and Darwin), they trigger an assert
(a different assert than the one that r186044 fixes).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186051 91177308-0d34-0410-b5e6-96231b3b80d8
Enough for the radeonsi driver to use it for calculating derivatives.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186012 91177308-0d34-0410-b5e6-96231b3b80d8