Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result. The details are quite complex and hard to
determine statically, since branches in the code may exist in some
circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.
The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions and the
simplest way to do this is to insert a NOP.
This patch implements such work-around in the backend, enabled via the option
-aarch64-fix-cortex-a53-835769.
The work-around code generation is not enabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219603 91177308-0d34-0410-b5e6-96231b3b80d8
has been modular since r206822, and excluding it was leading to workarounds
such as the one in r219592, which this change removes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219593 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.
This mainly helps the stack inliner better fold reloads of 3 (or more) operand instructions (VEX encoded SSE etc.) but by performing this in the lowest foldMemoryOperandImpl implementation it also replaces the X86InstrInfo::optimizeLoadInstr version and is now used by FastISel too.
Differential Revision: http://reviews.llvm.org/D5701
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219584 91177308-0d34-0410-b5e6-96231b3b80d8
A helper routine, MultiplyOverflows, was a less efficient
reimplementation of APInt's smul_ov and umul_ov. While we are here,
clean up the code so it's more uniform.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219583 91177308-0d34-0410-b5e6-96231b3b80d8
On x86_64 this brings it from 80 bytes to 64 bytes. Also make any member
variables private and clean up uses to go through the existing accessors.
NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219573 91177308-0d34-0410-b5e6-96231b3b80d8
Consider the case where X is 2. (2 <<s 31)/s-2147483648 is zero but we
would fold to X. Note that this is valid when we are in the unsigned
domain because we require NUW: 2 <<u 31 results in poison.
This fixes PR21245.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219568 91177308-0d34-0410-b5e6-96231b3b80d8
consider:
C1 = INT_MIN
C2 = -1
C1 * C2 overflows without a doubt but consider the following:
%x = i32 INT_MIN
This means that (%X /s C1) is 1 and (%X /s C1) /s C2 is -1.
N. B. Move the unsigned version of this transform to InstSimplify, it
doesn't create any new instructions.
This fixes PR21243.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219567 91177308-0d34-0410-b5e6-96231b3b80d8
consider:
mul i32 nsw %x, -2147483648
this instruction will not result in poison if %x is 1
however, if we transform this into:
shl i32 nsw %x, 31
then we will be generating poison because we just shifted into the sign
bit.
This fixes PR21242.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219566 91177308-0d34-0410-b5e6-96231b3b80d8
getSmallConstantTripCount even when it isn't the exiting block.
I missed this in my first audit, very sorry. This was found in LNT and
elsewhere. I don't have a test case, but it was completely obvious from
inspection that this was the problem. I'll see if I can reduce a test
case, but I'm not really hopeful, and the value seems quite low.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219562 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Implement the most basic form of conditional branches in Mips fast-isel.
Test Plan:
br1.ll
run 4 flavors of test-suite. mips32 r1/r2 and at -O0/O2
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D5583
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219556 91177308-0d34-0410-b5e6-96231b3b80d8
routines and fix all of the bugs they expose.
I hit a test case that crashed even without these asserts due to passing
a non-exiting latch to the ExitingBlock parameter of the trip count
computation machinery. However, when I add the nice asserts, it turns
out we have plenty of coverage of these bugs, they just didn't manifest
in crashers.
The core problem seems to stem from an assumption that the latch *is*
the exiting block. While this is often true, and somewhat the "normal"
way to think about loops, it isn't necessarily true. The correct way to
call the trip count routines in a *generic* fashion (that is, without
a particular exit in mind) is to just use the loop's single exiting
block if it has one. The trip count can't be computed generically unless
it does. This works great for the loop vectorizer. The loop unroller
actually *wants* to select the latch when it has to chose between
multiple exits because for unrolling it is the latch trips that matter.
But if this is the desire, it needs to explicitly guard for non-exiting
latches and check for the generic trip count in that case.
I've added the asserts, and added convenience APIs for querying the trip
count generically that check for a single exit block. I've kept the APIs
consistent between computing trip count and trip multiples.
Thansk to Mark for the help debugging and tracking down the *right* fix
here!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219550 91177308-0d34-0410-b5e6-96231b3b80d8
The LLVM Lang Ref states for signed/unsigned int to float conversions:
"If the value cannot fit in the floating point value, the results are undefined."
And for FP to signed/unsigned int:
"If the value cannot fit in ty2, the results are undefined."
This matches the C definitions.
The existing behavior pins to infinity or a max int value, but that may just
lead to more confusion as seen in:
http://llvm.org/bugs/show_bug.cgi?id=21130
Returning undef will hopefully lead to a less silent failure.
Differential Revision: http://reviews.llvm.org/D5603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219542 91177308-0d34-0410-b5e6-96231b3b80d8
1) Explicitly provide important arguments to llvm-symbolizer,
not relying on defaults.
2) Be more defensive about symbolizer output.
This might fix weird failures on ninja-x64-msvc-RA-centos6 buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219541 91177308-0d34-0410-b5e6-96231b3b80d8
In fact, symbolization is now expected to work only on Linux and
FreeBSD/NetBSD, where we have dl_iterate_phdr and can learn the
main executable name without argv0 (it will be possible on BSD systems
after http://reviews.llvm.org/D5693 lands). #ifdef-out the code for
all the rest Unix systems.
Reviewed in http://reviews.llvm.org/D5610
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219534 91177308-0d34-0410-b5e6-96231b3b80d8
Currently this only functions to match simple cases
where ds_read2_* / ds_write2_* instructions can be used.
In the future it might match some of the other weird
load patterns, such as direct to LDS loads.
Currently enabled only with a subtarget feature to enable
easier testing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219533 91177308-0d34-0410-b5e6-96231b3b80d8
It also makes it more aggressive in querying range information by
adding a call to isKnownPredicateWithRanges to
isLoopBackedgeGuardedByCond and isLoopEntryGuardedByCond.
phabricator: http://reviews.llvm.org/D5638
Reviewed by: atrick, hfinkel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219532 91177308-0d34-0410-b5e6-96231b3b80d8
is over a subset of condition codes.
This fixes the -Werror build which warns about use of uninitialized
variables in the default case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219531 91177308-0d34-0410-b5e6-96231b3b80d8
This invariant is violated (& the assertions fire) on some Objective C++
in the test-suite. Reverting while I investigate.
This reverts commit r219215.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219523 91177308-0d34-0410-b5e6-96231b3b80d8
I was quiet surprised to find this feature being used. Fortunately the uses
I found look fairly simple. In fact, they are just a very verbose version
of the regular ar commands.
Start implementing it then by parsing the script and setting the command
variables as if we had a regular command line.
This patch adds just enough support to create an empty archive and do a bit
of error checking. In followup patches I will implement at least addmod
and addlib.
From the description in the manual, even the more general case should not
be too hard to implement if needed. The features that don't map 1:1 to
the simple command line are
* Reading from multiple archives.
* Creating multiple archives.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219521 91177308-0d34-0410-b5e6-96231b3b80d8
ScalarEvolution in the presence of multiple exits. Previously all
loops exits had to have identical counts for a loop trip count to be
considered computable. This pessimization was implemented by calling
getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock)
inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME
in the comments of that function). The pessimization was added to fix
a corner case involving undefined behavior (pr/16130). This patch more
precisely handles the undefined behavior case allowing the pessimization
to be removed.
ControlsExit replaces IsSubExpr to more precisely track the case where
undefined behavior is expected to occur. Because undefined behavior is
tracked more precisely we can remove MustExit from ExitLimit. MustExit
was used to track the case where the limit was computed potentially
assuming undefined behavior even if undefined behavior didn't necessarily
occur.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219517 91177308-0d34-0410-b5e6-96231b3b80d8