The usual CodeGenPrepare trickery, on a target-specific intrinsic.
Without this, the expansion of atomics will usually have the zext
be hoisted out of the loop, defeating the various patterns we have
to catch this precise case.
Differential Revision: http://reviews.llvm.org/D9930
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238054 91177308-0d34-0410-b5e6-96231b3b80d8
We changed the test to test non-constant values in r238049.
We can also use CHECK-NEXT to be a little stricter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238052 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds a class for processing many recip codegen possibilities.
The TargetRecip class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.
The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.
Differential Revision: http://reviews.llvm.org/D8982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238051 91177308-0d34-0410-b5e6-96231b3b80d8
The problem was that I slipped a change required for shrink-wrapping, namely I
used getFirstTerminator instead of the getLastNonDebugInstr that was here before
the refactoring, whereas the surrounding code is not yet patched for that.
Original message:
[X86] Refactor the prologue emission to prepare for shrink-wrapping.
- Add a late pass to expand pseudo instructions (tail call and EH returns).
Instead of doing it in the prologue emission.
- Factor some static methods in X86FrameLowering to ease code sharing.
NFC.
Related to <rdar://problem/20821487>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238035 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is a 2nd attempt at committing the initial MIR serialization patch.
The first commit (r237708) made the incremental buildbots unstable and was
reverted in r237730. The original commit didn't add a terminating null
character to the LLVM IR source which was passed to LLParser, and this
sometimes caused the test 'llvmIR.mir' to fail with a parsing error because
the LLVM IR source didn't have a null character immediately after the end
and thus LLLexer encountered some garbage characters that ultimately caused
the error.
This commit also includes the other test fixes I committed in
r237712 (llc path fix) and r237723 (remove target triple) which
also got reverted in r237730.
--Original Commit Message--
MIR Serialization: print and parse LLVM IR using MIR format.
This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR
using the MIR format. This pass is then added as a last pass when a
'stop-after' option is used in llc. The new library adds the initial
functionality for parsing of MIR files as well. This commit also
extends the llc tool so that it can recognize and parse MIR input files.
Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames
Differential Revision: http://reviews.llvm.org/D9616
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237954 91177308-0d34-0410-b5e6-96231b3b80d8
My recent patch to add support for ISA 2.07 vector pack/unpack
instructions didn't properly check for availability of the vpkudum
instruction when recognizing it as a special vector shuffle case.
This causes us to leave the vector shuffle in place (rather than
converting it to a vector permute) so that it can be recognized later
as a vpkudum, but that pattern is invalid for processors prior to
POWER8. Thus LLVM crashes with an "unable to select" message. We
observed this since one of our buildbots is configured to generate
code for a POWER7.
This patch fixes the problem by checking for availability of the
vpkudum instruction during custom lowering of vector shuffles.
I've added a test case variant for the vpkudum pattern when the
instruction isn't available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237952 91177308-0d34-0410-b5e6-96231b3b80d8
http://reviews.llvm.org/D9891
Following up on the VSX single precision loads and stores added earlier, this
adds support for elementary arithmetic operations on single precision values
in VSX registers. These instructions utilize the new VSSRC register class.
Instructions added:
xsaddsp
xsdivsp
xsmulsp
xsresp
xsrsqrtesp
xssqrtsp
xssubsp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237937 91177308-0d34-0410-b5e6-96231b3b80d8
Predicate UseAVX depricates pattern selection on AVX-512.
This predicate is necessary for DAG selection to select EVEX form.
But mapping SSE intrinsics to AVX-512 instructions is not ready yet.
So I replaced UseAVX with HasAVX for intrinsics patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237903 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves support for sign extension of the lower lanes of vectors of integers by making use of the SSE41 pmovsx* sign extension instructions where possible, and optimizing the sign extension by shifts on pre-SSE41 targets (avoiding the use of i64 arithmetic shifts which require scalarization).
It converts SIGN_EXTEND nodes to SIGN_EXTEND_VECTOR_INREG where necessary, that more closely matches the pmovsx* instruction than the default approach of using SIGN_EXTEND_INREG which splits the operation (into an ANY_EXTEND lowered to a shuffle followed by shifts) making instruction matching difficult during lowering. Necessary support for SIGN_EXTEND_VECTOR_INREG has been added to the DAGCombiner.
Differential Revision: http://reviews.llvm.org/D9848
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237885 91177308-0d34-0410-b5e6-96231b3b80d8
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.
Discussed with Tim Northover on IRC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237837 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases it won't get cleaned up properly leading to crashes
downstream. PR23353.
Based on a patch by Davide Italiano.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237828 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
During icmp lowering it can happen that a constant value can be larger than expected (see the code around the change).
APInt::getMinSignedBits() must be checked again as the shift before can change the constant sign to positive.
I'm not sure it is the best fix possible though.
Test Plan: Regression test included.
Reviewers: resistor, chandlerc, spatel, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D9147
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237812 91177308-0d34-0410-b5e6-96231b3b80d8
fixed extract-insert i1 element,
load i1, zextload i1 should be with "and $1, %reg" to prevent loading garbage.
added a bunch of new tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237793 91177308-0d34-0410-b5e6-96231b3b80d8
It works, but I've noticed that I missed several callers of createMCAsmInfo()
and many don't have a TargetMachine to provide.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237792 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
-check-prefix replaces the default CHECK prefix rather than adding to it and
must be explicitly re-added.
Also added the N32 cases.
Reviewers: petarj
Reviewed By: petarj
Subscribers: tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9668
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237790 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For N32/N64, private labels begin with '.L' but for O32 they begin with '$'.
MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: jfb, sandeep, llvm-commits, rafael
Differential Revision: http://reviews.llvm.org/D9821
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237789 91177308-0d34-0410-b5e6-96231b3b80d8
This change implements support for lowering of the gc.relocates tied to the invoke statepoint.
This is acomplished by storing frame indices of the lowered values in "StatepointRelocatedValues" map inside FunctionLoweringInfo instead of storing them in per-basic block structure StatepointLowering.
After this change StatepointLowering is used only during "LowerStatepoint" call and it is not necessary to store it as a field in SelectionDAGBuilder anymore.
Differential Revision: http://reviews.llvm.org/D7798
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237786 91177308-0d34-0410-b5e6-96231b3b80d8
We know that _tls_index is zero for local-exec TLS variables because
they are always defined in the executable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237772 91177308-0d34-0410-b5e6-96231b3b80d8
The incremental buildbots entered a pass-fail cycle where during the fail
cycle one of the tests from this commit fails for an unknown reason. I
have reverted this commit and will investigate the cause of this problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237730 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR
using the MIR format. This pass is then added as a last pass when a
'stop-after' option is used in llc. The new library adds the initial
functionality for parsing of MIR files as well. This commit also
extends the llc tool so that it can recognize and parse MIR input files.
Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames
Differential Revision: http://reviews.llvm.org/D9616
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237708 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The documentation writes vectors highest-index first whereas LLVM-IR writes
them lowest-index first. As a result, instructions defined in terms of
left_half() and right_half() had the halves reversed.
In addition to correcting them, they have been improved to allow shuffles
that use the same operand twice or in reverse order. For example, ilvev
used to accept masks of the form:
<0, n, 2, n+2, 4, n+4, ...>
but now accepts:
<0, 0, 2, 2, 4, 4, ...>
<n, n, n+2, n+2, n+4, n+4, ...>
<0, n, 2, n+2, 4, n+4, ...>
<n, 0, n+2, 2, n+4, 4, ...>
One further improvement is that splati.[bhwd] is now the preferred instruction
for splat-like operations. The other special shuffles are no longer used
for splats. This lead to the discovery that <0, 0, ...> would not cause
splati.[hwd] to be selected and this has also been fixed.
This fixes the enc-3des test from the test-suite on Mips64r6 with MSA.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9660
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237689 91177308-0d34-0410-b5e6-96231b3b80d8
This changes the ABI used on 32-bit x86 for passing vector arguments.
Historically, clang passes the first 4 vector arguments in-register, and additional vector arguments on the stack, regardless of platform. That is different from the behavior of gcc, icc, and msvc, all of which pass only the first 3 arguments in-register.
The 3-register convention is documented, unofficially, in Agner's calling convention guide, and, officially, in the recently released version 1.0 of the i386 psABI.
Darwin is kept as is because the OS X ABI Function Call Guide explicitly documents the current (4-register) behavior.
This fixes PR21510
Differential revision: http://reviews.llvm.org/D9644
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237682 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r237210.
Also fix X86/complex-fca.ll to match the code that we used to generate
on win32 and now generate everwhere to conform to SysV.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237639 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, they were forced to immediately follow the actual branch
instruction. This was usually OK (the LEAs actually accessing them got emitted
nearby, and weren't usually separated much afterwards). Unfortunately, a
sufficiently nasty phi elimination dumps many instructions right before the
basic block terminator, and this can increase the range too much.
This patch frees them up to be placed as usual by the constant islands pass,
and consequently has to slightly modify the form of TBB/TBH tables to refer to
a PC-relative label at the final jump. The other jump table formats were
already position-independent.
rdar://20813304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237590 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.
This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.
Patch by Luke Cheeseman.
Differential Revision: http://reviews.llvm.org/D9699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237579 91177308-0d34-0410-b5e6-96231b3b80d8
If some commits are happy, and some commits are sad, this is a sad commit. It
is sad because it restricts instruction scheduling to work around a binutils
linker bug, and moreover, one that may never be fixed. On 2012-05-21, GCC was
updated not to produce code triggering this bug, and now we'll do the same...
When resolving an address using the ELF ABI TOC pointer, two relocations are
generally required: one for the high part and one for the low part. Only
the high part generally explicitly depends on r2 (the TOC pointer). And, so,
we might produce code like this:
.Ltmp526:
addis 3, 2, .LC12@toc@ha
.Ltmp1628:
std 2, 40(1)
ld 5, 0(27)
ld 2, 8(27)
ld 11, 16(27)
ld 3, .LC12@toc@l(3)
rldicl 4, 4, 0, 32
mtctr 5
bctrl
ld 2, 40(1)
And there is nothing wrong with this code, as such, but there is a linker bug
in binutils (https://sourceware.org/bugzilla/show_bug.cgi?id=18414) that will
misoptimize this code sequence to this:
nop
std r2,40(r1)
ld r5,0(r27)
ld r2,8(r27)
ld r11,16(r27)
ld r3,-32472(r2)
clrldi r4,r4,32
mtctr r5
bctrl
ld r2,40(r1)
because the linker does not know (and does not check) that the value in r2
changed in between the instruction using the .LC12@toc@ha (TOC-relative)
relocation and the instruction using the .LC12@toc@l(3) relocation.
Because it finds these instructions using the relocations (and not by
scanning the instructions), it has been asserted that there is no good way
to detect the change of r2 in between. As a result, this bug may never be
fixed (i.e. it may become part of the definition of the ABI). GCC was
updated to add extra dependencies on r2 to instructions using the @toc@l
relocations to avoid this problem, and we'll do the same here.
This is done as a separate pass because:
1. These extra r2 dependencies are not really properties of the
instructions, but rather due to a linker bug, and maybe one day we'll be
able to get rid of them when targeting linkers without this bug (and,
thus, keeping the logic centralized here will make that
straightforward).
2. There are ISel-level peephole optimizations that propagate the @toc@l
relocations to some user instructions, and so the exta dependencies do
not apply only to a fixed set of instructions (without undesirable
definition replication).
The test case was reduced with the help of bugpoint, with minimal cleaning. I'm
looking forward to our upcoming MI serialization support, and with that, much
better tests can be created.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237556 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for the following new instructions in the
Power ISA 2.07:
vpksdss
vpksdus
vpkudus
vpkudum
vupkhsw
vupklsw
These instructions are available through the vec_packs, vec_packsu,
vec_unpackh, and vec_unpackl built-in interfaces. These are
lane-sensitive instructions, so the built-ins have different
implementations for big- and little-endian, and the instructions must
be marked as killing the vector swap optimization for now.
The first three instructions perform saturating pack operations. The
fourth performs a modulo pack operation, which means it can be
represented with a vector shuffle, and conversely the appropriate
vector shuffles may cause this instruction to be generated. The other
instructions are only generated via built-in support for now.
Appropriate tests have been added.
There is a companion patch to clang for the rest of this support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237499 91177308-0d34-0410-b5e6-96231b3b80d8
The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for
most NEON datatypes - the big exclusion being v2i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237455 91177308-0d34-0410-b5e6-96231b3b80d8
After converting a loop to a hardware loop, the pass should remove
any unnecessary instructions from the old compare-and-branch
code. This patch removes a dead constant assignment that was
used in the compare instruction.
Differential Revision: http://reviews.llvm.org/D9720
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237373 91177308-0d34-0410-b5e6-96231b3b80d8
If the loop trip count may underflow or wrap, the compiler should
not generate a hardware loop since the trip count will be
incorrect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237365 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When we are trying to fill the delay slot of a call instruction, we must avoid
filler instructions that use the $ra register. This fixes the test
MultiSource/Applications/JM/lencod when we enable the forward delay slot filler.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9670
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237362 91177308-0d34-0410-b5e6-96231b3b80d8