No functional change. This will be used by the new FMA intrinsic lowering
code.
We can probably add NO_EXC here as well, I am just not too familiar with this
part of AVX512 yet. We can add that later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215662 91177308-0d34-0410-b5e6-96231b3b80d8
This change further evolves the base class AVX512_masking in order to make it
suitable for the masking variants of the FMA instructions.
Besides AVX512_masking there is now a new base class that instructions
including FMAs can use: AVX512_masking_3src. With three-source (destructive)
instructions one of the sources is already tied to the destination. This
difference from AVX512_masking is captured by this new class. The common bits
between _masking and _masking_3src are broken out into a new super class
called AVX512_masking_common.
As with valign, there is some corresponding restructuring of the underlying
format classes. The idea is the same we want to derive from two classes
essentially: one providing the format bits and another format-independent
multiclass supplying the various masking and non-masking instruction variants.
Existing fma tests in avx512-fma*.ll provide coverage here for the non-masking
variants. For masking, the next patches in the series will add intrinsics and
intrinsic tests.
For AVX512_masking_3src to work, the (ins ...) dag has to be passed *without*
the leading source operand that is tied to dst ($src1). This is necessary to
properly construct the (ins ...) for the different variants. For the record,
I did check that if $src is mistakenly included, you do get a fairly intuitive
error message from the tablegen backend.
Part of <rdar://problem/17688758>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215660 91177308-0d34-0410-b5e6-96231b3b80d8
lowering scheme.
Currently, this just directly bails to the fallback path of splitting
the 256-bit vector into two 128-bit vectors, operating there, and then
joining the results back together. While the results are far from
perfect, they are *shockingly* good for what we're doing here. I'll be
layering the rest of the functionality on top of this piece by piece and
updating tests as I go.
Note that 256-bit vectors in this mode are still somewhat WIP. While
I think the code paths that I'm adding here are clean and good-to-go,
there are still a lot of 128-bit assumptions that I'll need to stomp out
as I march through the functional spread here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215637 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This pseudo-instruction allows the programmer to load an address from a symbolic expression into a register.
Patch by David Chisnall.
His work was sponsored by: DARPA, AFRL
I've made some minor changes to the original, such as improving the formatting and adding some comments, and I've also added a test case.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D4808
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215630 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
getCanHaveModuleDir() is renamed to isModuleDirectiveAllowed(), and
setCanHaveModuleDir() is renamed to forbidModuleDirective() since it is only
ever given a false argument.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215628 91177308-0d34-0410-b5e6-96231b3b80d8
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.
<rdar://problem/17991614>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215600 91177308-0d34-0410-b5e6-96231b3b80d8
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.
For Example:
lsl x1, x1, #3 --> ldr x0, [x0, x1, lsl #3]
ldr x0, [x0, x1]
sxtw x1, w1
lsl x1, x1, #3 --> ldr x0, [x0, x1, sxtw #3]
ldr x0, [x0, x1]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215597 91177308-0d34-0410-b5e6-96231b3b80d8
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.
Fixes <rdar://problem/17674628>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215595 91177308-0d34-0410-b5e6-96231b3b80d8
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.
This fixes <rdar://problem/17420988>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215593 91177308-0d34-0410-b5e6-96231b3b80d8
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.
Fixes <rdar://problem/17924413>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215591 91177308-0d34-0410-b5e6-96231b3b80d8
This is a cleaner solution to the problem described in r215431.
When instructions are combined a dangling DBG_VALUE is removed.
This resolves bug 20598.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215587 91177308-0d34-0410-b5e6-96231b3b80d8
Split the constant materialization code into three separate helper functions for
Integer-, Floating-Point-, and GlobalValue-Constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215586 91177308-0d34-0410-b5e6-96231b3b80d8
This change is also in preparation for a future change to make sure that
the constant materialization uses MOVT/MOVW when available and not a load
from the constant pool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215584 91177308-0d34-0410-b5e6-96231b3b80d8
getRegClassFor returns the incorrect register class when in Thumb2 mode.
This fix simply manually selects the register class as in the code just a few
lines above.
There is no test case for this code, because the code is currently
unreachable. This will be changed in a future commit and existing test
cases will exercise this code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215583 91177308-0d34-0410-b5e6-96231b3b80d8
This for some reason fixes v1i64 kernel arguments on pre-SI. This
currently breaks some other cases in the kernel-args.ll test for R600,
but I'm not particularly confident in the new output. VTX_READ_* are not
used for some of the scalarized cases, and the code reading from the
constant buffer doesn't make much sense to me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215564 91177308-0d34-0410-b5e6-96231b3b80d8
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215558 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Moved some calls to setCanHaveModuleDir to the MipsTargetStreamer base class and removed the resulting empty functions from the MipsTargetELFStreamer class.
Also fixed a missing call to setCanHaveModuleDir in MipsTargetELFStreamer::emitDirectiveSetMicroMips.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: tomatabacu
Differential Revision: http://reviews.llvm.org/D4781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215542 91177308-0d34-0410-b5e6-96231b3b80d8
Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction.
Added encoding and lowering tests.
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215536 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by Matheus Almeida and Toma Tabacu
The lld test failure on the previous attempt to commit was caused by the
addition of the .pdr section causing the offsets it was checking to change.
This has been fixed by removing the .ent/.end directives from that test since
they weren't really needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215535 91177308-0d34-0410-b5e6-96231b3b80d8
As of r214452, isa<MemSDNode> will return true for nodes for which
isa<MemIntrinsicSDNode> will return true (classof now respects the actual class
hierarchy). So we no longer need to check for both MemIntrinsicSDNode and
MemSDNode separately.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215523 91177308-0d34-0410-b5e6-96231b3b80d8
one pesky test case correctly.
This test case caused the old code to infloop occilating between solving
the low-half and the high-half. The 'side balancing' part of
single-input v8 shuffle lowering didn't handle the one pattern which can
cause it to occilate. Fortunately the fuzz testing found this case.
Unfortuately it was *terrible* to handle. I'm really sorry for the
amount and density of the code here, I'd love suggestions on how to
simplify it. I feel like there *must* be a simpler form here, but after
a lot of days I've not found it. This is the only one I've found that
even works. I've added the one pesky test case along with some nice
comments explaining the core problem that we have to solve here.
So far this has survived approximately 32k test cases. More strenuous
fuzzing commencing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215519 91177308-0d34-0410-b5e6-96231b3b80d8
This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store
intrinsics. As with the construction of the MachineMemOperands for the
intrinsic calls used for unaligned load/store lowering, the only slight
complication is that we need to represent a larger memory range than the
loaded/stored value-type size (because the address is rounded down to an
aligned address, and we need to conservatively represent the entire possible
range of the actual access). This required adding an extra size field to
TargetLowering::IntrinsicInfo, and this was done in a way that required no
modifications to other targets (the size defaults to the store size of the
provided memory data type).
This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215512 91177308-0d34-0410-b5e6-96231b3b80d8
I think that this will scale better in most cases than adding a Pat<> for each
mapping from the intrinsic DAG to the intruction (i.e. rri, rrik, rrikz). We
can just lower to the SDNode and have the resulting DAG be matches by the DAG
patterns.
Alternatively (long term), we could keep the Pat<>s but generate them via the
new AVX512_masking multiclass. The difficulty is that in order to formulate
that we would have to concatenate DAGs. Currently this is only supported if
the operators of the input DAGs are identical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215473 91177308-0d34-0410-b5e6-96231b3b80d8
v2: drop enum keyword
use correct extension mode
don't bother computing the sign in unsinged case
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215462 91177308-0d34-0410-b5e6-96231b3b80d8
v2: add tests
rename LowerSDIV24 to LowerSDIVREM24
handle the rem part in this function
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215460 91177308-0d34-0410-b5e6-96231b3b80d8
The combiner ignored DBG nodes when checking
the uses of a virtual register.
It combined a sequence like
%vreg1 = madd %vreg2, %vreg3,...
DBG_VALUE (%vreg1 ...)
%vreg4 = add %vreg1,...
to
%vreg4 = madd %vreg2, %vreg3
leaving behind a dangling DBG_VALUE with
a definition. This triggered an assertion
in the MachineTraceMetrics.cpp module.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215431 91177308-0d34-0410-b5e6-96231b3b80d8
Type::dump() doesn't print a newline, which makes for a poor
experience in a debugger. This looks like it was an ommission
considering Value::dump() two lines above, so I've changed Type to add
a newline as well.
Of the two in-tree callers, one added a newline anyway, and I've
updated the other one to use Type::print instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215421 91177308-0d34-0410-b5e6-96231b3b80d8
This saves us from having to copy a 64-bit 0 value into VGPRs for
BUFFER_* instruction which only have a 12-bit immediate offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215399 91177308-0d34-0410-b5e6-96231b3b80d8