Various masks on shufflevector instructions are recognizable as
specific PowerPC instructions (vector pack, vector merge, etc.).
There is existing code in PPCISelLowering.cpp to recognize the correct
patterns for big endian code. The masks for these instructions are
different for little endian code due to the big-endian numbering
employed by these instructions. This patch adds the recognition code
for little endian.
I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for
this. The existing recognizer test (vec_shuffle.ll) is unnecessarily
verbose and difficult to read, so I felt it was better to add a new
test rather than modify the old one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210536 91177308-0d34-0410-b5e6-96231b3b80d8
The code in PPCTargetLowering::PerformDAGCombine() that handles
unaligned Altivec vector loads generates a lvsl followed by a vperm.
As we've seen in numerous other places, the vperm instruction has a
big-endian bias, and this is fixed for little endian by complementing
the permute control vector and swapping the input operands. In this
case the lvsl is providing the permute control vector. Rather than
generating an lvsl and a complement operation, it is sufficient to
generate an lvsr instruction instead. Thus for LE code generation we
will generate an lvsr rather than an lvsl, and swap the other input
arguments on the vperm.
The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test
the code generation for PPC64 and PPC64LE, in addition to the existing
PPC32/G5 testing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210493 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code in PPCTargetLowering::LowerMUL() for multiplying two
v16i8 values assumes that vector elements are numbered in big-endian
order. For little-endian targets, the vector element numbering is
reversed, but the vmuleub, vmuloub, and vperm instructions still
assume big-endian numbering. To account for this, we must adjust the
permute control vector and reverse the order of the input registers on
the vperm instruction.
The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed
on powerpc64 and powerpc64le targets as well as the original powerpc
(32-bit) target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210474 91177308-0d34-0410-b5e6-96231b3b80d8
I saw at least a memory leak or two from inspection (on probably
untested error paths) and r206991, which was the original inspiration
for this change.
I ran this idea by Jim Grosbach a few weeks ago & he was OK with it.
Since it's a basically mechanical patch that seemed sufficient - usual
post-commit review, revert, etc, as needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210427 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes a couple of lowering issues for little endian
PowerPC. The code for lowering BUILD_VECTOR contains a number of
optimizations that are only valid for big endian. For now, we disable
those optimizations for correctness. In the future, we will add
analogous optimizations that are correct for little endian.
When lowering a SHUFFLE_VECTOR to a VPERM operation, we again need to
make the now-familiar transformation of swapping the input operands
and complementing the permute control vector. Correctness of this
transformation is tested by the accompanying test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210336 91177308-0d34-0410-b5e6-96231b3b80d8
This is a preliminary patch for the PowerPC64LE support. In stage 1
of the vector support, we will support the VMX (Altivec) instruction
set, but will not yet support the VSX instructions. This is merely a
staging issue to provide functional vector support as soon as
possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210271 91177308-0d34-0410-b5e6-96231b3b80d8
This seems to match what gcc does for ppc and what every other llvm
backend does.
This is a fixed version of r209638. The difference is to avoid any change
in behavior for functions. The logic for using constant pools for function
addresseses is spread over a few places and we have to keep them in sync.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209821 91177308-0d34-0410-b5e6-96231b3b80d8
This matches gcc's behavior. It also seems natural given that aliases
contain other properties that govern how it is accessed (linkage,
visibility, dll storage).
Clang still has to be updated to expose this feature to C.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209759 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r209638 because it broke self-hosting on ppc64/Linux. (the
Clang-compiled TableGen would segfault because it jumped to an invalid address
from within _ZNK4llvm17ManagedStaticBase21RegisterManagedStaticEPFPvvEPFvS1_E
(which is within the command-line parameter registration process)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209745 91177308-0d34-0410-b5e6-96231b3b80d8
In PPCISelLowering.cpp: PPCTargetLowering::LowerBUILD_VECTOR(), there
is an optimization for certain patterns to generate one or two vector
splats followed by a vector add or subtract. This operation is
represented by a VADD_SPLAT in the selection DAG. Prior to this
patch, it was possible for the VADD_SPLAT to be assigned the wrong
data type, causing incorrect code generation. This patch corrects the
problem.
Specifically, the code previously assigned the value type of the
BUILD_VECTOR node to the newly generated VADD_SPLAT node. This is
correct much of the time, but not always. The problem is that the
call to isConstantSplat() may return a SplatBitSize that is not the
same as the number of bits in the original element vector type. The
correct type to assign is a vector type with the same element bit size
as SplatBitSize.
The included test case shows an example of this, where the
BUILD_VECTOR node has a type of v16i8. The vector to be built is {0,
16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16}. isConstantSplat
detects that we can generate a splat of 16 for type v8i16, which is
the type we must assign to the VADD_SPLAT node. If we do not, we
generate a vspltisb of 8 and a vaddubm, which generates the incorrect
result {16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
16}. The correct code generation is a vspltish of 8 and a vadduhm.
This patch also corrected code generation for
CodeGen/PowerPC/2008-07-10-SplatMiscompile.ll, which had been marked
as an XFAIL, so we can remove the XFAIL from the test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209662 91177308-0d34-0410-b5e6-96231b3b80d8
This required updating the generated functions and TD file accordingly
to be pointers rather than const references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209375 91177308-0d34-0410-b5e6-96231b3b80d8
a subtarget hook to enable. Unconditionally add to the pass pipeline
for targets that might want to use it. No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209340 91177308-0d34-0410-b5e6-96231b3b80d8
The SplitIndexingFromLoad changes exposed a latent isel bug in the PowerPC64
backend. We matched an immediate offset with STWX8 even though it only
supports register offset.
The culprit is the complex-pattern predicate, SelectAddrIdx, which decides
that if the offset is not ISD::Constant it must be a register.
Many thanks to Bill Schmidt for testing this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209219 91177308-0d34-0410-b5e6-96231b3b80d8
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209123 91177308-0d34-0410-b5e6-96231b3b80d8
This is mostly a mechanical change changing all the call sites to the newer
chained-function construction pattern. This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions. No functional change beyond the removal of the old
constructors are intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209082 91177308-0d34-0410-b5e6-96231b3b80d8
member variable and sink the initialization of crbits into the
subtarget feature reset code.
No functional change, but this refactor will be used in a future
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208726 91177308-0d34-0410-b5e6-96231b3b80d8
Support for the intrinsics that read from and write to global named registers
is added for r1, r2 and r13 (depending on the subtarget).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208509 91177308-0d34-0410-b5e6-96231b3b80d8
The counter-loops formation pass needs to know what operations might be
function calls (because they can't appear in counter-based loops). On PPC32,
128-bit shifts might be runtime calls (even though you can't use __int128 on
PPC32, it seems that SROA might form them).
Fixes PR19709.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208501 91177308-0d34-0410-b5e6-96231b3b80d8
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.
This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.
Last but not least, this found a case where we were being bug by bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207920 91177308-0d34-0410-b5e6-96231b3b80d8
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur. It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.
Reviewers: nicholas
Differential Revision: http://llvm-reviews.chandlerc.com/D3240
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207143 91177308-0d34-0410-b5e6-96231b3b80d8
I discovered this const-hole while attempting to coalesnce the Symbol
and SymbolMap data structures. There's some pending issues with that,
but I figured this change was easy to flush early.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207124 91177308-0d34-0410-b5e6-96231b3b80d8
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.
Patch by Yuri Gorshenin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206971 91177308-0d34-0410-b5e6-96231b3b80d8
system headers above the includes of generated '.inc' files that
actually contain code. In a few targets this was already done pretty
consistently, but it wasn't done *really* consistently anywhere. It is
strictly cleaner IMO and necessary in a bunch of places where the
DEBUG_TYPE is referenced from the generated code. Consistency with the
necessary places trumps. Hopefully the build bots are OK with the
movement of intrin.h...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206838 91177308-0d34-0410-b5e6-96231b3b80d8
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.
This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:
- Header files that need to provide a DEBUG_TYPE for some inline code
can do so by defining the macro before their inline code and undef-ing
it afterward so the macro does not escape.
- We no longer have rampant ODR violations due to including headers with
different DEBUG_TYPE definitions. This may be mostly an academic
violation today, but with modules these types of violations are easy
to check for and potentially very relevant.
Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.
The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206822 91177308-0d34-0410-b5e6-96231b3b80d8