The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect.
The _256 variants have indexes into the individual 128 bit lanes and in all
cases it also has to mask out unused bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207577 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the vectorization remarks to also inform when
vectorization is possible but not beneficial.
Added tests to exercise some loop remarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207574 91177308-0d34-0410-b5e6-96231b3b80d8
Since the ARM64 backend doesn't implement support for the old JIT those tests are failing when the regression tests are run on an AArch64 host.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207530 91177308-0d34-0410-b5e6-96231b3b80d8
This patch centralizes the handling of the thumb bit around
MCStreamer::isThumbFunc and makes isThumbFunc handle aliases.
This fixes a corner case, but the main advantage is having just one
way to check if a MCSymbol is thumb or not. This should still be
refactored to be ARM only, but at least now it is just one predicate
that has to be refactored instead of 3 (isThumbFunc,
ELF_Other_ThumbFunc, and SF_ThumbFunc).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207522 91177308-0d34-0410-b5e6-96231b3b80d8
There are no patterns for this. This was already fixed for ARM64 but I forgot
to apply it to AArch64 too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207515 91177308-0d34-0410-b5e6-96231b3b80d8
clang directly from the LLVM test suite! That doesn't work. I've
followed up on the review thread to try and get a viable solution sorted
out, but trying to get the tree clean here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207462 91177308-0d34-0410-b5e6-96231b3b80d8
When evaluating an assembly expression for a relocation, we want to
stop at MCSymbols that are in the symbol table, even if they are variables.
This is needed since the semantics may require that the relocation use them.
That is not the case when computing the value of a symbol in the symbol table.
There are no relocations in this case and we have to keep going until we hit
a section or find out that the expression doesn't have an assembly time
value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207445 91177308-0d34-0410-b5e6-96231b3b80d8
While refactoring out constructScopeDIE into two functions I realized we
were emitting DW_AT_object_pointer in the inlined subroutine when we
didn't need to (GCC doesn't, and the abstract subprogram definition has
the information already).
So here's the refactoring and the bug fix. This is one step of
refactoring to remove some subtle memory ownership semantics. It turns
out the original constructScopeDIE returned ownership in its return
value in some cases and not in others. The split into two functions now
separates those two semantics - further cleanup (unique_ptr, etc) will
follow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207441 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r207287, reapplying r207286.
I'm hoping that declaring an explicit struct and instantiating
`addBlockEdges()` directly works around the GCC crash from r207286.
This is a lot more boilerplate, though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207438 91177308-0d34-0410-b5e6-96231b3b80d8
The symbol table itself has no relocations, so it is not possible to represent
things like
a = undefined + 1
With the patch we just omit these variables. That matches the behaviour of the
gnu assembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207419 91177308-0d34-0410-b5e6-96231b3b80d8
Someone couldn't bear to have a completely orthogonal set of floating-point
registers, so we've got some instructions that only accept v0-v15 (coming in
ARMv9, V128_prime: you're allowed v2, v3, v5, v7, ...).
Anyway, we were permitting even the out of range registers during assembly
(CodeGen handled it correctly). This adds a diagnostic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207412 91177308-0d34-0410-b5e6-96231b3b80d8
by avoiding inlining massive switches merely because they have no
instructions in them. These switches still show up where we fail to form
lookup tables, and in those cases they are actually going to cause
a very significant code size hit anyways, so inlining them is not the
right call. The right way to fix any performance regressions stemming
from this is to enhance the switch-to-lookup-table logic to fire in more
places.
This makes PR19499 about 5x less bad. It uncovers a second compile time
problem in that test case that is unrelated (surprisingly!).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207403 91177308-0d34-0410-b5e6-96231b3b80d8
odd to have the output of 'llvm-ar tv' depend on the format of
TimeValue::str(), but that's what we have today. If anyone needs the
output to remain compatible with GNU ar or old versions of llvm-ar, just
shout and I'll switch the code to manually format its times.
Note that there isn't a portable format -- Mac and GNU have different
formats at least (thanks Rafael!) so...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207387 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce support for WoA PE/COFF object file emission from LLVM. Add the new
target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM
specific behaviour of PE/COFF object emission. ARM exception information is not
yet emitted and is a TODO item.
The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific
relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer.
The MC layer needs to be updated to deal with the relocation adjustments.
Branch relocations are adjusted by 4 bytes (unlikely their ELF counterparts).
Minor tweaks to switch multiple conditional checks into equivalent switch
statements. The ObjectFileInfo is updated to relax the object file setup for
Windows COFF. Move the architecture checks into an assertion. Windows COFF is
currently only supported on x86, x86_64, and ARM (thumb). Rather than
defaulting to ELF, we will refuse to generate an object file. This is better
though as you do not get an (arbitrary) object file which is different from the
request.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207345 91177308-0d34-0410-b5e6-96231b3b80d8
Since there's no way to ensure the type unit in the .dwo and the type
unit skeleton in the .o are correlated, this cannot work.
This implementation is a bit inefficient for a few reasons, called out
in comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207323 91177308-0d34-0410-b5e6-96231b3b80d8
This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207317 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.
I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207315 91177308-0d34-0410-b5e6-96231b3b80d8