Summary:
Implement materialize of floating point literals in Mips Fast-Isel
Reopened version of D3659
Test Plan: simplestorefp1.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D4071
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210546 91177308-0d34-0410-b5e6-96231b3b80d8
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for
the VEX.256 encoded versions of horizontal add/sub.
For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.
Added a new test case into haddsub-2.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210545 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.
This patch fixes PR19972. Patch by Moritz Roth.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210542 91177308-0d34-0410-b5e6-96231b3b80d8
The SelectionDAG bad a special case for ISD::SELECT_CC, where it would
allow targets to specify:
setOperationAction(ISD::SELECT_CC, MVT::Other, Expand);
to indicate that they wanted to expand ISD::SELECT_CC for all types.
This wasn't applied correctly everywhere, and it makes writing new
DAG patterns with ISD::SELECT_CC difficult.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210541 91177308-0d34-0410-b5e6-96231b3b80d8
Various masks on shufflevector instructions are recognizable as
specific PowerPC instructions (vector pack, vector merge, etc.).
There is existing code in PPCISelLowering.cpp to recognize the correct
patterns for big endian code. The masks for these instructions are
different for little endian code due to the big-endian numbering
employed by these instructions. This patch adds the recognition code
for little endian.
I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for
this. The existing recognizer test (vec_shuffle.ll) is unnecessarily
verbose and difficult to read, so I felt it was better to add a new
test rather than modify the old one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210536 91177308-0d34-0410-b5e6-96231b3b80d8
inverted condition codes (CINC, CINV, CNEG, CSET, and CSETM).
Matching aliases based on "immediate classes", when disassembling,
wasn't previously supported, hence adding MCOperandPredicate
into class Operand, and implementing the support for it
in AsmWriterEmitter.
The parsing for those aliases was already custom, so just adding
the missing condition into AArch64AsmParser::parseCondCode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210528 91177308-0d34-0410-b5e6-96231b3b80d8
As Ana Pazos pointed out, these have to be restored to their incoming values
before a function returns; i.e. before the tail call. So they can't be used
correctly as the destination register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210525 91177308-0d34-0410-b5e6-96231b3b80d8
The C++ and C semantics of the compare_and_swap operations actually
require us to return a boolean "success" value. In LLVM terms this
means a second comparison of the output of "cmpxchg" against the input
desired value.
However, x86's "cmpxchg" instruction sets all flags for the comparison
formed, so we can skip any secondary comparison. (N.b. this isn't true
for cmpxchg8b/16b, which only set ZF).
rdar://problem/13201607
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210523 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).
rdar://problem/17187463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210520 91177308-0d34-0410-b5e6-96231b3b80d8
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.
rdar://problem/17187463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210519 91177308-0d34-0410-b5e6-96231b3b80d8
This improves the X86 cost model for small constants with large types. Before
this commit we would even hoist trivial constants such as i96 2.
This is related to <rdar://problem/17070936>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210504 91177308-0d34-0410-b5e6-96231b3b80d8
The code in PPCTargetLowering::PerformDAGCombine() that handles
unaligned Altivec vector loads generates a lvsl followed by a vperm.
As we've seen in numerous other places, the vperm instruction has a
big-endian bias, and this is fixed for little endian by complementing
the permute control vector and swapping the input operands. In this
case the lvsl is providing the permute control vector. Rather than
generating an lvsl and a complement operation, it is sufficient to
generate an lvsr instruction instead. Thus for LE code generation we
will generate an lvsr rather than an lvsl, and swap the other input
arguments on the vperm.
The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test
the code generation for PPC64 and PPC64LE, in addition to the existing
PPC32/G5 testing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210493 91177308-0d34-0410-b5e6-96231b3b80d8
Don't terminate location ranges for register-described variables
at the end of machine basic block if this register is never modified
in the function body, except for the prologue and epilogue. Prologue
location is guessed by FrameSetup flags on MachineInstructions, while
epilogue location is deduced from debug locations of instructions
in the basic blocks ending with return instructions.
This patch is mostly targeted to fix non-trivial debug locations for
variables addressed via stack and frame pointers.
It is not really a generic fix. We can still produce poor debug info
for register-described variables if this register *is* modified somewhere
in the function, but in unrelated places. This might be the case for the debug
info in optimized binaries (e.g. for local variables in inlined functions).
LiveDebugVariables pass in CodeGen attempts to fix this problem by adjusting
DBG_VALUE instructions, but this pass is tied to greedy register allocator,
which is used in optimized builds only. Proper fix would likely involve
generalizing LiveDebugVariables to all register allocators. See more discussion
in http://reviews.llvm.org/D3933 review thread.
I'm proceeding with this patch to fix immediate severe problems and
important cases, e.g. fix completely broken debug info with AddressSanitizer
and fix PR19307 (missing debug info for by-value std::string arguments).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210492 91177308-0d34-0410-b5e6-96231b3b80d8
The armv7-windows-itanium environment is nearly identical to the MSVC ABI. It
has a few divergences, mostly revolving around the use of the Itanium ABI for
C++. VLA support is one of the extensions that are amongst the set of the
extensions.
This adds support for proper VLA emission for this environment. This is
somewhat similar to the handling for __chkstk emission on X86 and the large
stack frame emission for ARM. The invocation style for chkstk is still
controlled via the -mcmodel flag to clang.
Make an explicit note that this is an extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210489 91177308-0d34-0410-b5e6-96231b3b80d8
Originally this similar was initiated by Björn Steinbrink here:
http://reviews.llvm.org/D3437
Bug itself has been fixed by principal changes in MergeFunctions. Though
special checks for functions merging are still actual. And the test has
been accepted with slight modifications.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210486 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds new target specific combine rules to identify horizontal
add/sub idioms from BUILD_VECTOR dag nodes.
This patch also teaches the DAGCombiner how to canonicalize sequences of
insert_vector_elt dag nodes according to the following rule:
(insert_vector_elt (insert_vector_elt A, I0), I1) ->
(insert_vecto_elt (insert_vector_elt A, I1), I0)
This new canonicalization rule only triggers if the inner insert_vector
dag node has exactly one use; also, both indices must be known constants,
and I1 < I0.
This last rule made it possible to write a simpler algorithm to identify
horizontal add/sub patterns because now we don't have to worry about the
ordering of insert_vector_elt dag nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210477 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code in PPCTargetLowering::LowerMUL() for multiplying two
v16i8 values assumes that vector elements are numbered in big-endian
order. For little-endian targets, the vector element numbering is
reversed, but the vmuleub, vmuloub, and vperm instructions still
assume big-endian numbering. To account for this, we must adjust the
permute control vector and reverse the order of the input registers on
the vperm instruction.
The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed
on powerpc64 and powerpc64le targets as well as the original powerpc
(32-bit) target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210474 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the backend how to check for the 'NoSignedWrap' flag on
binary operations to improve the emission of 'test' instructions.
If the result of a binary operation is known not to overflow we know that
resetting the Overflow flag is unnecessary and so we can avoid emitting
the test instruction.
Patch by Marcello Maggioni.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210468 91177308-0d34-0410-b5e6-96231b3b80d8
This patch modifies SelectionDAGBuilder to construct SDNodes with associated
NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator
instructions.
Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing
nsw/nuw/exact flags during codegen.
Patch by Marcello Maggioni.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210467 91177308-0d34-0410-b5e6-96231b3b80d8
Instructions from __nodebug__ functions don't have file:line
information even when inlined into no-nodebug functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.
With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.
Fixes PR19001.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210459 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a crash on MMX intrinsics, as well as a corner case in handling of
all unsigned pack intrinsics.
PR19953.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210454 91177308-0d34-0410-b5e6-96231b3b80d8
For each array index that is in the form of zext(a), convert it to sext(a)
if we can prove zext(a) <= max signed value of typeof(a). The conversion
helps to split zext(x + y) into sext(x) + sext(y).
Reviewed in http://reviews.llvm.org/D4060
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210444 91177308-0d34-0410-b5e6-96231b3b80d8
inbounds are not necessary in these two tests. zext(a +nuw b) = zext(a) +
zext(b) should hold with or without inbounds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210437 91177308-0d34-0410-b5e6-96231b3b80d8
Before, we where looking at the size of the pointer type that specifies the
location from which to load the element. This did not make any sense at all.
This change fixes a bug in the delinearization where we failed to delinerize
certain load instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210435 91177308-0d34-0410-b5e6-96231b3b80d8
1) The commit was made despite profound lack of understanding:
"I did not understand the comment about using dyn_cast instead of isa. I will
commit as is and make the update after. You can explain what you meant to me."
Commit first, understand later isn't OK.
2) Review comments were simply ignored:
"Can you edit the summary to describe what the patch is for? It appears to be
a list of commits at the moment."
3) The patch got LGTM'd off-list without any indication of readiness.
4) The public mailing list was excluded from patch review so all of this was
hidden from the community.
This reverts commit r210414.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210424 91177308-0d34-0410-b5e6-96231b3b80d8
link.exe requires that the text section has the IMAGE_SCN_MEM_16BIT flag set.
Otherwise, it will treat the function as ARM. If this occurs, then jumps to the
function will fail, switching from thumb to ARM mode execution.
With this change, it is possible to link using the MSVC linker as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210415 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
start to do simple constants
finish simplestore
add test case
format
Merge branch 'master' into 1756_8
Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel
Merge branch 'master' into 1756_8
Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel
in progress
finish integer materialize
test cases
test cases
in progress
Finish up fast-isel materialize for ints.
Finish materialize for ints
test cases
simplestorei.ll
Merge branch 'master' into 1756_8
fix fp constants for fast-isel
Merge branch '1758_1' of dmz-portal.mips.com:llvm into 1758_1
in progress
lastest for fp materialization
clean up
Merge branch 'master' into 1758_1
formatting
add test case
finish test case
Merge branch 'master' into 1758_2
Test Plan:
simplestore.ll
simplestore.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3659
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210414 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than requiring ARM support for the ELF tests (which is odd), move the
tests that require ARM into a subdirectory to use lit to disable them if the
support is not present. Play this game to prevent disabling the ELF tests on
the Windows build bots as they have caught issues in the past with interactions
between various platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210408 91177308-0d34-0410-b5e6-96231b3b80d8
GAS documents the .type directive as having an optional comma following the key
symbol name when using the STT_<TYPE_IN_UPPER_CASE> form. However, it treats
the comma as optional in all cases. This makes the IAS support both forms of
inputs. Furthermore, the prefixed forms take either the upper case name or the
lower case alias.
The tests are split into two separate sets as the hash character serves as a
comment character on x86, which is tested in the second set by using arm-elf
which uses the at symbol as a comment character.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210407 91177308-0d34-0410-b5e6-96231b3b80d8