If an fmul was introduced by lowering, it wouldn't be folded
into a multiply by a constant since the earlier combine would
have replaced the fmul with the fadd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216932 91177308-0d34-0410-b5e6-96231b3b80d8
Also fix a small copy-paste bug in X86ISelLowering where Chain should
have been used in place of DAG.getEntryToken().
Fixes PR20828.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216929 91177308-0d34-0410-b5e6-96231b3b80d8
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour
Reviewed by Andy Trick and Chandler C
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216919 91177308-0d34-0410-b5e6-96231b3b80d8
When I recommitted r208640 (in r216898) I added an exclusion for TargetConstant
offsets, as there is no guarantee that a backend can handle them on generic
ADDs (even if it generates them during address-mode matching) -- and,
specifically, applying this transformation directly with TargetConstants caused
a self-hosting failure on PPC64. Ignoring all TargetConstants, however, is less
than ideal. Instead, for non-opaque constants, we can convert them into regular
constants for use with the generated ADD (or SUB).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216908 91177308-0d34-0410-b5e6-96231b3b80d8
I reverted r208640 in r209747 because r208640 broke self-hosting on PPC64. The
underlying cause of the failure is that pre-inc loads with increments
represented by ISD::TargetConstants were being transformed into ISD:::ADDs with
ISD::TargetConstant operands. PPC doesn't have a pattern for those, and so they
were selected as invalid r+r adds.
This recommits r208640, rebased and with an exclusion for ISD::TargetConstant
increments. This behavior seems correct, although in the future we might want
to ask the target to split out the indexing that uses ISD::TargetConstants.
Unfortunately, I don't yet have small test case where the relevant invalid
'add' instruction is not itself dead (and thus eliminated by
DeadMachineInstructionElim -- sometimes bugpoint is too good at removing things)
Original commit message (by Adam Nemet):
Right now the load may not get DCE'd because of the side-effect of updating
the base pointer.
This can happen if we lower a read-modify-write of an illegal larger type
(e.g. i48) such that the modification only affects one of the subparts (the
lower i32 part but not the higher i16 part). See the testcase.
In order to spot the dead load we need to revisit it when SimplifyDemandedBits
decided that the value of the load is masked off. This is the
CommitTargetLoweringOpt piece.
I checked compile time with ARM64 by sending SPEC bitcode files through llc.
No measurable change.
Fixes <rdar://problem/16031651>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216898 91177308-0d34-0410-b5e6-96231b3b80d8
The structures for Windows unwinding are shared across multiple platforms.
Indicate the encoding to be used for the particular target. Use this to switch
the unwind emitter instantiated by the AsmPrinter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216895 91177308-0d34-0410-b5e6-96231b3b80d8
Move the Windows unwind information emitter into a separate header. This is not
related to DWARF based emission. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216894 91177308-0d34-0410-b5e6-96231b3b80d8
implicit uses of the whole register when a sub register is defined.
Now the same iterator is used in the rematerilization loop as in the
spill loop later.
Patch provided by Mikael Holmen.
This fix was proposed and reviewed by Quentin Colombet,
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/076135.html.
Unfortunately, this error in the rematerilization code has only been
seen in a large test case for an out-of-tree target, and is probably
hard to reproduce on an in-tree target. Therefore, no testcase is
provided.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216873 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics
in isPostDominatedBy, use the real MachinePostDominatorTree. The old
heuristics caused instructions to sink unnecessarily, and might create
register pressure.
Test Plan:
Added a NVPTX codegen test to verify that our change is in effect. It also
shows the unnecessary register pressure caused by over-sinking. Updated
affected tests in AArch64 and X86.
Reviewers: eliben, meheff, Jiangning
Reviewed By: Jiangning
Subscribers: jholewinski, aemerson, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D4814
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216862 91177308-0d34-0410-b5e6-96231b3b80d8
DW_TAG_lexical_scopes inform debuggers about the instruction range for
which a given variable (or imported declaration/module/etc) is valid. If
the scope doesn't itself contain any such entities, it's a waste of
space and should be omitted.
We were correctly doing this for entirely empty leaves, but not for
intermediate nodes.
Reduces total (not just debug sections) .o file size for a bootstrap
-gmlt LLVM by 22% and bootstrap -gmlt clang executable by 13%. The wins
for a full -g build will be less as a % (and in absolute terms), but
should still be substantial - with some of that win being fewer
relocations, thus more substantiall reducing link times than fewer bytes
alone would have.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216861 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the emptiness of the scope with regards to variables and
nested scopes is the same as with regards to imported entities. Just
check if we had nothing at all before we build the node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216840 91177308-0d34-0410-b5e6-96231b3b80d8
First of many steps to improve lexical scope construction (to omit
trivial lexical scopes - those without any direct variables). To that
end it's easier not to create imported entities directly into the
lexical scope node, but to build them, then add them if necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216838 91177308-0d34-0410-b5e6-96231b3b80d8
When sinking an instruction it might be moved past the original last use of one
of its operands. This last use has the kill flag set and the verifier will
obviously complain about this.
Before Machine Sinking (AArch64):
%vreg3<def> = ASRVXr %vreg1, %vreg2<kill>
%XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def>
...
After Machine Sinking:
%XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def>
...
%vreg3<def> = ASRVXr %vreg1, %vreg2<kill>
This fix clears all the kill flags in all instruction that use the same operands
as the instruction that is being sunk.
This fixes rdar://problem/18180996.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216803 91177308-0d34-0410-b5e6-96231b3b80d8
specifier and change the default behavior to only emit the
DW_AT_accessibility(public) attribute when the isPublic() is explicitly
set.
rdar://problem/18154959
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216799 91177308-0d34-0410-b5e6-96231b3b80d8
Rushed when I realized I hadn't committed the FreeDeleter for a clang
change I'd committed, and didn't check that I had things lying around in
my client.
Apologies for the noise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216792 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If a variadic function body contains a musttail call, then we copy all
of the remaining register parameters into virtual registers in the
function prologue. We track the virtual registers through the function
body, and add them as additional registers to pass to the call. Because
this is all done in virtual registers, the register allocator usually
gives us good code. If the function does a call, however, it will have
to spill and reload all argument registers (ew).
Forwarding regparms on x86_32 is not implemented because most compilers
don't support varargs in 32-bit with regparms.
Reviewers: majnemer
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5060
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216780 91177308-0d34-0410-b5e6-96231b3b80d8
The code in SelectionDAG::getMemset for some reason assumes the value passed to
memset is an i32. This breaks the generated code for targets that only have
registers smaller than 32 bits because the value might get split into multiple
registers by the calling convention. See the test for the MSP430 target included
in the patch for an example.
This patch ensures that nothing is assumed about the type of the value. Instead,
the type is taken from the selected overload of the llvm.memset intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216716 91177308-0d34-0410-b5e6-96231b3b80d8
On MachO, putting a symbol that doesn't start with a 'L' or 'l' in one of the
__TEXT,__literal* sections prevents the linker from merging the context of the
section.
Since private GVs are the ones the get mangled to start with 'L' or 'l', we now
only put those on the __TEXT,__literal* sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216682 91177308-0d34-0410-b5e6-96231b3b80d8
was marked custom. The target independent DAG combine has no way to know if
the shuffles it is introducing are ones that the target could support or not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216678 91177308-0d34-0410-b5e6-96231b3b80d8
Completes what was started in r216611 and r216623.
Used const refs instead of pointers; not sure if one is preferable to the other.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216672 91177308-0d34-0410-b5e6-96231b3b80d8
The included test case would fail, because the MI PHI node would have two
operands from the same predecessor.
This problem occurs when a switch instruction couldn't be selected. This happens
always, because there is no default switch support for FastISel to begin with.
The problem was that FastISel would first add the operand to the PHI nodes and
then fall-back to SelectionDAG, which would then in turn add the same operands
to the PHI nodes again.
This fix removes these duplicate PHI node operands by reseting the
PHINodesToUpdate to its original state before FastISel tried to select the
instruction.
This fixes <rdar://problem/18155224>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216640 91177308-0d34-0410-b5e6-96231b3b80d8
Currently instructions are folded very aggressively for AArch64 into the memory
operation, which can lead to the use of killed operands:
%vreg1<def> = ADDXri %vreg0<kill>, 2
%vreg2<def> = LDRBBui %vreg0, 2
... = ... %vreg1 ...
This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.
This fix teaches hasTrivialKill to not only check the LLVM IR that the value has
a single use, but also to check if the register that represents that value has
already been used. This can happen when the instruction with the use was folded
into another instruction (in this particular case a load instruction).
This fixes rdar://problem/18142857.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216634 91177308-0d34-0410-b5e6-96231b3b80d8
FastEmitInst_ri was constraining the first operand without checking if it is
a virtual register. Use constrainOperandRegClass as all the other
FastEmitInst_* functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216613 91177308-0d34-0410-b5e6-96231b3b80d8
This teaches the AArch64 backend to deal with the operations required
to deal with the operations on v4f16 and v8f16 which are exposed by
NEON intrinsics, plus the add, sub, mul and div operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8
This combine is essentially combining target-specific nodes back into target
independent nodes that it "knows" will be combined yet again by a target
independent DAG combine into a different set of target-independent nodes that
are legal (not custom though!) and thus "ok". This seems... deeply flawed. The
crux of the problem is that we don't combine un-legalized shuffles that are
introduced by legalizing other operations, and thus we don't see a very
profitable combine opportunity. So the backend just forces the input to that
combine to re-appear.
However, for this to work, the conditions detected to re-form the unlegalized
nodes must be *exactly* right. Previously, failing this would have caused poor
code (if you're lucky) or a crasher when we failed to select instructions.
After r215611 we would fall back into the legalizer. In some cases, this just
"fixed" the crasher by produces bad code. But in the test case added it caused
the legalizer and the dag combiner to iterate forever.
The fix is to make the alignment checking in the x86 side of things match the
alignment checking in the generic DAG combine exactly. This isn't really a
satisfying or principled fix, but it at least make the code work as intended.
It also highlights that it would be nice to detect the availability of under
aligned loads for a given type rather than bailing on this optimization. I've
left a FIXME to document this.
Original commit message for r215611 which covers the rest of the chang:
[SDAG] Fix a case where we would iteratively legalize a node during
combining by replacing it with something else but not re-process the
node afterward to remove it.
In a truly remarkable stroke of bad luck, this would (in the test case
attached) end up getting some other node combined into it without ever
getting re-processed. By adding it back on to the worklist, in addition
to deleting the dead nodes more quickly we also ensure that if it
*stops* being dead for any reason it makes it back through the
legalizer. Without this, the test case will end up failing during
instruction selection due to an and node with a type we don't have an
instruction pattern for.
It took many million runs of the shuffle fuzz tester to find this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216537 91177308-0d34-0410-b5e6-96231b3b80d8