The default implementation of getCmpSelInstrCost, which provides the cost of
icmp/fcmp/select instructions, did not deal sensibly with illegal vector types
that were scalarized. We'd ask for the legalization cost of the vector type,
which would return something like (4, f64) given an input of <4 x double>, and
we'd then check the TLI status of the ISD opcode on that scalar type. This would
result in querying (ISD::VSELECT, f64), for example. Amusingly enough,
ISD::VSELECT on scalar types is marked as Legal by default (as with most other
operations), and most backends never change this because VSELECT is never
generated on scalars. However, seeing the resulting operation as Legal, we'd
neglect to add the scalarization cost before returning. The result is that we'd
grossly under-estimate the cost of cmps/selects on illegal vector types.
Now, if type legalization clearly results in scalarization, we skip the early
return and add the scalarization cost.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217859 91177308-0d34-0410-b5e6-96231b3b80d8
On MachO, and MachO only, we cannot have a truly empty function since that
breaks the linker logic for atomizing the section.
When we are emitting a frame pointer, the presence of an unreachable will
create a cfi instruction pointing past the last instruction. This is perfectly
fine. The FDE information encodes the pc range it applies to. If some tool
cannot handle this, we should explicitly say which bug we are working around
and only work around it when it is actually relevant (not for ELF for example).
Given the unreachable we could omit the .cfi_def_cfa_register, but then
again, we could also omit the entire function prologue if we wanted to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217801 91177308-0d34-0410-b5e6-96231b3b80d8
introduced in r217629.
We were returning the old sext instead of the new zext as the promoted instruction!
Thanks Joerg Sonnenberger for the test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217800 91177308-0d34-0410-b5e6-96231b3b80d8
pointer to a dead function. To make sure it's valid, doFinalization nullptrs
RewindFunction just like the constructor and so it will be found on next run.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217737 91177308-0d34-0410-b5e6-96231b3b80d8
enabled. A good chunk of the MIsNeedChainEdge() is logic that is valid and should be applied even for targets
that are not using for alias analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217706 91177308-0d34-0410-b5e6-96231b3b80d8
vectors.
e.g. when promoting ctlz from <2 x i32> to <2 x i64> we have to fixup
the result by 32 bits, not 64. PR20917.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217671 91177308-0d34-0410-b5e6-96231b3b80d8
And since it /looked/ like the DwarfStrSectionSym was unused, I tried
removing it - but then it turned out that DwarfStringPool was
reconstructing the same label (and expecting it to have already been
emitted) and uses that.
So I kept it around, but wanted to pass it in to users - since it seemed
a bit silly for DwarfStringPool to have it passed in and returned but
itself have no use for it. The only two users don't handle strings in
both .dwo and .o files so they only ever need the one symbol - no need
to keep it (and have an unused symbol) in the DwarfStringPool used for
fission/.dwo.
Refactor a bunch of accelerator table usage to remove duplication so I
didn't have to touch 4-5 callers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217628 91177308-0d34-0410-b5e6-96231b3b80d8
Do
(shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
This is already done for multiplies, but since multiplies
by powers of two are turned into shifts, we also need
to handle it here.
This might want checks for isLegalAddImmediate to avoid
transforming an add of a legal immediate with one that isn't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217610 91177308-0d34-0410-b5e6-96231b3b80d8
This is an extension of the change made with r215820:
http://llvm.org/viewvc/llvm-project?view=revision&revision=215820
That patch allowed combining of splatted vector FP constants that are multiplied.
This patch allows combining non-uniform vector FP constants too by relaxing the
check on the type of vector. Also, canonicalize a vector fmul in the
same way that we already do for scalars - if only one operand of the fmul is a
constant, make it operand 1. Otherwise, we miss potential folds.
This fold is also done by -instcombine, but it's possible that extra
fmuls may have been generated during lowering.
Differential Revision: http://reviews.llvm.org/D5254
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217599 91177308-0d34-0410-b5e6-96231b3b80d8
So that the two operations in DwarfDebug couldn't get separated (because
I accidentally separated them in some work in progress), put them
together. While we're here, move DwarfUnit::addRange to
DwarfCompileUnit, since it's not relevant to type units.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217468 91177308-0d34-0410-b5e6-96231b3b80d8
PrevSection/PrevCU are used to detect holes in the address range of a CU
to ensure the DW_AT_ranges does not include those holes. When we see a
function with no debug info, though it may be in the same range as the
prior and subsequent functions, there should be a gap in the CU's
ranges. By setting PrevCU to null in that case, the range would not be
extended to cover the gap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217466 91177308-0d34-0410-b5e6-96231b3b80d8
This solves the problem of having a kill flag inside a loop
with a definition of the register prior to the loop:
%vreg368<def> ...
Inside loop:
%vreg520<def> = COPY %vreg368
%vreg568<def,tied1> = add %vreg341<tied0>, %vreg520<kill>
=> was coalesced into =>
%vreg568<def,tied1> = add %vreg341<tied0>, %vreg368<kill>
MachineVerifier then complained:
*** Bad machine code: Virtual register killed in block, but needed live out. ***
The kill flag for %vreg368 is incorrect, and is cleared by this patch.
This is similar to the clearing done at the end of
MachineSinking::SinkInstruction().
Patch provided by Jonas Paulsson.
Reviewed by Quentin Colombet and Juergen Ributzka.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217427 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, fast-isel would not clean up after failing to select a call
instruction, because it would have called flushLocalValueMap() which moves
the insertion point, making SavedInsertPt in selectInstruction() invalid.
Fixing this by making SavedInsertPt a member variable, and having
flushLocalValueMap() update it.
This removes some redundant code at -O0, and more importantly fixes PR20863.
Differential Revision: http://reviews.llvm.org/D5249
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217401 91177308-0d34-0410-b5e6-96231b3b80d8
It's probably not a huge deal to not do this - if we could, maybe the
address could be reused by a subprogram low_pc and avoid an extra
relocation, but it's just one per CU at best.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217338 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r217211.
Both the bfd ld and gold outputs were valid. They were using a Rela relocation,
so the value present in the relocated location was not used, which caused me
to misread the output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217264 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR20523.
When spilling variables onto the stack, spillVirtReg() is setting the
parent pointer of the cloned DBG_VALUE intrinsic for the stack location
to the parent pointer of the original intrinsic. MachineInstr parent
pointers should however always point to the parent basic block.
MBB is shadowing the MBB member variable. The instruction still ends up
being inserted into the right basic block, because it's inserted after MI
which serves as the iterator.
I failed at constructing a reliable testcase for this, see
http://llvm.org/bugs/show_bug.cgi?id=20523 for a large testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217260 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes a long standing issue where we would emit many little .text
sections and only one .pdata and .xdata section. Now we generate one
.pdata / .xdata pair per .text section and associate them correctly.
Fixes PR19667.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5181
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217176 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r216803, because it might have broken the buildbot.
The issue is tracked in PR20842.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217120 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.
This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.
Test Plan: make check
Reviewers: jfb
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D5035
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217080 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes two latent bugs:
- There was no fence inserted before expanded seq_cst load (unsound on Power)
- There was only a fence release before seq_cst stores (again unsound, in particular on Power)
It is not even clear if this is correct on ARM swift processors (where release fences are
DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift
as it is not clear whether it is incorrect. I would love to get documentation stating
whether it is correct or not.
These two bugs were not triggered because Power is not (yet) using this pass, and these
behaviours happen to be (mostly?) working on ARM
(although they completely butchered the semantics of the llvm IR).
See:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html
for an example of the problems that can be caused by the second of these bugs.
I couldn't see a way of fixing these in a completely target-independent way without
adding lots of unnecessary fences on ARM, hence the target-dependent parts of this
patch.
This patch implements the new target-dependent parts only for ARM (the default
of not doing anything is enough for AArch64), other architectures will use this
infrastructure in later patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217076 91177308-0d34-0410-b5e6-96231b3b80d8
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.
Reviewed by Eric
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217075 91177308-0d34-0410-b5e6-96231b3b80d8
Things got a little bit messy over the years and it is time for a little bit
spring cleaning.
This first commit is focused on the FastISel base class itself. It doxyfies all
comments, C++11fies the code where it makes sense, renames internal methods to
adhere to the coding standard, and clang-formats the files.
Reviewed by Eric
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217060 91177308-0d34-0410-b5e6-96231b3b80d8
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reinstates commits r215111, 215115, 215116, 215117, 215136.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216982 91177308-0d34-0410-b5e6-96231b3b80d8
Add -use-cfl-aa (and -use-cfl-aa-in-codegen) to add CFL AA in the default pass
managers (for easy testing).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216978 91177308-0d34-0410-b5e6-96231b3b80d8
This allows the target to disable target-independent instruction selection and
jump directly into the target-dependent instruction selection code.
This can be beneficial for targets, such as AArch64, which could emit much
better code, but never got a chance to do so, because the target-independent
instruction selector was able to find an instruction sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216947 91177308-0d34-0410-b5e6-96231b3b80d8
If an fmul was introduced by lowering, it wouldn't be folded
into a multiply by a constant since the earlier combine would
have replaced the fmul with the fadd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216932 91177308-0d34-0410-b5e6-96231b3b80d8
Also fix a small copy-paste bug in X86ISelLowering where Chain should
have been used in place of DAG.getEntryToken().
Fixes PR20828.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216929 91177308-0d34-0410-b5e6-96231b3b80d8