This turned up a bug in clang where arguments were emitted with
duplicate argument numbers (see r215227).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215228 91177308-0d34-0410-b5e6-96231b3b80d8
floating point exceptions, added use of the flag to fold potentially
exception-raising floating point math in the selection DAG. No functionality
change, as targets have to explicitly ask for this behavior and none does today.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215222 91177308-0d34-0410-b5e6-96231b3b80d8
__stack_chk_guard.
Handle the case where the pointer operand of the load instruction that loads the
stack guard is not a global variable but instead a bitcast.
%StackGuard = load i8** bitcast (i64** @__stack_chk_guard to i8**)
call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot)
Original test case provided by Ana Pazos.
This fixes PR20558.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215167 91177308-0d34-0410-b5e6-96231b3b80d8
Due to an unnecessary special case, inlined arguments that happened to
be from the same function as they were inlined into were misclassified
as non-inline arguments and would overwrite the non-inlined arguments.
Assert that we never overwrite a function's arguments, and stop
misclassifying inlined arguments as non-inline arguments to fix this
issue.
Excuse the rather crappy test case - handcrafted IR might do better, or
someone who understands better how to tickle the inliner into creating a
recursive inlining situation like this might improve it (though it may also be
necessary to tickle the variable in a particular way to cause it to be recorded
in the MMI side table and go down this particular path for location
information).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215157 91177308-0d34-0410-b5e6-96231b3b80d8
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215154 91177308-0d34-0410-b5e6-96231b3b80d8
Re-commit of r214832, r21469 with a work-around that
avoids the previous problem with the gcc build compilers.
The work-around is to use a SmallVector instead of an ArrayRef
of basic blocks in preservesResourceLen() (MachineCombiner.cpp).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215151 91177308-0d34-0410-b5e6-96231b3b80d8
BranchFolderPass was not correctly setting the basic block branch weights when
tail-merging created or merged blocks. This patch recomputes the weights of
tail-merged blocks using the following formula:
branch_weight(merged block to successor j) =
  sum(block_frequency(bb) * branch_probability(bb -> j))
where bb ranges over the set of merged blocks.
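For illustration only (a hand-written sketch, not code from the patch), the
recomputation boils down to a weighted sum over the merged blocks:

  #include <cstdint>
  #include <vector>

  // Simplified stand-ins for MachineBasicBlock, BlockFrequency and
  // BranchProbability.
  struct MergedBlock {
    uint64_t Frequency;                  // block_frequency(bb)
    std::vector<double> SuccProbability; // branch_probability(bb -> j)
  };

  // branch_weight(merged block -> successor j) =
  //   sum over merged blocks bb of Frequency(bb) * SuccProbability(bb, j)
  uint64_t mergedBranchWeight(const std::vector<MergedBlock> &Merged,
                              unsigned j) {
    double Weight = 0.0;
    for (const MergedBlock &BB : Merged)
      Weight += static_cast<double>(BB.Frequency) * BB.SuccProbability[j];
    return static_cast<uint64_t>(Weight);
  }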
<rdar://problem/16256423>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215135 91177308-0d34-0410-b5e6-96231b3b80d8
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.
Thanks to Lang Hames for making MCJIT a good replacement!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215111 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r214761.
Revert while Reid investigates & provides a reproduction for an
assertion failure for this on Windows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214999 91177308-0d34-0410-b5e6-96231b3b80d8
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where getting
a subtarget will require passing in a MachineFunction/Function as
well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214988 91177308-0d34-0410-b5e6-96231b3b80d8
In r210492 the logic of calculateDbgValueHistory was changed to end
register variable live ranges at the end of an MBB depending on
whether the register was clobbered by the function body.
This requires an initial scan of all the operands of the function
to collect all clobbered registers. In a second pass over all
instructions, we compare this set with the set of clobbered
registers for the current MachineInstruction. This modification
incurred a compilation time regression on some benchmarks: the
debug info emission phase takes ~10% more time.
While a small performance hit is unavoidable due to the required
initial scan, we can improve the situation by avoiding the creation
of too many temporary sets and instead using lambdas to work directly
on the result of the initial scan.
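A minimal sketch of the pattern (hypothetical names, not the actual
calculateDbgValueHistory code): capture the result of the initial scan once
and query it through a lambda instead of materializing a temporary set per
instruction:

  #include <set>

  using Register = unsigned;

  // Stand-in for the single initial scan over the whole function; the real
  // code walks every MachineOperand once.
  static std::set<Register> collectClobberedRegs() { return {3, 7, 12}; }

  void endRangesAtBlockBoundaries() {
    const std::set<Register> ClobberedByFunction = collectClobberedRegs();

    // Query the precomputed result directly; no fresh set per MachineInstr.
    auto IsClobbered = [&](Register Reg) {
      return ClobberedByFunction.count(Reg) != 0;
    };

    // for each instruction: if (IsClobbered(Reg)) ... end the live range ...
    (void)IsClobbered;
  }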
Fixes <rdar://problem/17884104>
Patch by Frederic Riss!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214987 91177308-0d34-0410-b5e6-96231b3b80d8
The handling of the epilogue is best expressed as an early exit and
there is no reason to look for register defs in DbgValue MIs.
Patch by Frederic Riss!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214986 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise we can end up with an argument frame size that is not a
multiple of stack slot size, which is very awkward.
This fixes PR20547, which was a bug in x86_64 Sys V vararg handling.
However, it's much easier to test this with x86 callee-cleanup
functions, which previously ended in "retl $6" instead of "retl $8".
This does affect behavior of all backends, but it presumably fixes the
same bug in all of them.
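For illustration (a sketch with made-up helper names, not the patch itself),
the fix amounts to rounding the popped byte count up to the stack slot size,
which is why the callee-cleanup return changes from "retl $6" to "retl $8":

  #include <cassert>
  #include <cstdint>

  // Round BytesToPop up to a multiple of the stack slot size (4 on x86-32).
  uint64_t alignedBytesToPop(uint64_t BytesToPop, uint64_t SlotSize) {
    assert(SlotSize != 0 && (SlotSize & (SlotSize - 1)) == 0 &&
           "slot size must be a power of two");
    return (BytesToPop + SlotSize - 1) & ~(SlotSize - 1);
  }

  // alignedBytesToPop(6, 4) == 8  ->  "retl $8" instead of "retl $6"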
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214980 91177308-0d34-0410-b5e6-96231b3b80d8
This patch addresses 2 FIXME comments that I added to CriticalAntiDepBreaker while fixing PR20020.
Initialize an MCSubRegIterator and an MCRegAliasIterator to include the self reg.
Assuming that works as advertised, there should be no functional difference with this patch, just less code.
Also, remove the associated asserts - we're setting those values just before, so the asserts don't do anything meaningful.
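A rough sketch of the simplification (isolated from the surrounding
CriticalAntiDepBreaker code and assuming the LLVM headers are available; the
function and variable names here are made up):

  #include "llvm/MC/MCRegisterInfo.h"
  #include <vector>

  void markRegAndSubRegs(const llvm::MCRegisterInfo &MCRI, unsigned Reg,
                         std::vector<unsigned> &DefIndices, unsigned Count) {
    // One loop that includes the register itself, instead of a separate
    // assignment for Reg followed by a loop over its sub-registers.
    for (llvm::MCSubRegIterator SubRegs(Reg, &MCRI, /*IncludeSelf=*/true);
         SubRegs.isValid(); ++SubRegs)
      DefIndices[*SubRegs] = Count;
  }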
Differential Revision: http://reviews.llvm.org/D4566
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214973 91177308-0d34-0410-b5e6-96231b3b80d8
This was coming up in weird debug info that had variables (and hence
debug_locs) but was in GMLT mode (because it was missing the 13th field
of the compile_unit metadata), so no ranges were constructed. We should
always have at least one range for any CU with a debug_loc in it -
because the range should cover the debug_loc.
The assertion just ensures that the "!= 1" range case inside the
subsequent loop doesn't get entered for the case where there are no
ranges at all, which should never reach here in the first place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214939 91177308-0d34-0410-b5e6-96231b3b80d8
This simplifies construction and usage while making the data structure
smaller. It was a holdover from the days when we didn't have a separate
DebugLocList and all we had was a flat list of DebugLocEntries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214933 91177308-0d34-0410-b5e6-96231b3b80d8
Allow vector fabs operations on bitcasted constant integer values to be optimized
in the same way that we already optimize scalar fabs.
So for code like this:
%bitcast = bitcast i64 18446744069414584320 to <2 x float> ; 0xFFFF_FFFF_0000_0000
%fabs = call <2 x float> @llvm.fabs.v2f32(<2 x float> %bitcast)
%ret = bitcast <2 x float> %fabs to i64
Instead of generating something like this:
movabsq (constant pool loadi of mask for sign bits)
vmovq (move from integer register to vector/fp register)
vandps (mask off sign bits)
vmovq (move vector/fp register back to integer return register)
We should generate:
mov (put constant value in return register)
I have also removed a redundant clause in the first 'if' statement:
N0.getOperand(0).getValueType().isInteger()
is the same thing as:
IntVT.isInteger()
Testcases for x86 and ARM added to existing files that deal with vector fabs.
One existing testcase for x86 removed because it is no longer ideal.
For more background, please see:
http://reviews.llvm.org/D4770
And:
http://llvm.org/bugs/show_bug.cgi?id=20354
Differential Revision: http://reviews.llvm.org/D4785
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214892 91177308-0d34-0410-b5e6-96231b3b80d8
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.
Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214838 91177308-0d34-0410-b5e6-96231b3b80d8
This code is completely wrong. It is also dead, as if it were to *ever*
run, it would crash. Fortunately, after my work to the combiner, it is
at least *possible* to reach the code, and llvm-stress has found a test
case. Thanks to Patrick for reporting.
It would be really good if anyone who remembers how this code works and
what it was intended to do could add some more obvious test coverage
instead of my completely contrived and reduced test case. My test case
was so brittle I left a bread crumb comment in it to help the next
person to stumble on it and not know what it was actually testing for.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214785 91177308-0d34-0410-b5e6-96231b3b80d8
Originally reverted in r213432 with flaky failures on an ASan self-host
build. After reduction it seems to be the same issue fixed in r213805
(ArgPromo + DebugInfo: Handle updating debug info over multiple
applications of argument promotion) and r213952 (by having
LiveDebugVariables strip dbg_value intrinsics in functions that are not
described by debug info). Though I cannot explain why this failure was
flaky...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214761 91177308-0d34-0410-b5e6-96231b3b80d8
combines) until they are legal.
Doing it the old way could, when the stars align *just* right, cause
a node to get into the combine set prior to being legalized. Then, when
the same node showed up as an operand to another node later on (but not
so much later on that it had been deleted as dead) we would fail to add
it back to the worklist thinking it had already been combined. This
would in turn cause it to not be legalized. Fortunately, we can also
walk the operands looking for uncombined (and thus potentially
un-legalized) nodes late. It will still ensure that we walk all operands
of all nodes and send all of them through both the legalizer without
changes and the combiner at least once. (Which was the original goal of
this).
I have a test case for this bug, but it is terribly brittle. For
example, it will stop finding the bug the moment I enable the new
shuffle lowering. I don't yet have any test case that reliably exercises
this bug, and it isn't clear that it will be possible to craft one. It
is entirely possible that with the new shuffle lowering the two forms of
doing this are precisely equivalent. That doesn't mean we shouldn't take
the more conservative approach of insisting on things in the combined
set having survived the legalizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214673 91177308-0d34-0410-b5e6-96231b3b80d8
GCC 4.8.2 objects to the tautological condition in the assert as the unsigned
value is guaranteed to be >= 0. Simplify the assertion by dropping the
tautological condition.
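A trivial illustration of the kind of assert involved (made-up names):

  #include <cassert>

  void example(unsigned Idx, unsigned Size) {
    // GCC 4.8.2: comparison of unsigned expression >= 0 is always true.
    // assert(Idx >= 0 && Idx < Size);
    assert(Idx < Size); // the ">= 0" half is tautological for unsigned
  }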
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214671 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation.
This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon.
There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214670 91177308-0d34-0410-b5e6-96231b3b80d8
sequence - target independent framework
When the DAGCombiner selects instruction sequences
it could increase the critical path or resource length.
For example, on arm64 there are multiply-accumulate instructions (madd,
msub). If e.g. the equivalent multiply-add sequence is not on the
critical path it makes sense to select it instead of the combined,
single accumulate instruction (madd/msub). The reason is that the
conversion from add+mul to the madd could lengthen the critical path
by the latency of the multiply.
But the DAGCombiner would always combine and select the madd/msub
instruction.
This patch uses machine trace metrics to estimate critical path length
and resource length of an original instruction sequence vs a combined
instruction sequence and picks the faster code based on its estimates.
This patch only commits the target independent framework that evaluates
and selects code sequences. The machine instruction combiner is turned
off for all targets and expected to evolve over time by gradually
handling DAGCombiner patterns in the target-specific code.
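A hand-written sketch of the selection rule (hypothetical helper names; the
real implementation derives these estimates from machine trace metrics):

  // Keep the combined sequence (e.g. madd) only if the estimates say it does
  // not lengthen the critical path and does not increase the resource length.
  struct SequenceEstimate {
    unsigned CriticalPathLen;
    unsigned ResourceLen;
  };

  bool shouldUseCombinedSequence(const SequenceEstimate &Original,
                                 const SequenceEstimate &Combined) {
    return Combined.CriticalPathLen <= Original.CriticalPathLen &&
           Combined.ResourceLen <= Original.ResourceLen;
  }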
This framework lays the groundwork for fixing
rdar://16319955
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214666 91177308-0d34-0410-b5e6-96231b3b80d8
The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214634 91177308-0d34-0410-b5e6-96231b3b80d8
so using a single helper which adds operands back onto the worklist.
Several places didn't rigorously do this but a couple already did.
Factoring them together and doing it rigorously is important to delete
things recursively early on in the combiner and get a chance to see
accurate hasOneUse values. While no existing test cases change, an
upcoming patch to add DAG combining logic for PSHUFB requires this to
work correctly.
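A minimal sketch of the helper pattern (simplified types, not the actual
DAGCombiner code):

  #include <vector>

  struct Node {
    std::vector<Node *> Operands;
  };

  // Put every operand of N back onto the worklist before N itself is deleted,
  // so the operands are revisited with up-to-date use information.
  void addOperandsToWorklist(Node *N, std::vector<Node *> &Worklist) {
    for (Node *Op : N->Operands)
      Worklist.push_back(Op);
  }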
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214623 91177308-0d34-0410-b5e6-96231b3b80d8
during DAGCombine in certain circumstances. Unfortunately, the circumstances required
to trigger the issue seem to require a pretty specific interaction of DAGCombines,
and I haven't been able to find a testcase that reproduces on X86, ARM, or AArch64.
The functionality added here is replicated in essentially every other DAG combine,
so it seems pretty obviously correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214622 91177308-0d34-0410-b5e6-96231b3b80d8
variables (for example, by-value struct arguments passed in registers, or
large integer values split across several smaller registers).
On the IR level, this adds a new type of complex address operation OpPiece
to DIVariable that describes size and offset of a variable fragment.
On the DWARF emitter level, all pieces describing the same variable are
collected, sorted and emitted as DWARF expressions using the DW_OP_piece
and DW_OP_bit_piece operators.
http://reviews.llvm.org/D3373
rdar://problem/15928306
What this patch doesn't do / Future work:
- This patch only adds the backend machinery to make this work; patches
that change SROA and SelectionDAG's type legalizer to actually create
such debug info will follow. (http://reviews.llvm.org/D2680)
- Making the DIVariable complex expressions into an argument of dbg.value
will reduce the memory footprint of the debug metadata.
- The sorting/uniquing of pieces should be moved into DebugLocEntry,
to facilitate the merging of multi-piece entries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214576 91177308-0d34-0410-b5e6-96231b3b80d8
formulation of the node, which isn't really the desired behavior from
within the combiner or legalizer, but is necessary within ISel. I've
added a hopefully helpful comment and fixed the only two places where
this took place.
Yet another step toward the combiner and legalizer not needing to use
update listeners with virtual calls to manage the worklists behind
legalization and combining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214574 91177308-0d34-0410-b5e6-96231b3b80d8
This lifts the (very few) places the legalizer would delete dead nodes
into the outer loop around the legalizer. This is significantly simpler
because it doesn't require the legalizer itself to manage the iterator
validity, and it doesn't require the legalizer to be a DAG update
listener in order to remove things from the legalized set. It also makes
the interface much less contrived for the case of the legalizer running
inside the last phase of DAG combining.
I'm working on centralizing the deletion of nodes during both legalizing
and combining as much as possible. My hope is to remove the need for DAG
update listeners from the combiner next, which would remove a costly
virtual dispatch chain on every deletion. This in turn should allow us
to more aggressively delete DAG nodes during combining which will in
turn allow us to combine more aggressively by exposing the actual nodes
which have single users to the combine phases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214546 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds code to explicitly mark a function which requires runtime stack realignment as not having a fixed frame size in the StackMap section. As it happens, this is not actually a functional change. The size that would be reported without the check is also "-1", but as far as I can tell, that's an accident. The code change makes this explicit.
Note: There's a separate bug in handling of stackmaps and patchpoints in functions which need dynamic frame realignment. The current code assumes that offsets can be calculated from RBP, but realigned frames must use RSP. (There's a variable gap between RBP and the spill slots.) This change set does not address that issue.
Reviewers: atrick, ributzka
Differential Revision: http://reviews.llvm.org/D4572
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214534 91177308-0d34-0410-b5e6-96231b3b80d8
Altivec vector loads on PowerPC have an interesting property: They always load
from an aligned address (by rounding down the address actually provided if
necessary). In order to generate an actual unaligned load, you can generate two
load instructions, one with the original address, one offset by one vector
length, and use a special permutation to extract the bytes desired.
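For readers unfamiliar with the idiom, the classic two-load-plus-permute
sequence looks roughly like this with the standard Altivec intrinsics (an
illustration only, not part of this commit; the patch builds the equivalent
nodes in the SelectionDAG):

  #include <altivec.h>

  // Load 16 bytes starting at a potentially unaligned address p
  // (compile with -maltivec).
  vector float load_unaligned(const float *p) {
    vector float lo = vec_ld(0, p);             // loads from the rounded-down address
    vector float hi = vec_ld(16, p);            // loads the next vector
    vector unsigned char mask = vec_lvsl(0, p); // permute control from p's low bits
    return vec_perm(lo, hi, mask);              // extract the 16 desired bytes
  }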
When this was originally implemented, I generated these two loads using regular
ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with
this:
The alignment of a load does not contribute to its identity, and SDNodes
are uniqued. So, imagine that we have some unaligned load, L1. The routine
will create two loads, L1(aligned) and (L1+16)(aligned).
Further imagine that there had already existed a load (L1+16)(unaligned) with
the same chain operand as the load L1. When (L1+16)(aligned) is created as part
of the lowering of L1, this load *is* also the (L1+16)(unaligned) node, just
now marked as aligned (because the new alignment overwrites the old). But the
original users of (L1+16)(unaligned) now get the data intended for the
permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists
to get its own permutation-based expansion. This was PR19991.
A second potential problem has to do with the MMOs on these loads, which can be
used by AA during instruction scheduling to break chain-based dependencies. If
the new "aligned" loads get the MMO from the original unaligned load, this does
not represent the fact that it will load data from below the original address.
Normally, this would not matter, but this load might be combined with another
load pair for a previous vector, and then the dependency on the otherwise-
ignored lower bytes can matter.
To fix both problems, instead of generating the necessary loads using regular
ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are
provided with MMOs with a conservative address range.
Unfortunately, I no longer have a failing test case (since PR19991 was
reported, other changes in CodeGen have forced this bug back into hiding
again). Nevertheless, this should fix the underlying problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214481 91177308-0d34-0410-b5e6-96231b3b80d8
Currently when DAGCombine converts loads feeding a switch into a switch of
addresses feeding a load the new load inherits the isInvariant flag of the left
side. This is incorrect since invariant loads can be reordered in cases where it
is illegal to reorder normal loads.
This patch adds an isInvariant parameter to getExtLoad() and updates all call
sites to pass in the data if they have it or false if they don't. It also
changes the DAGCombine to use that data to make the right decision when
creating the new load.
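Conceptually (a simplified sketch, not the actual getExtLoad change), the new
load may only keep the flag when both original loads carry it:

  struct LoadInfo {
    bool IsInvariant;
  };

  // When two loads are replaced by a single load of a selected address, the
  // new load can be marked invariant only if both original loads were.
  bool combinedIsInvariant(const LoadInfo &LHS, const LoadInfo &RHS) {
    return LHS.IsInvariant && RHS.IsInvariant;
  }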
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214449 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow-up to the activity in the bug at
http://llvm.org/bugs/show_bug.cgi?id=18663 . The underlying issue has
to do with how the KILL pseudo-instruction is handled. I defer to
Hal/Jakob/Uli for additional details and background.
This will disable the (bad?) assert, add an associated fixme comment,
and add a pair of tests.
The code change and the pr18663-2.ll test are copied from the referenced
bug. That test does not immediately fail in my environment, but I have
added the pr18663.ll test which does.
(Comment from Hal)
To provide everyone else with some context, this assert was not bad when
it was written. At that time, we only generated KILL pseudo instructions
around subregister copies. This logic, unfortunately, had its own problems.
In r199797, the relevant logic in MachineCopyPropagation was replaced to
generate KILLs for other kinds of copies too. This change in semantics broke
this now-problematic assumption in AggressiveAntiDepBreaker. The
AggressiveAntiDepBreaker really needs a proper cleanup to deal with the
change, but removing the assert (which just allows the function to return
false) is a safe conservative behavior, and should do for the time being.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214429 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a mistake where I accidentally dropped the upper 32 bits of a
64-bit pointer during FastISel lowering of the patchpoint intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214367 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombine may choose to rewrite graphs where two loads feed a select into
graphs where a select of two addresses feeds a load. While it sanity-checks the
loads to make sure they are broadly equivalent it currently just uses the
alignment restriction of the left node. In cases where the right node has a
stronger alignment requirement this may lead to bad codegen, such as generating
an aligned load where an unaligned load is required. This patch makes the
combine generate a load with an alignment that is the same as whichever is more
restrictive of the two alignments.
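In other words (a simplified sketch; the actual change lives inside the
DAGCombiner), the new load has to assume the weaker of the two guarantees:

  #include <algorithm>

  // The combined load may only assume the alignment that both original loads
  // can guarantee, i.e. the smaller of the two alignment values.
  unsigned combinedAlignment(unsigned LHSAlign, unsigned RHSAlign) {
    return std::min(LHSAlign, RHSAlign);
  }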
Tests included.
rdar://17762530
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214322 91177308-0d34-0410-b5e6-96231b3b80d8
the jump instruction table pass. First, the verifier is already built
into all the tools. The test case is adapted to just run llvm-as
demonstrating that we still catch the broken module. Second, the
verifier is *extremely* slow. This was responsible for very significant
compile time regressions.
If you have deployed a Clang binary anywhere from r210280 to this
commit, you really want to re-deploy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214287 91177308-0d34-0410-b5e6-96231b3b80d8