allowing the memcpy to be eliminated.
Unfortunately, the alignment requirements on byval arguments without an
explicit alignment are very weak and impossible to predict in the
mid-level optimizer, so this doesn't kick in much with current
frontends. The fix is to change clang to set an explicit alignment on all
byval arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119916 91177308-0d34-0410-b5e6-96231b3b80d8
so don't claim they are. They are allocated using DAG.getNode, so attempts
to access MemSDNode fields result in reading off the end of the allocated
memory. This fixes crashes with "llc -debug" due to debug code trying to
print MemSDNode fields for these barrier nodes (since the crashes are not
deterministic, use valgrind to see this). Add some nasty checking to try
to catch this kind of thing in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119901 91177308-0d34-0410-b5e6-96231b3b80d8
if all the operands of the PHI are equivalent. This allows CodeGenPrepare to undo
unprofitable PRE transforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119853 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombine from making an illegal transformation of bitcast of a scalar to a
vector into a scalar_to_vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119819 91177308-0d34-0410-b5e6-96231b3b80d8
This is a sorted interval map data structure for small keys and values with
automatic coalescing and bidirectional iteration over coalesced intervals.
Except for coalescing intervals, it provides similar functionality to std::map.
It is however much more compact for small keys and values, and hopefully faster
too.
The container object itself can hold the first few intervals without any
allocations, then it switches to a cache conscious B+-tree representation. A
recycling allocator can be shared between many containers, even between
containers holding different types.
The IntervalMap is initially intended to be used with SlotIndex intervals for:
- Backing store for LiveIntervalUnion that is smaller and faster than std::set.
- Backing store for LiveInterval with less overhead than std::vector for typical
intervals and O(N log N) merging of large intervals. 99% of virtual registers
need 4 entries or less and would benefit from the small object optimization.
- Backing store for LiveDebugVariable which doesn't exist yet, but will track
debug variables during register allocation.
This is a work in progress. Missing items are:
- Performance metrics.
- erase().
- insert() shrinkage.
- clear().
- More performance metrics.
- Simplification and detemplatization.
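A rough usage sketch based on the description above (the header path, template
parameters, and accessor names are assumptions, not verified against the new
header):

  #include "llvm/ADT/IntervalMap.h"

  void Example() {
    typedef llvm::IntervalMap<unsigned, unsigned> UUMap;
    UUMap::Allocator Alloc;   // one recycling allocator can back many maps
    UUMap Map(Alloc);
    Map.insert(10, 19, 7);    // keys [10;19] map to the value 7
    Map.insert(20, 29, 7);    // adjacent and equal-valued: coalesces with the above
    for (UUMap::iterator I = Map.begin(); I != Map.end(); ++I) {
      // I.start(), I.stop() give the coalesced range, I.value() its value.
      (void)I.start();
      (void)I.value();
    }
  }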
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119787 91177308-0d34-0410-b5e6-96231b3b80d8
MCStreamer instead of just MCObjectStreamer. Address changes cannot
be as efficient since we have to use DW_LNE_set_address, but at least
most of the logic is shared.
This will be used so that, with CodeGen still using EmitDwarfLocDirective,
llvm-gcc is able to produce debug_line sections without needing an
assembler that supports .loc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119777 91177308-0d34-0410-b5e6-96231b3b80d8
lr" instruction cannot be tested just yet. It requires matching a "condition
code", but adding one of those makes things go south quickly...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119774 91177308-0d34-0410-b5e6-96231b3b80d8
This is a sorted interval map data structure for small keys and values with
automatic coalescing and bidirectional iteration over coalesced intervals.
Except for coalescing intervals, it provides similar functionality to std::map.
It is however much more compact for small keys and values, and hopefully faster
too.
The container object itself can hold the first few intervals without any
allocations, then it switches to a cache conscious B+-tree representation. A
recycling allocator can be shared between many containers, even between
containers holding different types.
The IntervalMap is initially intended to be used with SlotIndex intervals for:
- Backing store for LiveIntervalUnion that is smaller and faster than std::set.
- Backing store for LiveInterval with less overhead than std::vector for typical
intervals and O(N log N) merging of large intervals. 99% of virtual registers
need 4 entries or less and would benefit from the small object optimization.
- Backing store for LiveDebugVariable which doesn't exist yet, but will track
debug variables during register allocation.
This is a work in progress. Missing items are:
- Performance metrics.
- erase().
- insert() shrinkage.
- clear().
- More performance metrics.
- Simplification and detemplatization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119772 91177308-0d34-0410-b5e6-96231b3b80d8
Note: lo16AllZero remains in ARMInstrInfo.td - It can be factored out when Thumb movt is repaired.
Existing tests cover this update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119760 91177308-0d34-0410-b5e6-96231b3b80d8
This function was being called from two different places for completely
unrelated reasons. During type legalization, it was called to expand 64-bit
shift operations. During operation legalization, it was called to handle
Neon vector shifts. The vector shift code was not written to check for
illegal types, since it was assumed to be only called after type legalization.
Fixed this by splitting off the 64-bit shift expansion into a separate
function. I don't have a particular testcase for this; I just noticed it
by inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119738 91177308-0d34-0410-b5e6-96231b3b80d8
if the extension types were not the same. The result was that if you
fed a select with sext and zext loads, as in the testcase, then it
would get turned into a zext (or sext) of the select, which is wrong
in the cases when it should have been an sext (resp. zext). Reported
and diagnosed by Sebastien Deldon.
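A hypothetical C-level illustration of the situation (not the original
testcase):

  // The two i16 loads get different extensions, so hoisting a single
  // extension kind above the select would miscompile one of the arms.
  int pick(const short *s, const unsigned short *u, bool c) {
    int a = *s;        // load + sext i16 -> i32
    int b = *u;        // load + zext i16 -> i32
    return c ? a : b;  // select of a sext result and a zext result
  }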
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119728 91177308-0d34-0410-b5e6-96231b3b80d8
preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class. Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form. Fixes PR8622.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119727 91177308-0d34-0410-b5e6-96231b3b80d8
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.
Adjust all testcases accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed. This led to really bad performance on tall, narrow CFGs.
We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be
quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.
This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.
The one downside to this approach is that we can no longer use GVN to perform simple conditional propagation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP.
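A rough sketch of the leader lookup described above (the struct and function
names and the header path are assumptions, not the committed code):

  #include "llvm/IR/Dominators.h"

  // Each value number maps to a short, bump-allocated singly linked list of
  // candidate leaders; pick one whose defining block dominates the block
  // where the value is needed. The dominance query is cheap because
  // DominatorTree caches DFS-in/out numbers for every node.
  struct LeaderEntry {
    llvm::Value *Val;
    llvm::BasicBlock *BB;
    LeaderEntry *Next;
  };

  static llvm::Value *findDominatingLeader(const LeaderEntry *Head,
                                           llvm::BasicBlock *UseBB,
                                           const llvm::DominatorTree &DT) {
    for (const LeaderEntry *E = Head; E; E = E->Next)
      if (DT.dominates(E->BB, UseBB))
        return E->Val;   // few duplicates per value, so this scan is short
    return 0;            // no leader is available at this point
  }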
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119714 91177308-0d34-0410-b5e6-96231b3b80d8
instruction. Any that may be expanded otherwise by MC lowering should
override this value. rdar://8683274
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119713 91177308-0d34-0410-b5e6-96231b3b80d8
refusing to optimize two memcpy's like this:
copy A <- B
copy C <- A
if it couldn't prove that noalias(B,C). We can eliminate
the copy by producing a memmove instead of memcpy.
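For illustration only (this shows the shape of the transform, not the pass's
code):

  #include <cstring>

  // Given "memcpy(A, B, N); memcpy(C, A, N);", the second copy can read
  // straight from B. If B and C cannot be proven noalias, memcpy(C, B, N)
  // would be undefined on overlap, but memmove is still correct:
  void forward(char *A, const char *B, char *C, std::size_t N) {
    std::memcpy(A, B, N);    // copy A <- B
    std::memmove(C, B, N);   // copy C <- B; the dependence on A is gone
  }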
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119694 91177308-0d34-0410-b5e6-96231b3b80d8
there is no need to check to see if the source and dest of a memcpy are noalias,
behavior is undefined if not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119691 91177308-0d34-0410-b5e6-96231b3b80d8
if it is passed as a byval argument. The byval argument will just be a
read, so it is safe to read from the original global instead. This allows
us to promote away the %agg.tmp alloca in PR8582.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119686 91177308-0d34-0410-b5e6-96231b3b80d8
sahf movl 344(%rdi),%r14d
we used to produce:
t.s:2:1: error: unexpected token in argument list
^
we now produce:
t.s:1:11: error: unexpected token in argument list
sahf movl 344(%rdi),%r14d
^
rdar://8581401
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119676 91177308-0d34-0410-b5e6-96231b3b80d8
and testing is easier. A good example is the unknown-location.ll test that
now can just look for ".loc 1 0 0". We also don't use a DW_LNE_set_address for
every address change anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119613 91177308-0d34-0410-b5e6-96231b3b80d8
memset; we may need it to decide between MOVAPS and MOVUPS
later. Adjust a test that was looking for wrong code.
PR 3866 / 8675131.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119605 91177308-0d34-0410-b5e6-96231b3b80d8
Some of these maps may merge in the future, but for now it's convenient to have
a utility function for them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119587 91177308-0d34-0410-b5e6-96231b3b80d8
memoize the results. This improves compile time for code with highly complex
expressions that get queried many times.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119584 91177308-0d34-0410-b5e6-96231b3b80d8
It is generally not sufficient to check if the starting offset is in range
of the maximum offset that can be efficiently used for the target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119565 91177308-0d34-0410-b5e6-96231b3b80d8
This makes it more clear that the symbol is an internal, compiler-generated
name and gives a little more description about its contents.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119564 91177308-0d34-0410-b5e6-96231b3b80d8
It was mistakenly looking at the pointer type when checking for the size of
global variables. This is a partial fix for Radar 8673120.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119563 91177308-0d34-0410-b5e6-96231b3b80d8
needs to be checked that this won't break LCSSA form.
Change the existing checking method to a more direct one:
rather than seeing if all predecessors belong to the loop,
check that the replacing value is either not in any loop or
is in a loop that contains the phi node.
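A minimal sketch of that check (the function signature is an assumption, not
the committed code):

  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/IR/Instructions.h"

  // Replacing the phi P with value V keeps LCSSA form if V is not defined
  // inside any loop, or if the loop defining V contains P's block.
  static bool replacementPreservesLCSSA(llvm::PHINode *P, llvm::Value *V,
                                        llvm::LoopInfo &LI) {
    llvm::Instruction *Def = llvm::dyn_cast<llvm::Instruction>(V);
    if (!Def)
      return true;                 // constants and arguments are in no loop
    llvm::Loop *DefLoop = LI.getLoopFor(Def->getParent());
    if (!DefLoop)
      return true;                 // V is not defined in any loop
    return DefLoop->contains(P->getParent());
  }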
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119556 91177308-0d34-0410-b5e6-96231b3b80d8
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.
Instead, let the peephole optimization pass recognize registers that are defined
by immediates and let the ARM target hook fold the immediates in.
Other changes include: 1) Do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out; machine sink would have sunk the computation, and that ends up pessimizing
the code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.
2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.
rdar://8663787, rdar://8241368
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
instructions out of InstCombine and into InstructionSimplify. While
there, introduce an m_AllOnes pattern to simplify matching with integers
and vectors with all bits equal to one.
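A brief usage sketch of the new matcher (the header path and surrounding helper
are assumptions):

  #include "llvm/IR/PatternMatch.h"

  using namespace llvm;
  using namespace llvm::PatternMatch;

  // Recognize "xor X, -1" (a bitwise not); m_AllOnes matches both a scalar
  // integer with all bits set and a vector whose elements are all ones.
  static bool isBitwiseNot(Value *V, Value *&X) {
    return match(V, m_Xor(m_Value(X), m_AllOnes()));
  }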
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119536 91177308-0d34-0410-b5e6-96231b3b80d8
hasConstantValue. I was leery of using SimplifyInstruction
while the IR was still in a half-baked state, which is the
reason for delaying the simplification until the IR is fully
cooked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119494 91177308-0d34-0410-b5e6-96231b3b80d8
phi node itself if it occurs in an unreachable basic block. Protect
against this. Hopefully this will fix some more buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119493 91177308-0d34-0410-b5e6-96231b3b80d8
simplified to itself (this can only happen in unreachable blocks).
Change it to return null instead. Hopefully this will fix some
buildbot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119490 91177308-0d34-0410-b5e6-96231b3b80d8
SrcMgrDiagHandler, we can improve clang diagnostics for inline asm:
instead of reporting them on the source line of the asm statement itself,
we can report them on the correct line, wherever the string literal came
from. For something like this:
void foo() {
  asm("push %rax\n"
      ".code32\n");
}
we used to get this: (note that the line in t.c isn't helpful)
t.c:4:7: error: warning: ignoring directive for now
asm("push %rax\n"
^
<inline asm>:2:1: note: instantiated into assembly here
.code32
^
now we get:
t.c:5:8: error: warning: ignoring directive for now
".code32\n"
^
<inline asm>:2:1: note: instantiated into assembly here
.code32
^
Note that we're pointing to line 5 properly now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119488 91177308-0d34-0410-b5e6-96231b3b80d8
cookie argument to the SourceMgr diagnostic stuff. This cleanly separates
LLVMContext's inlineasm handler from the sourcemgr error handling
definition, increasing type safety and cleaning things up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119486 91177308-0d34-0410-b5e6-96231b3b80d8
instructions have to distinguish between lists of single- and double-precision
registers in order for the ASM matcher to do a proper job. In all other
respects, a list of single- or double-precision registers is the same as a list
of GPR registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119460 91177308-0d34-0410-b5e6-96231b3b80d8
class, uses DominatorTree which is an analysis. This change moves all of
the tricky hasConstantValue logic to SimplifyInstruction, and replaces it
with a very simple literal implementation. I already taught users of
hasConstantValue that need tricky stuff to use SimplifyInstruction instead.
I didn't update InlineFunction because the IR looks like it might be in a
funky state at the point it calls hasConstantValue, which makes calling
SimplifyInstruction dangerous since it can in theory do a lot of tricky
reasoning. This may be a pessimization, for example in the case where
all phi node operands are either undef or a fixed constant.
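For reference, a sketch of the general shape of such a "simple literal
implementation" (not necessarily the exact committed code):

  #include "llvm/IR/Instructions.h"

  // A phi "has a constant value" V when every incoming value is either V or
  // the phi itself.
  static llvm::Value *simpleHasConstantValue(llvm::PHINode *PN) {
    llvm::Value *V = PN->getIncomingValue(0);
    for (unsigned i = 1, e = PN->getNumIncomingValues(); i != e; ++i) {
      llvm::Value *In = PN->getIncomingValue(i);
      if (In == V || In == PN)
        continue;                  // duplicates and self-references are fine
      if (V != PN)
        return 0;                  // two distinct non-self incoming values
      V = In;                      // the first incoming value was the phi itself
    }
    return V == PN ? 0 : V;        // every operand was the phi: no usable value
  }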
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119459 91177308-0d34-0410-b5e6-96231b3b80d8
systematically, CollapsePhi will always return null here. Note
that CollapsePhi did an extra check, isSafeReplacement, which
the SimplifyInstruction logic does not do. I think that check
was bogus - I guess we will soon find out! (It was originally
added in commit 41998 without a testcase).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119456 91177308-0d34-0410-b5e6-96231b3b80d8
"getRegisterListOpValue" logic. If the registers are double or single precision,
the value returned is suitable for VLDM/VSTM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119435 91177308-0d34-0410-b5e6-96231b3b80d8
a different pass, the complicated interaction between cmov expansion
and fast isel is no longer a concern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119400 91177308-0d34-0410-b5e6-96231b3b80d8
Next: Add support for the !HasDotLocAndDotFile case to the MCAsmStreamer
and then switch codegen to use it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119384 91177308-0d34-0410-b5e6-96231b3b80d8
easier to debug, and to avoid complications when the CFG changes
in the middle of the instruction selection process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119382 91177308-0d34-0410-b5e6-96231b3b80d8
Always spill the full representative register at any point where any subregister
is live.
This fixes PR8620 which caused the old logic to get confused and not spill
anything at all.
The fundamental problem here is that the coalescer is too aggressive about
physical register coalescing. It sometimes makes it impossible to allocate
registers without these emergency spills.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119375 91177308-0d34-0410-b5e6-96231b3b80d8
The system APIs will be shifted over to returning an error_code, with other
return values passed as out parameters to the function.
Code that needs to check error conditions will use the errc enum values which
are the same as the posix_errno defines (EBADF, E2BIG, etc...), and are
compatible with the error codes in WinError.h due to some magic in system_error.
An example would be:
if (error_code ec = KillEvil("Java")) { // error_code can be converted to bool.
  handle_error(ec);
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119360 91177308-0d34-0410-b5e6-96231b3b80d8
over a phi node by applying it to each operand may be wrong if the
operation and the phi node are mutually interdependent (the testcase
has a simple example of this). So only do this transform if it would
be correct to perform the operation in each predecessor of the block
containing the phi, i.e. if the other operands all dominate the phi.
This should fix the FFMPEG snow.c regression reported by İsmail Dönmez.
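A small sketch of that guard (names and signature are assumptions, not the
committed code):

  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/Instructions.h"

  // Threading an operation over a phi is only safe when the other operand is
  // available in every predecessor, i.e. when it dominates the phi itself.
  static bool otherOperandDominatesPhi(llvm::Value *Other, llvm::PHINode *PN,
                                       const llvm::DominatorTree &DT) {
    llvm::Instruction *I = llvm::dyn_cast<llvm::Instruction>(Other);
    if (!I)
      return true;                 // constants and arguments dominate everything
    return DT.dominates(I, PN);
  }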
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119347 91177308-0d34-0410-b5e6-96231b3b80d8