We do use a very small set of physical registers, so account for
them in the virtual register encoding between MachineInstr and MC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187799 91177308-0d34-0410-b5e6-96231b3b80d8
This change converts the NVPTX target to use the MC infrastructure
instead of directly emitting MachineInstr instances. This brings
the target more up-to-date with LLVM TOT, and should fix PR15175
and PR15958 (libNVPTXInstPrinter is empty) as a side-effect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187798 91177308-0d34-0410-b5e6-96231b3b80d8
fix for: Bug 16694 - ExecutionEngine/test-interp-vec-loadstore.ll failing on powerpc-darwin8 (http://llvm.org/bugs/show_bug.cgi?id=16694)
The ExecutionEngine/test-interp-vec-loadstore.ll test has been failing on powerpc-darwin8 (on other platforms it passed)
the reason of fail was wrong output by printf. this output is checked by FileCheck, but on little-endian powerpc the output numeric data were printed inside out and FileCheck reported fail.
the printfs have been replaced by checking data inside test and numeric output has been replaced by the text output like : "int test passed, float test passed". The text output is checked by FileCheck.
the dependency on data layout has been removed.
done by Yuri Veselov (Intel)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187791 91177308-0d34-0410-b5e6-96231b3b80d8
This change came about primarily because of two issues in the existing code.
Niether of:
define i64 @test1(i64 %val) {
%in = trunc i64 %val to i32
tail call i32 @ret32(i32 returned %in)
ret i64 %val
}
define i64 @test2(i64 %val) {
tail call i32 @ret32(i32 returned undef)
ret i32 42
}
should be tail calls, and the function sameNoopInput is responsible. The main
problem is that it is completely symmetric in the "tail call" and "ret" value,
but in reality different things are allowed on each side.
For these cases:
1. Any truncation should lead to a larger value being generated by "tail call"
than needed by "ret".
2. Undef should only be allowed as a source for ret, not as a result of the
call.
Along the way I noticed that a mismatch between what this function treats as a
valid truncation and what the backends see can lead to invalid calls as well
(see x86-32 test case).
This patch refactors the code so that instead of being based primarily on
values which it recurses into when necessary, it starts by inspecting the type
and considers each fundamental slot that the backend will see in turn. For
example, given a pathological function that returned {{}, {{}, i32, {}}, i32}
we would consider each "real" i32 in turn, and ask if it passes through
unchanged. This is much closer to what the backend sees as a result of
ComputeValueVTs.
Aside from the bug fixes, this eliminates the recursion that's going on and, I
believe, makes the bulk of the code significantly easier to understand. The
trade-off is the nasty iterators needed to find the real types inside a
returned value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187787 91177308-0d34-0410-b5e6-96231b3b80d8
This patch just uses a peephole test for "add; compare; branch" sequences
within a single block. The IR optimizers already convert loops to
decrement-and-branch-on-nonzero form in some cases, so even this
simplistic test triggers many times during a clang bootstrap and
projects/test-suite run. It looks like there are still cases where we
need to more strongly prefer branches on nonzero though. E.g. I saw a
case where a loop that started out with a check for 0 ended up with a
check for -1. I'll try to look at that sometime.
I ended up adding the Reference class because MachineInstr::readsRegister()
doesn't check for subregisters (by design, as far as I could tell).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187723 91177308-0d34-0410-b5e6-96231b3b80d8
helper functions. This can be optimized out later when the remaining
parts of the helper function work is moved into the Mips16HardFloat pass.
For now it forces us to use the 32 bit save/restore instructions instead
of the 16 bit ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187712 91177308-0d34-0410-b5e6-96231b3b80d8
Note that this will require a recent version of the linker for Darwin
builds with LTO to pass these tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187711 91177308-0d34-0410-b5e6-96231b3b80d8
Due to the weird and wondeful usual arithmetic conversions, some
calculations involving negative values were getting performed in
uint32_t and then promoted to int64_t, which is really not a good
idea.
Patch by Katsuhiro Ueno.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187703 91177308-0d34-0410-b5e6-96231b3b80d8
Internally, the PowerPC backend names the 32-bit GPRs R[0-9]+, and names the
64-bit parent GPRs X[0-9]+. When matching inline assembly constraints with
explicit register names, on PPC64 when an i64 MVT has been requested, we need
to follow gcc's convention of using r[0-9]+ to refer to the 64-bit (parent)
registers.
At some point, we'll probably want to arrange things so that the generic code
in TargetLowering uses the AsmName fields declared in *RegisterInfo.td in order
to match these inline asm register constraints. If we do that, this change can
be reverted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187693 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes the multiple breakages on ARM test-suite after the SLP
vectorizer was introduced by default on O3. The problem was an illegal
vector type on ARMTTI::getCmpSelInstrCost() <3 x i1> which is not simple.
The guard protects this code from breaking (cause of the problems) but
doesn't fix the issue that is generating the odd vector in the first
place, which also needs to be investigated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187658 91177308-0d34-0410-b5e6-96231b3b80d8
Function attributes are the future! So just query whether we want to realign the
stack directly from the function instead of through a random target options
structure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187618 91177308-0d34-0410-b5e6-96231b3b80d8
This is actually an LLVM bug in the way it generates signatures for these
when soft float is enabled. For example, floor ends up having the signature
of int64(int64). The signature part is not the same as where the actual
parameter types are recorded, and those ARE of course int64(int64) when
soft float is enabled. (Yes, Mips16 hard float uses soft float but with
different runtime rounes but then has to interoperate with Mips32 using
normal floating point). This logic will eventually be moved to the
Mips16HardFloat pass so it's not worth sorting out these issues in LLVM
since nobody but Mips16 cares about these signatures, as far as I know,
and even I won't eventually either.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187613 91177308-0d34-0410-b5e6-96231b3b80d8
Also remove checking of llvm.dbg.sp since it is not used in generating dwarf.
Current state of Finder:
DebugInfoFinder tries to list all debug info MDNodes used in a module. To
list debug info MDNodes used by an instruction, DebugInfoFinder provides
processDeclare, processValue and processLocation to handle DbgDeclareInst,
DbgValueInst and DbgLoc attached to instructions. processModule will go
through all DICompileUnits in llvm.dbg.cu and list debug info MDNodes
used by the CUs.
TODO:
1> Finder has a list of CUs, SPs, Types, Scopes and global variables. We
need to add a list of variables that are used by DbgDeclareInst and
DbgValueInst.
2> MDString fields should be null or isa<MDString> and MDNode fields should be
null or isa<MDNode>. We currently use empty string or int 0 to represent null.
3> Go though Verify functions and make sure that they check field types.
4> Clean up existing testing cases to remove llvm.dbg.sp and make sure each
testing case has a llvm.dbg.cu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187609 91177308-0d34-0410-b5e6-96231b3b80d8
This is another case where internalize hides a symbol that is needed by
a loadable module. I am currently investigating a proper fix but this patch
will get our buildbot to pass in the meantime. <rdar://problem/14578094>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187601 91177308-0d34-0410-b5e6-96231b3b80d8
* Added R600_Reg64 class
* Added T#Index#.XY registers definition
* Added v2i32 register reads from parameter and global space
* Added f32 and i32 elements extraction from v2f32 and v2i32
* Added v2i32 -> v2f32 conversions
Tom Stellard:
- Mark vec2 operations as expand. The addition of a vec2 register
class made them all legal.
Patch by: Dmitry Cherkassov
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187582 91177308-0d34-0410-b5e6-96231b3b80d8
This also fixes a bug in the predication of LR to LOCR: I'd forgotten
that with these in-place instruction builds, the implicit operands need
to be added manually. I think this was latent until now, but is tested
by int-cmp-45.c. It also adds a CC valid mask to STOC, again tested by
int-cmp-45.c.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187573 91177308-0d34-0410-b5e6-96231b3b80d8
Convert >= 1 to > 0, etc. Using comparison with zero isn't a win on its own,
but it exposes more opportunities for CC reuse (the next patch).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187571 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by Ana Pazos.
- Completed implementation of instruction formats:
AdvSIMD three same
AdvSIMD modified immediate
AdvSIMD scalar pairwise
- Completed implementation of instruction classes
(some of the instructions in these classes
belong to yet unfinished instruction formats):
Vector Arithmetic
Vector Immediate
Vector Pairwise Arithmetic
- Initial implementation of instruction formats:
AdvSIMD scalar two-reg misc
AdvSIMD scalar three same
- Intial implementation of instruction class:
Scalar Arithmetic
- Initial clang changes to support arm v8 intrinsics.
Note: no clang changes for scalar intrinsics function name mangling yet.
- Comprehensive test cases for added instructions
To verify auto codegen, encoding, decoding, diagnosis, intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187567 91177308-0d34-0410-b5e6-96231b3b80d8
1) They should never be inlined.
2) A naming inconsistency with gcc mips16
3) Stubs should not have the global attribute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187555 91177308-0d34-0410-b5e6-96231b3b80d8
While the .td entry is nice and all, it takes a pretty gross hack in
ARMAsmParser::ParseInstruction() because of handling of other "subs"
instructions to get it to match. Ran it by Jim Grosbach and he said it was
about what he expected to make this work given the existing code.
rdar://14214063
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187530 91177308-0d34-0410-b5e6-96231b3b80d8
The loop optimizers were assuming that scales > 1 were OK. I think this
is actually a bug in TargetLoweringBase::isLegalAddressingMode(),
since it seems to be trying to reject anything that isn't r+i or r+r,
but it has no default case for scales other than 0, 1 or 2. Implementing
the hook for z means that z can no longer test any change there though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187497 91177308-0d34-0410-b5e6-96231b3b80d8
Extend r187495 to conditional loads. I split this out because the
easiest way seemed to be to force a particular operand order in
SystemZISelDAGToDAG.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187496 91177308-0d34-0410-b5e6-96231b3b80d8
System z branches have a mask to select which of the 4 CC values should
cause the branch to be taken. We can invert a branch by inverting the mask.
However, not all instructions can produce all 4 CC values, so inverting
the branch like this can lead to some oddities. For example, integer
comparisons only produce a CC of 0 (equal), 1 (less) or 2 (greater).
If an integer EQ is reversed to NE before instruction selection,
the branch will test for 1 or 2. If instead the branch is reversed
after instruction selection (by inverting the mask), it will test for
1, 2 or 3. Both are correct, but the second isn't really canonical.
This patch therefore keeps track of which CC values are possible
and uses this when inverting a mask.
Although this is mostly cosmestic, it fixes undefined behavior
for the CIJNLH in branch-08.ll. Another fix would have been
to mask out bit 0 when generating the fused compare and branch,
but the point of this patch is that we shouldn't need to do that
in the first place.
The patch also makes it easier to reuse CC results from other instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187495 91177308-0d34-0410-b5e6-96231b3b80d8
r187116 moved compare-and-branch generation from the instruction-selection
pass to the peephole optimizer (via optimizeCompare). It turns out that even
this is a bit too early. Fused compare-and-branch instructions don't
interact well with predication, where a CC result is needed. They also
make it harder to reuse the CC side-effects of earlier instructions
(not yet implemented, but the subject of a later patch).
Another problem was that the AnalyzeBranch family of routines weren't
handling compares and branches, so we weren't able to reverse the fused
form in cases where we would reverse a separate branch. This could have
been fixed by extending AnalyzeBranch, but given the other problems,
I've instead moved the fusing to the long-branch pass, which is also
responsible for the opposite transformation: splitting out-of-range
compares and branches into separate compares and long branches.
I've added a test for the AnalyzeBranch problem. A test for the
predication problem is included in the next patch, which fixes a bug
in the choice of CC mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187494 91177308-0d34-0410-b5e6-96231b3b80d8
r186399 aggressively used the RISBG instruction for immediate ANDs,
both because it can handle some values that AND IMMEDIATE can't,
and because it allows the destination register to be different from
the source. I realized later while implementing the distinct-ops
support that it would be better to leave the choice up to
convertToThreeAddress() instead. The AND IMMEDIATE form is shorter
and is less likely to be cracked.
This is a problem for 32-bit ANDs because we assume that all 32-bit
operations will leave the high word untouched, whereas RISBG used in
this way will either clear the high word or copy it from the source
register. The patch uses the z196 instruction RISBLG for this instead.
This means that z10 will be restricted to NILL, NILH and NILF for
32-bit ANDs, but I think that should be OK for now. Although we're
using z10 as the base architecture, the optimization work is going
to be focused more on z196 and zEC12.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187492 91177308-0d34-0410-b5e6-96231b3b80d8
All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187491 91177308-0d34-0410-b5e6-96231b3b80d8
Call into ComputeMaskedBits to figure out which bits are set on both add
operands and determine if the value is a power-of-two-or-zero or not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187445 91177308-0d34-0410-b5e6-96231b3b80d8
It will now only convert the arguments / return value and call
the underlying function if the types are able to be bitcasted.
This avoids using fp<->int conversions that would occur before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187444 91177308-0d34-0410-b5e6-96231b3b80d8
When simplifying a (or (and B A) (and C ~A)) to a (VBSL A B C) ensure that the
bitwidth of the second operands to both ands match before comparing the negation
of the values.
Split the check of the value of the second operands to the ands. Move the cast
and variable declaration slightly higher to make it slightly easier to follow.
Bug-Id: 16700
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187404 91177308-0d34-0410-b5e6-96231b3b80d8
build_vector is lowered to REG_SEQUENCE, which is something the register
allocator does a good job at optimizing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187397 91177308-0d34-0410-b5e6-96231b3b80d8
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN
The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187396 91177308-0d34-0410-b5e6-96231b3b80d8
The problem is due to the section name being explicitly mentioned in
the IR and differing between the two platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187394 91177308-0d34-0410-b5e6-96231b3b80d8
update testcase to make sure we generate debug info for walrus
by adding a non-trivial constructor and verify that we don't
emit an ODR signature for the type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187393 91177308-0d34-0410-b5e6-96231b3b80d8
32-bit symbols have "_" as global prefix, but when forming the name of
COMDAT sections this prefix is ignored. The current behavior assumes that
this prefix is always present which is not the case for 64-bit and names
are truncated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187356 91177308-0d34-0410-b5e6-96231b3b80d8
If no other operation is specified, 's' becomes an operation instead of an
modifier. The s operation just creates a symbol table. It is the same as
running ranlib.
We assume the archive was created by a sane ar (like llvm-ar or gnu ar) and
if the symbol table is present, then it is current. We use that to optimize
the most common case: a broken build system that thinks it has to run ranlib.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187353 91177308-0d34-0410-b5e6-96231b3b80d8
Single-slash encoded entries do not require a terminating null. This bumps
the maximum table size from ~1MB to ~9.5MB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187352 91177308-0d34-0410-b5e6-96231b3b80d8
Also always add DIType, DISubprogram and DIGlobalVariable to the list
in DebugInfoFinder without checking them, so we can verify them later
on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187285 91177308-0d34-0410-b5e6-96231b3b80d8
This change makes test with RUN lines like
RUN: opt ... | FileCheck
fail if opt fails, even if it prints what FileCheck wants. Enabling this
found some interesting cases of broken tests that were not being noticed
because opt (or some other tool) was crashing late.
Pipefail is used when the shell supports it or when using the internal
python based tester.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187261 91177308-0d34-0410-b5e6-96231b3b80d8
also worthwhile for it to look through FP extensions and truncations, whose
application commutes with fneg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187249 91177308-0d34-0410-b5e6-96231b3b80d8
We used to call Verify before adding DICompileUnit to the list, and now we
remove the check and always add DICompileUnit to the list in DebugInfoFinder,
so we can verify them later on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187237 91177308-0d34-0410-b5e6-96231b3b80d8
type units.
Initially this support is used in the computation of an ODR checker
for C++. For now we're attaching it to the DIE, but in the future
it will be attached to the type unit.
This also starts breaking out types into the separation for type
units, but without actually splitting the DIEs.
In preparation for hashing the DIEs this adds a DIEString type
that contains a StringRef with the string contained at the label.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187213 91177308-0d34-0410-b5e6-96231b3b80d8
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.
This also adds a test case for NVPTX that depends on this custom
legalization.
Differential Revision: http://llvm-reviews.chandlerc.com/D1195
Attempt to fix the buildbots by making the X86 test I just added platform independent
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187202 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 187198. It broke the bots.
The soft float test probably needs a -triple because of name differences.
On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of
"vroundss $1, %xmm0, %xmm0, %xmm0".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187201 91177308-0d34-0410-b5e6-96231b3b80d8
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.
This also adds a test case for NVPTX that depends on this custom
legalization.
Differential Revision: http://llvm-reviews.chandlerc.com/D1195
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187198 91177308-0d34-0410-b5e6-96231b3b80d8
robust. It now uses an InstVisitor and worklist to actually walk the
uses of the Alloca transitively and detect the pattern which we can
directly promote: loads & stores of the whole alloca and instructions we
can completely ignore.
Also, with this new implementation teach both the predicate for testing
whether we can promote and the promotion engine itself to use the same
code so we no longer have strange divergence between the two code paths.
I've added some silly test cases to demonstrate that we can handle
slightly more degenerate code patterns now. See the below for why this
is even interesting.
Performance impact: roughly 1% regression in the performance of SROA or
ScalarRepl on a large C++-ish test case where most of the allocas are
basically ready for promotion. The reason is because of silly redundant
work that I've left FIXMEs for and which I'll address in the next
commit. I wanted to separate this commit as it changes the behavior.
Once the redundant work in removing the dead uses of the alloca is
fixed, this code appears to be faster than the old version. =]
So why is this useful? Because the previous requirement for promotion
required a *specific* visit pattern of the uses of the alloca to verify:
we *had* to look for no more than 1 intervening use. The end goal is to
have SROA automatically detect when an alloca is already promotable and
directly hand it to the mem2reg machinery rather than trying to
partition and rewrite it. This is a 25% or more performance improvement
for SROA, and a significant chunk of the delta between it and
ScalarRepl. To get there, we need to make mem2reg actually capable of
promoting allocas which *look* promotable to SROA without have SROA do
tons of work to massage the code into just the right form.
This is actually the tip of the iceberg. There are tremendous potential
savings we can realize here by de-duplicating work between mem2reg and
SROA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187191 91177308-0d34-0410-b5e6-96231b3b80d8
The bitcode representation attribute kinds are encoded into / decoded from
should be independent of the current set of LLVM attributes and their position
in the AttrKind enum. This patch explicitly encodes attributes to fixed bitcode
values.
With this patch applied, LLVM does not silently misread attributes written by
LLVM 3.3. We also enhance the decoding slightly such that an error message is
printed if an unknown AttrKind encoding was dected.
Bonus: Dropping bitcode attributes from AttrKind is now easy, as old AttrKinds
do not need to be kept to support the Bitcode reader.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187186 91177308-0d34-0410-b5e6-96231b3b80d8
structure not just a pointer. This implements that and thus fixes va_copy
on PPC32. Fixes#15286. Both bug and patch by Florian Zeitz!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187158 91177308-0d34-0410-b5e6-96231b3b80d8
Back in r140220 we removed the autoconf code that would set LLVMCC_OPTION
since it was only used by the test-suite. This patch now removes code
that would only be used if LLVMCC_OPTION was set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187154 91177308-0d34-0410-b5e6-96231b3b80d8
The previous change to local live range allocation also suppressed
eviction of local ranges. In rare cases, this could result in more
expensive register choices. This commit actually revives a feature
that I added long ago: check if live ranges can be reassigned before
eviction. But now it only happens in rare cases of evicting a local
live range because another local live range wants a cheaper register.
The benefit is improved code size for some benchmarks on x86 and armv7.
I measured no significant compile time increase and performance
changes are noise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187140 91177308-0d34-0410-b5e6-96231b3b80d8
Also avoid locals evicting locals just because they want a cheaper register.
Problem: MI Sched knows exactly how many registers we have and assumes
they can be colored. In cases where we have large blocks, usually from
unrolled loops, greedy coloring fails. This is a source of
"regressions" from the MI Scheduler on x86. I noticed this issue on
x86 where we have long chains of two-address defs in the same live
range. It's easy to see this in matrix multiplication benchmarks like
IRSmk and even the unit test misched-matmul.ll.
A fundamental difference between the LLVM register allocator and
conventional graph coloring is that in our model a live range can't
discover its neighbors, it can only verify its neighbors. That's why
we initially went for greedy coloring and added eviction to deal with
the hard cases. However, for singly defined and two-address live
ranges, we can optimally color without visiting neighbors simply by
processing the live ranges in instruction order.
Other beneficial side effects:
It is much easier to understand and debug regalloc for large blocks
when the live ranges are allocated in order. Yes, global allocation is
still very confusing, but it's nice to be able to comprehend what
happened locally.
Heuristics could be added to bias register assignment based on
instruction locality (think late register pairing, banks...).
Intuituvely this will make some test cases that are on the threshold
of register pressure more stable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187139 91177308-0d34-0410-b5e6-96231b3b80d8
For two intrinsics 'llvm.nvvm.texsurf.handle' and 'llvm.nvvm.texsurf.handle.internal',
TableGen was emitting matching code like:
if (Name.startswith("llvm.nvvm.texsurf.handle")) ...
if (Name.startswith("llvm.nvvm.texsurf.handle.internal")) ...
We can never match "llvm.nvvm.texsurf.handle.internal" here because it will
always be erroneously matched by the first condition.
The fix is to sort the intrinsic names and emit them in reverse order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187119 91177308-0d34-0410-b5e6-96231b3b80d8
Before the patch we took advantage of the fact that the compare and
branch are glued together in the selection DAG and fused them together
(where possible) while emitting them. This seemed to work well in practice.
However, fusing the compare so early makes it harder to remove redundant
compares in cases where CC already has a suitable value. This patch
therefore uses the peephole analyzeCompare/optimizeCompareInstr pair of
functions instead.
No behavioral change intended, but it paves the way for a later patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187116 91177308-0d34-0410-b5e6-96231b3b80d8
As with the stores, these instructions can trap when the condition is false,
so they are only used for things like (cond ? x : *ptr).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187112 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions are allowed to trap even if the condition is false,
so for now they are only used for "*ptr = (cond ? x : *ptr)"-style
constructs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187111 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure the context and type fields are MDNodes. We will generate
verification errors if those fields are non-empty strings.
Fix testing cases to make them pass the verifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187106 91177308-0d34-0410-b5e6-96231b3b80d8
The language reference says that:
"If a symbol appears in the @llvm.used list, then the compiler,
assembler, and linker are required to treat the symbol as if there is
a reference to the symbol that it cannot see"
Since even the linker cannot see the reference, we must assume that
the reference can be using the symbol table. For example, a user can add
__attribute__((used)) to a debug helper function like dump and use it from
a debugger.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187103 91177308-0d34-0410-b5e6-96231b3b80d8
There's no need to specify a flag to omit frame pointer elimination on non-leaf
nodes...(Honestly, I can't parse that option out.) Use the function attribute
stuff instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187093 91177308-0d34-0410-b5e6-96231b3b80d8
Prior to this patch, IfConverter may widen the cases where a sequence of
instructions were executed because of the way it uses nested predicates. This
result in incorrect execution.
For instance, Let A be a basic block that flows conditionally into B and B be a
predicated block.
B can be predicated with A.BrToBPredicate into A iff B.Predicate is less
"permissive" than A.BrToBPredicate, i.e., iff A.BrToBPredicate subsumes
B.Predicate.
The IfConverter was checking the opposite: B.Predicate subsumes
A.BrToBPredicate.
<rdar://problem/14379453>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187071 91177308-0d34-0410-b5e6-96231b3b80d8
The change r187019 has fixed multiple relocations in dynamic linker for
MIPS, so now this test passes for MIPS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187053 91177308-0d34-0410-b5e6-96231b3b80d8
Improve the Finder to handle context of a DIVariable used by DbgValueInst.
Fix testing cases to make them pass the verifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187052 91177308-0d34-0410-b5e6-96231b3b80d8
schedule an alloca for another iteration in SROA. This only showed up
with a mixture of promotable and unpromotable selects and phis. Added
a test case for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187031 91177308-0d34-0410-b5e6-96231b3b80d8
pending speculation for a phi node. The problem here is that we were
using growth of the specluation set as an indicator of whether
speculation would occur, and if the phi node is already in the set we
don't see it grow. This is a symptom of the fact that this signal is
a total hack.
Unfortunately, I couldn't really come up with a non-hacky way of
signaling that promotion remains valid *after* speculation occurs, such
that we only speculate when all else looks good for promotion. In the
end, I went with at least a much more explicit approach of doing the
work of queuing inside the phi and select processing and setting
a preposterously named flag to convey that we're in the special state of
requiring speculating before promotion.
Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing
a testcase for this from a pretty giant, nasty assert in a big
application. =] The testcase was excellent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187029 91177308-0d34-0410-b5e6-96231b3b80d8
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187007 91177308-0d34-0410-b5e6-96231b3b80d8
These are really the same address space in hardware. The only
difference is that CONSTANT_ADDRESS uses a special cache for faster
access. When we are unable to use the constant kcache for some reason
(e.g. smaller types or lack of indirect addressing) then the instruction
selector must use GLOBAL_ADDRESS loads instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187006 91177308-0d34-0410-b5e6-96231b3b80d8
Improve the Finder to handle context of a DIVariable.
If Scope is a DICompileUnit, add it to the list of CUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187003 91177308-0d34-0410-b5e6-96231b3b80d8
When vectors are built from a single value, the ARM lowering issues a
scalar_to_vector node.
This node is then always morphed into a move from the general purpose unit to
the vector unit.
When the value comes from a load, this can be simplified into a vector load to
the right lane.
This patch changes the lowering of insert_vector_elt to expose a vector
friendly pattern in this situation.
This is a step toward fixing <rdar://problem/14170854>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186999 91177308-0d34-0410-b5e6-96231b3b80d8
The symbol table has forward references in the file. Instead of allocating
a temporary buffer or counting the size and then writing, this implementation
writes a dummy value first and patches it once the final value is known.
There is room for performance improvement. I will implement them as soon as I
get some other features (like a ranlib mode) in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186934 91177308-0d34-0410-b5e6-96231b3b80d8
This increases the number of opportunites we have for folding. With the
previous implementation we were unable to fold into any instructions
other than the first when multiple instructions were selected from a
single SDNode.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186919 91177308-0d34-0410-b5e6-96231b3b80d8
A side-effect of this is that now the compiler expects kernel arguments
to be 4-byte aligned.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186916 91177308-0d34-0410-b5e6-96231b3b80d8
MDNodes used by DbgDeclareInst and DbgValueInst.
Another 16 testing cases failed and they are disabled with
-disable-debug-info-verifier.
A total of 34 cases are disabled with -disable-debug-info-verifier and will be
corrected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186902 91177308-0d34-0410-b5e6-96231b3b80d8
Use the function attributes to pass along the stack protector buffer size.
Now that we have robust function attributes, don't use a command line option to
specify the stack protecto buffer size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186863 91177308-0d34-0410-b5e6-96231b3b80d8
Enable parsing all 32 floating point control registers $0-31 and stop trying to
parse floating point condition code register $fcc0. Also, return ParseFail if
the operand being parsed is not in the expected format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186861 91177308-0d34-0410-b5e6-96231b3b80d8
instructions. With this patch:
1. ldr.n is recognized as mnemonic for the short encoding
2. ldr.w is recognized as menmonic for the long encoding
3. ldr will map to either short or long encodings depending on the size of the offset
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186831 91177308-0d34-0410-b5e6-96231b3b80d8
This matches gnu archive behavior and since archive member order can change
which member is used, not changing the order on replacement looks like the
right thing to do.
This patch also refactors the logic for which archive member to keep and
whether to move it to a helper function (computeInsertAction). The
nesting in computeNewArchiveMembers was getting a bit confusing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186829 91177308-0d34-0410-b5e6-96231b3b80d8
GNU ar when not given the a or b modifiers replaces archive members in the
same location of the old ones. I am about to implement that in llvm-ar. For
now, just don't depend on the current llvm-ar behavior on this test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186823 91177308-0d34-0410-b5e6-96231b3b80d8
We were incorrectly computing where to insert a member if it was replacing
a previous member that was before the insert point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186792 91177308-0d34-0410-b5e6-96231b3b80d8
GlobalOpt simplifies llvm.compiler.used by removing any members that are also
in the more strict llvm.used. Handle the special case where llvm.compiler.used
becomes empty.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186778 91177308-0d34-0410-b5e6-96231b3b80d8
indirect branches correctly. Under some circumstances, this led to the deletion
of basic blocks that were the destination of indirect branches. In that case it
left indirect branches to nowhere in the code.
This patch replaces, and is more general than either of the previous fixes for
indirect-branch-analysis issues, r181161 and r186461.
For other branches (not indirect) this refactor should have *almost* identical
behavior to the previous version. There are some corner cases where this
refactor is able to analyze blocks that the previous version could not (e.g.
this necessitated the update to thumb2-ifcvt2.ll).
<rdar://problem/14464830>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186735 91177308-0d34-0410-b5e6-96231b3b80d8
The original change was rolled back in r186627 because of test
failures on the big endian machine. I believe I fixed the issue
so re-submitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186734 91177308-0d34-0410-b5e6-96231b3b80d8
We were incorrectly using compiler_used instead of compiler.used. Unfortunately
the passes using the broken name had tests also using the broken name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186705 91177308-0d34-0410-b5e6-96231b3b80d8
The insn definitions themselves crept into r186689, sorry.
This should be the last of the distinct-ops instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186690 91177308-0d34-0410-b5e6-96231b3b80d8
Follows the same lines as r186686, but much more limited, since we only
use ADD LOGICAL for multi-i64 additions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186689 91177308-0d34-0410-b5e6-96231b3b80d8
The atomic tests assume the two-operand forms, so I've restricted them to z10.
Running and-01.ll, or-01.ll and xor-01.ll for z196 as well as z10 shows why
using convertToThreeAddress() is better than exposing the three-operand forms
first and then converting back to two operands where possible (which is what
I'd originally tried). Using the three-operand form first stops us from
taking advantage of NG, OG and XG for spills.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186683 91177308-0d34-0410-b5e6-96231b3b80d8
This first step just adds definitions for SLLK, SRLK and SRAK.
The next patch will actually make use of them during codegen.
insn-bad.s tests that some form of error is reported when using these
instructions on z10. More work is needed to get the "instruction requires:
distinct-ops" that we'd ideally like, so I've stubbed that part out for now.
I'll come back and make it mandatory once the necessary changes are in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186680 91177308-0d34-0410-b5e6-96231b3b80d8
Somehow forgot to git rm these two files. I believe I left the remaining
invalid* tests intentionally, though whether my reasons were sound is a
different matter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186663 91177308-0d34-0410-b5e6-96231b3b80d8
The tests were checking for barriers which the ARM ARM says they must execute
as a full system DMB/DSB, rather than that they're UNDEFINED and LLVM does in
fact represent them.
The tests happened to be passing because they were using a non-versioned ARM
triple which didn't have *any* DMB/DSB instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186662 91177308-0d34-0410-b5e6-96231b3b80d8
This allows "llvm-mc -disassemble" to accept two new features:
+ Using comma as a byte separator
+ Grouping bytes with '[' and ']' pairs.
The behaviour outside a [...] group is unchanged. But within the group once
llvm-mc encounters a true error, it stops rather than trying to resynchronise
the stream at the next byte. This is more useful for disassembly tests, where
we have an almost-instruction in mind and don't care what the misaligned
interpretation would be. Particularly if it means llvm-mc won't actually see
the next intended almost-instruction.
As a side effect, this means llvm-mc can disassemble its own -show-encoding
output if copy-pasted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186661 91177308-0d34-0410-b5e6-96231b3b80d8
SROA.
The crux of the issue is that now we track uses of a partition of the
alloca in two places: the iterators over the partitioning uses and the
previously collected split uses vector. We weren't accounting for the
fact that the split uses might invalidate integer widening in ways other
than due to their width (in this case due to being volatile).
Further reduced testcase added to the tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186655 91177308-0d34-0410-b5e6-96231b3b80d8