On MSVCRT and compatible, output of %e is incompatible to Posix by default. Number of exponent digits should be at least 2. "%+03d"
FIXME: Implement our formatter in future!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127872 91177308-0d34-0410-b5e6-96231b3b80d8
makes valgrind stop complaining about uninitialized variables being read when it
accesses a bitfield (category) that shares its bits with these variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127871 91177308-0d34-0410-b5e6-96231b3b80d8
Stack slot real estate is virtually free compared to registers, so it is
advantageous to spill earlier even though the same value is now kept in both a
register and a stack slot.
Also eliminate redundant spills by extending the stack slot live range
underneath reloaded registers.
This can trigger a dead code elimination, removing copies and even reloads that
were only feeding spills.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127868 91177308-0d34-0410-b5e6-96231b3b80d8
This is not supposed to happen, but I have seen the x86 rematter getting
confused when rematerializing partial redefs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127857 91177308-0d34-0410-b5e6-96231b3b80d8
comparisons on x86. Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.
This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127852 91177308-0d34-0410-b5e6-96231b3b80d8
SCEV may generate expressions composed of multiple pointers, which can
lead to invalid GEP expansion. Until we can teach SCEV to follow strict
pointer rules, make sure no bad GEPs creep into IR.
Fixes rdar://problem/9038671.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127839 91177308-0d34-0410-b5e6-96231b3b80d8
o A8.6.195 STR (register) -- Encoding T1
o A8.6.193 STR (immediate, Thumb) -- Encoding T1
It has been changed so that now they use different addressing modes
and thus different MC representation (Operand Infos). Modify the
disassembler to reflect the change, and add relevant tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127833 91177308-0d34-0410-b5e6-96231b3b80d8
I have convinced myself that it can only happen when a phi value dies. When it
happens, allocate new virtual registers for the components.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127827 91177308-0d34-0410-b5e6-96231b3b80d8
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.
This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
plus the test where it used to break.", which broke Clang self-host of a
Debug+Asserts compiler, on OS X.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127763 91177308-0d34-0410-b5e6-96231b3b80d8
report_fatal_error() invokes exit(). We know report_fatal_error() might not write messages to stderr when any errors were detected on FD == 2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127726 91177308-0d34-0410-b5e6-96231b3b80d8
chose is having a non-memcpy/memset use and being larger than any native integer
type. Originally I chose having an access of a size smaller than the total size
of the alloca, but this caused some minor issues on the spirit benchmark where
SRoA runs again after some inlining.
This fixes <rdar://problem/8613163>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127718 91177308-0d34-0410-b5e6-96231b3b80d8
1. The ARM Darwin *r9 call instructions were pseudo-ized recently.
Modify the ARMDisassemblerCore.cpp file to accomodate the change.
2. The disassembler was unnecessarily adding 8 to the sign-extended imm24:
imm32 = SignExtend(imm24:'00', 32); // A8.6.23 BL, BLX (immediate)
// Encoding A1
It has no business doing such. Removed the offending logic.
Add test cases to arm-tests.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127707 91177308-0d34-0410-b5e6-96231b3b80d8
accept. If a value in the mask is out of range, it uses the value 0, for VTBL,
or leaves the value unchanged, for VTBX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127700 91177308-0d34-0410-b5e6-96231b3b80d8
After live range splitting, an original value may be available in multiple
registers. Tracing back through the registers containing the same value, find
the best place to insert a spill, determine if the value has already been
spilled, or discover a reaching def that may be rematerialized.
This is only the analysis part. The information is not used for anything yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127698 91177308-0d34-0410-b5e6-96231b3b80d8
memory builtins as equivalent to malloc/free.
This is different from any attribute we have. For example, you can delete the
allocators when their result is unused, but you can't collapse two calls to the
same function, even if no global/memory state has changed in between. The
noalias return states that the result does not alias any other pointer, but
instcombine optimizes malloc() as though the result is non-null for the purpose
of eliminating unused pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127673 91177308-0d34-0410-b5e6-96231b3b80d8
v2 = bitcast v1
...
v3 = bitcast v2
...
= v3
=>
v2 = bitcast v1
...
= v1
if v1 and v3 are of in the same register class.
bitcast between i32 and fp (and others) are often not nops since they
are in different register classes. These bitcast instructions are often
left because they are in different basic blocks and cannot be
eliminated by dag combine.
rdar://9104514
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127668 91177308-0d34-0410-b5e6-96231b3b80d8
in the instruction tables and fixed a few bugs that
were causing decode conflicts. Rudimentary tests
are coming up in the next patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127646 91177308-0d34-0410-b5e6-96231b3b80d8
instruction set. This code adds support for the VEX prefix
and for the YMM registers accessible on AVX-enabled
architectures. Instruction table support that enables AVX
instructions for the disassembler is in an upcoming patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127644 91177308-0d34-0410-b5e6-96231b3b80d8
register operand was erroneously added. Remove an incorrect assert which triggers the bug.
rdar://problem/9131529
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127642 91177308-0d34-0410-b5e6-96231b3b80d8
and then go kablooie. The problem was that it was tracking the PHI nodes anew
each time into this function. But it didn't need to. And because the recursion
didn't know that a PHINode was visited before, it would go ahead and call
itself.
There is a testcase, but unfortunately it's too big to add. This problem will go
away with the EH rewrite.
<rdar://problem/8856298>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127640 91177308-0d34-0410-b5e6-96231b3b80d8
Also more cleanly separate the ARM vs. Thumb functionality. Previously, the
encoding would be incorrect for some Thumb instructions (the indirect calls).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127637 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the unused reserved_ bit vector, no functional change intended.
This doesn't break 'svn blame', this file really is all my fault.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127607 91177308-0d34-0410-b5e6-96231b3b80d8
properties.
Added the self-wrap flag for SCEV::AddRecExpr.
A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag
without changing behavior in this revision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127590 91177308-0d34-0410-b5e6-96231b3b80d8
load and store reference same memory location, the memory location
is represented by getelementptr with two uses (load and store) and
the getelementptr's base is alloca with single use. At this point,
instructions from alloca to store can be removed.
(this pattern is generated when bitfield is accessed.)
For example,
%u = alloca %struct.test, align 4 ; [#uses=1]
%0 = getelementptr inbounds %struct.test* %u, i32 0, i32 0;[#uses=2]
%1 = load i8* %0, align 4 ; [#uses=1]
%2 = and i8 %1, -16 ; [#uses=1]
%3 = or i8 %2, 5 ; [#uses=1]
store i8 %3, i8* %0, align 4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127565 91177308-0d34-0410-b5e6-96231b3b80d8
This allows the allocator to free any resources used by the virtual register,
including physical register assignments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127560 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-gcc-i386-linux-selfhost and llvm-x86_64-linux-checks buildbots.
The original log entry:
Remove optimization emitting a reference insted of label difference, since
it can create more relocations. Removed isBaseAddressKnownZero method,
because it is no longer used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127540 91177308-0d34-0410-b5e6-96231b3b80d8
Live range splitting can create a number of small live ranges containing only a
single real use. Spill these small live ranges along with the large range they
are connected to with copies. This enables memory operand folding and maximizes
the spill to fill distance.
Work in progress with known bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127529 91177308-0d34-0410-b5e6-96231b3b80d8
There are too many compatibility problems with using mixed types in
std::upper_bound, and I don't want to spend 110 lines of boilerplate setting up
a call to a 10-line function. Binary search is not /that/ hard to implement
correctly.
I tried terminating the binary search with a linear search, but that actually
made the algorithm slower against my expectation. Most live intervals have less
than 4 segments. The early test against endIndex() does pay, and this version is
25% faster than plain std::upper_bound().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127522 91177308-0d34-0410-b5e6-96231b3b80d8
Go ahead and add them on when we might want to use them and let
later passes remove them.
Fixes rdar://9118569
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127518 91177308-0d34-0410-b5e6-96231b3b80d8
actual instruction as the non-Darwin defs, but have different call-clobber
semantics and so need separate patterns. They don't need to duplicate the
encoding information, however.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127515 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.
This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127498 91177308-0d34-0410-b5e6-96231b3b80d8
protector insertion not working correctly with unreachable code. Since that
revision was rolled out, this test doesn't actual fail before this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127497 91177308-0d34-0410-b5e6-96231b3b80d8
The existing CompEnd predicate does not define a strict weak order as required
by the C++03 standard; therefore, its use as a predicate to std::upper_bound
is invalid. For a discussion of this issue, see
http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#270
This patch replaces the asymmetrical comparison with an iterator adaptor that
achieves the same effect while being strictly standard-conforming by ensuring
an apples-to-apples comparison.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127462 91177308-0d34-0410-b5e6-96231b3b80d8
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.
This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127459 91177308-0d34-0410-b5e6-96231b3b80d8
corresponding testcases back to the previous versions.
Fixes some performance regressions only seen on 32-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127441 91177308-0d34-0410-b5e6-96231b3b80d8
Value, not an Instruction, so casting is not necessary. Also,
it's theoretically possible that the Value is not an
Instruction, since WeakVH follows RAUWs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127427 91177308-0d34-0410-b5e6-96231b3b80d8
after it has finished all of its reassociations, because its
habit of unlinking operands and holding them in a datastructure
while working means that it's not easy to determine when an
instruction is really dead until after all its regular work is
done. rdar://9096268.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127424 91177308-0d34-0410-b5e6-96231b3b80d8
This happens a lot in clang-compiled C++ code because it adds overflow checks to operator new[]:
unsigned *foo(unsigned n) { return new unsigned[n]; }
We can optimize away the overflow check on 64 bit targets because (uint64_t)n*4 cannot overflow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127418 91177308-0d34-0410-b5e6-96231b3b80d8
It generates output that lools like
8 times line number info lost by Scalar Replacement of Aggregates (SSAUp)
1 times line number info lost by Simplify well-known library calls
12 times variable info lost by Jump Threading
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127381 91177308-0d34-0410-b5e6-96231b3b80d8
flexible.
If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127368 91177308-0d34-0410-b5e6-96231b3b80d8
The insufficient encoding information of the combined instruction confuses the decoder wrt
UQADD16. Add extra logic to recover from that.
Fixed an assert reported by Sean Callanan
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127354 91177308-0d34-0410-b5e6-96231b3b80d8
The damage done by physreg coalescing only depends on the number of instructions
the extended physreg live range covers. This fixes PR9438.
The heuristic is still luck-based, and physreg coalescing really should be
disabled completely. We need a register allocator with better hinting support
before that is possible.
Convert a test to FileCheck and force spilling by inserting an extra call. The
previous spilling behavior was dependent on misguided physreg coalescing
decisions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127351 91177308-0d34-0410-b5e6-96231b3b80d8
When ExactBECount is a constant, use it for MaxBECount.
When MaxBECount cannot be computed, replace it with ExactBECount.
Fixes PR9424.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127342 91177308-0d34-0410-b5e6-96231b3b80d8
alloca as both integer and floating-point vectors of the same size. Bugpoint is
not cooperating with me, but I'll try to find a manual testcase tomorrow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127320 91177308-0d34-0410-b5e6-96231b3b80d8
gave up when I realized I couldn't come up with a good name for what the
refactored function would be, to describe what it does.
This is PR9343 test12, which is test3 with arguments reordered. Whoops!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127318 91177308-0d34-0410-b5e6-96231b3b80d8
a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the
use of vector intrinsics, especially in NEON when programmers know the layout of
the register file. This enables codegen to eliminate a lot of the subregister
traffic it would otherwise generate.
This commit only enables this for a small number of floating-point cases, but a
lot more integer cases. I assume this is okay for all ports, but I did not do
extensive testing of the quality of code involving i512 vectors and the like. If
there is a use case where this generates worse code than before, let me know and
we can scale it back.
This fixes <rdar://problem/9036264>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127317 91177308-0d34-0410-b5e6-96231b3b80d8
This will we used for keeping register allocator data structures up to date
while LiveRangeEdit is trimming live intervals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127300 91177308-0d34-0410-b5e6-96231b3b80d8
reachable uses, but there still might be uses in dead blocks. Use the
standard solution of replacing all the uses with undef. This is
a rare case because it's very sensitive to phase ordering in SimplifyCFG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127299 91177308-0d34-0410-b5e6-96231b3b80d8
LiveRangeEdit::eliminateDeadDefs() will eventually be used by coalescing,
splitting, and spilling for dead code elimination. It can delete chains of dead
instructions as long as there are no dependency loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127287 91177308-0d34-0410-b5e6-96231b3b80d8
with this before since none of the register tracking or nightly tests
had unschedulable nodes.
This should probably be refixed with a special default Node that just
returns some "don't touch me" values.
Fixes PR9427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127263 91177308-0d34-0410-b5e6-96231b3b80d8
testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.
Performance results:
roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated
john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES
Small compile time impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127208 91177308-0d34-0410-b5e6-96231b3b80d8
This change uses the MaxReorderWindow for both height and depth, which
tends to limit the negative effects of high register pressure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127203 91177308-0d34-0410-b5e6-96231b3b80d8
This allows LLVM IR using ptx_kernel or ptx_device calling
conventions to be properly printed when emitted in text form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127157 91177308-0d34-0410-b5e6-96231b3b80d8
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127116 91177308-0d34-0410-b5e6-96231b3b80d8
This is just very first approximation how the stuff should be done
(e.g. ARM-only for now). More to follow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127101 91177308-0d34-0410-b5e6-96231b3b80d8
The coalescer can in very rare cases leave too large live intervals around after
rematerializing cheap-as-a-move instructions.
Linear scan doesn't really care, but live range splitting gets very confused
when a live range is killed by a ghost instruction.
I will fix this properly in the coalescer after 2.9 branches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127096 91177308-0d34-0410-b5e6-96231b3b80d8
the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to
1.8% of total llc time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127069 91177308-0d34-0410-b5e6-96231b3b80d8
regs. This is the only change in this checkin that may affects the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.
Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It is currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.
Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127067 91177308-0d34-0410-b5e6-96231b3b80d8
possible. This goes into instcombine and instsimplify because instsimplify
doesn't need to check hasOneUse since it returns (almost exclusively) constants.
This fixes PR9343 #4#5 and #8!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127064 91177308-0d34-0410-b5e6-96231b3b80d8
The global cost is the sum of block frequencies for spill code that must be
inserted because preferences weren't met.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127062 91177308-0d34-0410-b5e6-96231b3b80d8
This simplifies the code and makes it faster too.
The interference patterns are saved for each candidate register. It will be
reused for actually executing the split. Work in progress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127054 91177308-0d34-0410-b5e6-96231b3b80d8
It gives better results. Sometimes, a live range can be large and still have
high spill weight. Such a range should not be spilled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127036 91177308-0d34-0410-b5e6-96231b3b80d8
bitcasts, which are really no-ops here. This fixes slowdowns on
MultiSource/Applications/aha and others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127031 91177308-0d34-0410-b5e6-96231b3b80d8