StringMap was not properly updating NumTombstones after a clear or rehash.
This was not fatal until now because the table was growing faster than
NumTombstones could, but with the previous change of preventing infinite
growth of the table the invariant (NumItems + NumTombstones <= NumBuckets)
stopped being observed, causing infinite loops in certain situations.
Patch by José Fonseca!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128567 91177308-0d34-0410-b5e6-96231b3b80d8
When the hash function uses object pointers all free entries eventually
become tombstones as they are used at least once, regardless of the size.
DenseMap cannot function with zero empty keys, so it double size to get
get ridof the tombstones.
However DenseMap never shrinks automatically unless it is cleared, so
the net result is that certain tables grow infinitely.
The solution is to make a fresh copy of the table without tombstones
instead of doubling size, by simply calling grow with the current size.
Patch by José Fonseca!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128564 91177308-0d34-0410-b5e6-96231b3b80d8
The idea is, that if an ieee 754 float is divided by a power of two, we can
turn the division into a cheaper multiplication. This function sees if we can
get an exact multiplicative inverse for a divisor and returns it if possible.
This is the hard part of PR9587.
I tested many inputs against llvm-gcc's frotend implementation of this
optimization and didn't find any difference. However, floating point is the
land of weird edge cases, so any review would be appreciated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128545 91177308-0d34-0410-b5e6-96231b3b80d8
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.
Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.
First part of rdar://8832507, rdar://9203134
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128502 91177308-0d34-0410-b5e6-96231b3b80d8
otool(1), this time with the needed fix for case sensitive file systems :) .
This is a work in progress as the interface for producing symbolic operands is
not done. But a hacked prototype using information from the object file's
relocation entiries and replacing immediate operands with MCExpr's has been
shown to work with no changes to the instrucion printer. These APIs will be
moved into a dynamic library at some point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128415 91177308-0d34-0410-b5e6-96231b3b80d8
Correctly terminate the range of register DBG_VALUEs when the register is
clobbered or when the basic block ends.
The code is now ready to deal with variables that are sometimes in a register
and sometimes on the stack. We just need to teach emitDebugLoc to say 'stack
slot'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128327 91177308-0d34-0410-b5e6-96231b3b80d8
This is a work in progress as the interface for producing symbolic operands is
not done. But a hacked prototype using information from the object file's
relocation entiries and replacing immediate operands with MCExpr's has been
shown to work with no changes to the instrucion printer. These APIs will be
moved into a dynamic library at some point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128308 91177308-0d34-0410-b5e6-96231b3b80d8
The MC asm lexer wasn't honoring a non-default (anything but ';') statement
separator. Fix that, and generalize a bit to support multi-character
statement separators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128227 91177308-0d34-0410-b5e6-96231b3b80d8
Move the dynamic linking functionality of the llvm-rtdyld program into an
ExecutionEngine support library. Update llvm-rtdyld to just load an object
file into memory, use the library to process it, then run the _main()
function, if one is found.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128031 91177308-0d34-0410-b5e6-96231b3b80d8
the alias of an InstAlias instead of the thing being aliased. Because we need to
know the features that are valid for an InstAlias.
This is part of a work-in-progress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127986 91177308-0d34-0410-b5e6-96231b3b80d8
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
switch(x) {
case 1: return f1();
case 2: return f2();
case 3: return f3();
case 4: return f4();
case 5: return f5();
case 6: return f6();
}
}
=>
LBB0_2: ## %sw.bb
callq _f1
popq %rbp
ret
LBB0_3: ## %sw.bb1
callq _f2
popq %rbp
ret
LBB0_4: ## %sw.bb3
callq _f3
popq %rbp
ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
br label %return
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
br label %return
return:
%retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
ret i32 %retval.0
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
jmp _f3 ## TAILCALL
rdar://9147433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
Proof-of-concept code that code-gens a module to an in-memory MachO object.
This will be hooked up to a run-time dynamic linker library (see: llvm-rtdyld
for similarly conceptual work for that part) which will take the compiled
object and link it together with the rest of the system, providing back to the
JIT a table of available symbols which will be used to respond to the
getPointerTo*() queries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127916 91177308-0d34-0410-b5e6-96231b3b80d8
For example, on 32-bit architecture, don't promote all uses of the IV
to 64-bits just because one use is a 64-bit cast.
Alternate implementation of the patch by Arnaud de Grandmaison.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127884 91177308-0d34-0410-b5e6-96231b3b80d8
SCEV may generate expressions composed of multiple pointers, which can
lead to invalid GEP expansion. Until we can teach SCEV to follow strict
pointer rules, make sure no bad GEPs creep into IR.
Fixes rdar://problem/9038671.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127839 91177308-0d34-0410-b5e6-96231b3b80d8
I have convinced myself that it can only happen when a phi value dies. When it
happens, allocate new virtual registers for the components.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127827 91177308-0d34-0410-b5e6-96231b3b80d8
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.
This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't return, so just go back to using the old runtime function instead
of trying to use abort() when __builtin_unreachable (or an equivalent) isn't
supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127629 91177308-0d34-0410-b5e6-96231b3b80d8
where none was before. Just don't declare it and hope it's declared
in every translation unit that needs it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127612 91177308-0d34-0410-b5e6-96231b3b80d8
properties.
Added the self-wrap flag for SCEV::AddRecExpr.
A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag
without changing behavior in this revision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127590 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-gcc-i386-linux-selfhost and llvm-x86_64-linux-checks buildbots.
The original log entry:
Remove optimization emitting a reference insted of label difference, since
it can create more relocations. Removed isBaseAddressKnownZero method,
because it is no longer used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127540 91177308-0d34-0410-b5e6-96231b3b80d8
It generates output that lools like
8 times line number info lost by Scalar Replacement of Aggregates (SSAUp)
1 times line number info lost by Simplify well-known library calls
12 times variable info lost by Jump Threading
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127381 91177308-0d34-0410-b5e6-96231b3b80d8
flexible.
If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127368 91177308-0d34-0410-b5e6-96231b3b80d8
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127116 91177308-0d34-0410-b5e6-96231b3b80d8
This is just very first approximation how the stuff should be done
(e.g. ARM-only for now). More to follow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127101 91177308-0d34-0410-b5e6-96231b3b80d8
This makes lookup slightly more expensive but it's worth it, unused
DenseMaps are common in LLVM code apparently.
1% speedup on clang -O3 bzip2.c
4% speedup on clang -O3 oggenc.c (Release build of clang on i386/linux)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127088 91177308-0d34-0410-b5e6-96231b3b80d8
regs. This is the only change in this checkin that may affects the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.
Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It is currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.
Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127067 91177308-0d34-0410-b5e6-96231b3b80d8
Initially, slot indexes are quad-spaced. There is room for inserting up to 3
new instructions between the original instructions.
When we run out of indexes between two instructions, renumber locally using
double-spaced indexes. The original quad-spacing means that we catch up quickly,
and we only have to renumber a handful of instructions to get a monotonic
sequence. This is much faster than renumbering the whole function as we did
before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127023 91177308-0d34-0410-b5e6-96231b3b80d8
it. It's been assumed up til now that it would be in its immediate
successor. However, this isn't necessarily the case. It could be in one of its
successor's successors.
Modify the code to more thoroughly check for an 'eh.selector' call in
successors. It only looks at a successor if we get there as a result of an
unconditional branch.
Testcase ObjC/exceptions-4.m in r126968.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126969 91177308-0d34-0410-b5e6-96231b3b80d8
and iprintf is available on the target. Currently iprintf is only
marked as being available on the XCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126935 91177308-0d34-0410-b5e6-96231b3b80d8
This is much faster than using a pointer to a ManagedStatic object accessed with
a function call. The greedy register allocator is 5% faster overall just from
the SlotIndex default constructor savings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126925 91177308-0d34-0410-b5e6-96231b3b80d8
The SlotIndex created by the default construction does not represent a position
in the function, and it doesn't make sense to compare it to other indexes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126924 91177308-0d34-0410-b5e6-96231b3b80d8
uses.
The result produced by the streamer is used to give the linker more accurate
information and to add to llvm.compiler.used. The second improvement removes
the need for the user to add __attribute__((used)) to functions only used in
inline asm. The first one lets us build firefox with LTO on Darwin :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126830 91177308-0d34-0410-b5e6-96231b3b80d8
This method could probably be used by LiveIntervalAnalysis::shrinkToUses, and
now it can use extendIntervalEndTo() which coalesces ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126803 91177308-0d34-0410-b5e6-96231b3b80d8
Use 8 bits from line number field to keep track of argument ordering while encoding debug info for an argument. That leaves 24 bit for line no, DebugLoc also allocates 24 bit for line numbers. If a function has more than 255 arguments then rest of the arguments will be ordered by llvm.dbg.* intrinsics' ordering in IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126793 91177308-0d34-0410-b5e6-96231b3b80d8