the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127116 91177308-0d34-0410-b5e6-96231b3b80d8
This is just very first approximation how the stuff should be done
(e.g. ARM-only for now). More to follow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127101 91177308-0d34-0410-b5e6-96231b3b80d8
This makes lookup slightly more expensive but it's worth it, unused
DenseMaps are common in LLVM code apparently.
1% speedup on clang -O3 bzip2.c
4% speedup on clang -O3 oggenc.c (Release build of clang on i386/linux)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127088 91177308-0d34-0410-b5e6-96231b3b80d8
regs. This is the only change in this checkin that may affects the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.
Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It is currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.
Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127067 91177308-0d34-0410-b5e6-96231b3b80d8
Initially, slot indexes are quad-spaced. There is room for inserting up to 3
new instructions between the original instructions.
When we run out of indexes between two instructions, renumber locally using
double-spaced indexes. The original quad-spacing means that we catch up quickly,
and we only have to renumber a handful of instructions to get a monotonic
sequence. This is much faster than renumbering the whole function as we did
before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127023 91177308-0d34-0410-b5e6-96231b3b80d8
it. It's been assumed up til now that it would be in its immediate
successor. However, this isn't necessarily the case. It could be in one of its
successor's successors.
Modify the code to more thoroughly check for an 'eh.selector' call in
successors. It only looks at a successor if we get there as a result of an
unconditional branch.
Testcase ObjC/exceptions-4.m in r126968.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126969 91177308-0d34-0410-b5e6-96231b3b80d8
and iprintf is available on the target. Currently iprintf is only
marked as being available on the XCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126935 91177308-0d34-0410-b5e6-96231b3b80d8
This is much faster than using a pointer to a ManagedStatic object accessed with
a function call. The greedy register allocator is 5% faster overall just from
the SlotIndex default constructor savings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126925 91177308-0d34-0410-b5e6-96231b3b80d8
The SlotIndex created by the default construction does not represent a position
in the function, and it doesn't make sense to compare it to other indexes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126924 91177308-0d34-0410-b5e6-96231b3b80d8
uses.
The result produced by the streamer is used to give the linker more accurate
information and to add to llvm.compiler.used. The second improvement removes
the need for the user to add __attribute__((used)) to functions only used in
inline asm. The first one lets us build firefox with LTO on Darwin :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126830 91177308-0d34-0410-b5e6-96231b3b80d8
This method could probably be used by LiveIntervalAnalysis::shrinkToUses, and
now it can use extendIntervalEndTo() which coalesces ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126803 91177308-0d34-0410-b5e6-96231b3b80d8
Use 8 bits from line number field to keep track of argument ordering while encoding debug info for an argument. That leaves 24 bit for line no, DebugLoc also allocates 24 bit for line numbers. If a function has more than 255 arguments then rest of the arguments will be ordered by llvm.dbg.* intrinsics' ordering in IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126793 91177308-0d34-0410-b5e6-96231b3b80d8
a StringRef, for the benefit of clients that want the result as a
nul-terminated string. Clients that expect a StringRef will get one via
the implicit conversion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126784 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce a variable in the AsmParserExtension whether [] is valid in an
expression. If it is true, parse them like (). Enable this for ELF only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126443 91177308-0d34-0410-b5e6-96231b3b80d8
registers at phis. This enables us to eliminate a lot of pointless zexts during
the DAGCombine phase. This fixes <rdar://problem/8760114>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126380 91177308-0d34-0410-b5e6-96231b3b80d8
events on the thread and wait until a resource is ready to event. The vector
of the resource that is ready is returned.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126320 91177308-0d34-0410-b5e6-96231b3b80d8
template <class T1, class T2> pair<T1,T2> make_pair(const T1&, const T2&);
to
template <class T1, class T2> pair<V1, V2> make_pair(T1&&, T2&&);
so explicitly specifying the template arguments to make_pair<> is going to break
when C++0x rolls through. Replace them with equivalent std::pair<>. Patch by
James Dennett!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126256 91177308-0d34-0410-b5e6-96231b3b80d8
share entries. Add a DenseSet to MachineConstantPool for the MachineCPVs that
it owns.
This will hopefully fix the MC/ARM/elf-reloc-01.ll failure on the leaks bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126218 91177308-0d34-0410-b5e6-96231b3b80d8
at phis. This enables us to eliminate a lot of pointless zexts during the DAGCombine
phase. This fixes <rdar://problem/8760114>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126170 91177308-0d34-0410-b5e6-96231b3b80d8
In other words, do not keep track of argument's location. The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body.
This requires some coordination with debugger to get this working.
- The debugger needs to be aware of prolog_end attribute attached with line table entries.
- The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126155 91177308-0d34-0410-b5e6-96231b3b80d8
itself without going via a phi node then we could return false here in
spite of making a change. Also, tweak the comment because this method
can (and always could) return true without deleting the original phi node.
For example, if the phi node was used by a read-only invoke instruction
which is used by another phi node phi2 which is only used by and only uses
the invoke, then phi2 would be deleted but not the invoke instruction and
not the original phi node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126129 91177308-0d34-0410-b5e6-96231b3b80d8
need to be pulled out of the pass manager when the user specifies
-fno-builtin. It can intelligently determine which libcalls to
optimize based on what is enabled in TargetLibraryInfo. This
allows -fno-builtin-foo to work someday.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125981 91177308-0d34-0410-b5e6-96231b3b80d8
A major part of its (eventual) goal is to support a much cleaner separation between disassembly callbacks
provided by the target and the disassembler emitter itself, i.e. not requiring hardcoding of knowledge in tblgen
like the existing disassembly emitters do.
The hope is that some day this will allow us to replace the existing non-Thumb ARM disassembler and remove
some of the hacks the old one introduced to tblgen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125966 91177308-0d34-0410-b5e6-96231b3b80d8
query about available library functions. For now this just has
memset_pattern16, which exists on darwin, but it can be extended for a
bunch of other things in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125965 91177308-0d34-0410-b5e6-96231b3b80d8
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125747 91177308-0d34-0410-b5e6-96231b3b80d8
Simplify the spill weight calculation a bit by bypassing
getApproximateInstructionCount() and using LiveInterval::getSize() directly.
This changes the computed spill weights, but only by a constant factor in each
function. It should not affect how spill weights compare against each other, and
so it shouldn't affect code generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125530 91177308-0d34-0410-b5e6-96231b3b80d8
use in many places where we pass a pointer and size to abstract APIs
that can take C arrays, std::vector, SmallVector, etc. It is to arrays
what StringRef is to strings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125486 91177308-0d34-0410-b5e6-96231b3b80d8
generating i8 shift amounts for things like i1024 types. Add
an assert in getNode to prevent this from occuring in the future,
fix the buggy transformation, revert my previous patch, and
document this gotcha in ISDOpcodes.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125465 91177308-0d34-0410-b5e6-96231b3b80d8
objects, since they'll end up using the implicit conversion to "bool"
and causing some very "fun" surprises.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125380 91177308-0d34-0410-b5e6-96231b3b80d8
for NSW/NUW binops to follow the pattern of exact binops. This
allows someone to use Builder.CreateAdd(x, y, "tmp", MaybeNUW);
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125270 91177308-0d34-0410-b5e6-96231b3b80d8
name of a path, after resolving symbolic links and eliminating excess
path elements such as "foo/../" and "./".
This routine still needs a Windows implementation, but I don't have a
Windows machine available. Help? Please?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125228 91177308-0d34-0410-b5e6-96231b3b80d8
versions of creation functions. Eventually, the "insertion point" versions
of these should just be removed, we do have IRBuilder afterall.
Do a massive rewrite of much of pattern match. It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.
Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRbuilder by having these methods chain to
each other more instead of duplicating code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125194 91177308-0d34-0410-b5e6-96231b3b80d8
AC_CHECK_FUNCS seeks a symbol only in libs. We should check the declaration in string.h.
FIXME: I have never seen mingw(s) have strerror_s() (not _strerror_s()).
FIXME: Autoconf/CMake may seek strerror_s() with the definition MINGW_HAS_SECURE_API in future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125172 91177308-0d34-0410-b5e6-96231b3b80d8
This is a lot easier than trying to get kill flags right during live range
splitting and rematerialization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125113 91177308-0d34-0410-b5e6-96231b3b80d8
After uses of a live range are removed, recompute the live range to only cover
the remaining uses. This is necessary after rematerializing the value before
some (but not all) uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125058 91177308-0d34-0410-b5e6-96231b3b80d8
Motivation: Improve the parsing of not usual (different from registers or
immediates) operand forms.
This commit implements only the generic support. The ARM specific modifications
will come next.
A table like the one below is autogenerated for every instruction
containing a 'ParserMethod' in its AsmOperandClass
static const OperandMatchEntry OperandMatchTable[20] = {
/* Mnemonic, Operand List Mask, Operand Class, Features */
{ "cdp", 29 /* 0, 2, 3, 4 */, MCK_Coproc, Feature_IsThumb|Feature_HasV6 },
{ "cdp", 58 /* 1, 3, 4, 5 */, MCK_Coproc, Feature_IsARM },
A matcher function very similar (but lot more naive) to
MatchInstructionImpl scans the table. After the mnemonic match, the
features are checked and if the "to be parsed" operand index is
present in the mask, there's a real match. Then, a switch like the one
below dispatch the parsing to the custom method provided in
'ParseMethod':
case MCK_Coproc:
return TryParseCoprocessorOperandName(Operands);
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125030 91177308-0d34-0410-b5e6-96231b3b80d8
config.h.* have conditions whether each symbol is defined or not.
Autoconf and CMake may check symbols in libgcc.a for JIT on Mingw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124950 91177308-0d34-0410-b5e6-96231b3b80d8
a) Making it a per call site bonus for functions that we can move from
indirect to direct calls.
b) Reduces the bonus from 500 to 100 per call site.
c) Subtracts the size of the possible newly inlineable call from the
bonus to only add a bonus if we can inline a small function to devirtualize
it.
Also changes the bonus from a positive that's subtracted to a negative
that's added.
Fixes the remainder of rdar://8546196 by reducing the object file size
after inlining by 84%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124916 91177308-0d34-0410-b5e6-96231b3b80d8
A live range cannot be split everywhere in a basic block. A split must go before
the first terminator, and if the variable is live into a landing pad, the split
must happen before the call that can throw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124894 91177308-0d34-0410-b5e6-96231b3b80d8
precisely track pressure on a selection DAG, but we can at least keep
it balanced. This design accounts for various interesting aspects of
selection DAGS: register and subregister copies, glued nodes, dead
nodes, unused registers, etc.
Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter.
Note: I disabled PrescheduleNodesWithMultipleUses when register
pressure is enabled, based on no evidence other than I don't think it
makes sense to have both enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124853 91177308-0d34-0410-b5e6-96231b3b80d8
The greedy register allocator revealed some problems with the value mapping in
SplitKit. We would sometimes start mapping values before all defs were known,
and that could change a value from a simple 1-1 mapping to a multi-def mapping
that requires ssa update.
The new approach collects all defs and register assignments first without
filling in any live intervals. Only when finish() is called, do we compute
liveness and mapped values. At this time we know with certainty which values map
to multiple values in a split range.
This also has the advantage that we can compute live ranges based on the
remaining uses after rematerializing at split points.
The current implementation has many opportunities for compile time optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124765 91177308-0d34-0410-b5e6-96231b3b80d8
may be useful to understand "none", this is not the place for it. Tweak
the fix to Normalize while there: the fix added in 123990 works correctly,
but I like this way better. Finally, now that Triple understands some
non-trivial environment values, teach the unittests about them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124720 91177308-0d34-0410-b5e6-96231b3b80d8
the load, then it may be legal to transform the load and store to integer
load and store of the same width.
This is done if the target specified the transformation as profitable. e.g.
On arm, this can transform:
vldr.32 s0, []
vstr.32 s0, []
to
ldr r12, []
str r12, []
rdar://8944252
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124708 91177308-0d34-0410-b5e6-96231b3b80d8
Modified patch by Adam Preuss.
This builds on the existing framework for block tracing, edge profiling and optimal edge profiling.
See -help-hidden for new flags.
For documentation, see the technical report "Implementation of Path Profiling..." in llvm.org/pubs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124515 91177308-0d34-0410-b5e6-96231b3b80d8
benchmarks, and that it can be simplified to X/Y. (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case). This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too. This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too. It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124487 91177308-0d34-0410-b5e6-96231b3b80d8
default implementation for x86, going through the stack in a similr
fashion to how the codegen implements BUILD_VECTOR. Eventually this
will get matched to VINSERTF128 if AVX is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124307 91177308-0d34-0410-b5e6-96231b3b80d8
implementation of EXTRACT_SUBVECTOR for x86, going through the stack
in a similr fashion to how the codegen implements BUILD_VECTOR.
Eventually this will get matched to VEXTRACTF128 if AVX is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124292 91177308-0d34-0410-b5e6-96231b3b80d8
a few loops accordingly. Should be no functional change.
This is a step for more accurate cost/benefit analysis of devirt/inlining
bonuses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124275 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used to check patterns referencing a forthcoming
INSERT_SUBVECTOR SDNode and will also be used to check
EXTRACT_SUBVECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124191 91177308-0d34-0410-b5e6-96231b3b80d8
optimized code are:
(non-negative number)+(power-of-two) != 0 -> true
and
(x | 1) != 0 -> true
Instcombine knows about the second one of course, but only does it if X|1
has only one use. These fire thousands of times in the testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124183 91177308-0d34-0410-b5e6-96231b3b80d8
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.
Also, update several off GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124134 91177308-0d34-0410-b5e6-96231b3b80d8
computation, the Ancestor field is always set to the Parent, so we can remove
the explicit link entirely and merge the Parent and Ancestor fields. Instead of
checking for whether an ancestor exists for a node or not, we simply check
whether the node has already been processed. This is simpler if Compress is
inlined into Eval, so I did that as well.
This is about a 3% speedup running -domtree on test-suite + SPEC2000 & SPEC2006,
but it also opens up some opportunities for further improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124061 91177308-0d34-0410-b5e6-96231b3b80d8
flags. They are still not enable in this revision.
Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.
Generalized unit tests to work with sched-cycles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123969 91177308-0d34-0410-b5e6-96231b3b80d8
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.
Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.
ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
to re-materialize the instruction, allow machine LICM to hoist the set of
instructions out of the loop and make it possible to CSE them. It's a bit
hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.
With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
checks enabled:
1) Use '<' to compare integers in a comparison function rather than '<='.
2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.
The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123662 91177308-0d34-0410-b5e6-96231b3b80d8
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.
Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123609 91177308-0d34-0410-b5e6-96231b3b80d8