indicates whether the intrinsic has a defined result for a first
argument equal to zero. This will eventually allow these intrinsics to
accurately model the semantics of GCC's __builtin_ctz and __builtin_clz
and the X86 instructions (prior to AVX) which implement them.
This patch merely sets the stage by extending the signature of these
intrinsics and establishing auto-upgrade logic so that the old spelling
still works both in IR and in bitcode. The upgrade logic preserves the
existing (inefficient) semantics. This patch should not change any
behavior. CodeGen isn't updated because it can use the existing
semantics regardless of the flag's value.
Note that this will be followed by API updates to Clang and DragonEgg.
Reviewed by Nick Lewycky!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146357 91177308-0d34-0410-b5e6-96231b3b80d8
Since we're not rewriting IVs in other loops, there's not much reason
to consider their stride when generating formulae.
This should reduce the number of useless formulas considered by LSR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146302 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by Brendon Cahoon!
This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146245 91177308-0d34-0410-b5e6-96231b3b80d8
- Walking over pred_begin/pred_end is an expensive operation.
- PHINodes contain a value for each predecessor anyway.
- While it may look like we used to save a few iterations with the set,
be aware that getIncomingValueForBlock does a linear search on
the values of the phi node.
- Another -5% on ARMDisassembler.cpp (Release build). This was the last
entry in the profile that was obviously wasting time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145937 91177308-0d34-0410-b5e6-96231b3b80d8
It's always good to prune early, but formulae that are unsatisfactory
in their own right need to be removed before running any other pruning
heuristics. We easily avoid generating such formulae, but we need them
as an intermediate basis for forming other good formulae.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145906 91177308-0d34-0410-b5e6-96231b3b80d8
where this would be bad as the backend shouldn't have a problem inlining small
memcpys.
rdar://10510150
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145865 91177308-0d34-0410-b5e6-96231b3b80d8
- Calling getUser in a loop is much more expensive than iterating over a few instructions.
- Use it instead of the open-coded loop in AddrModeMatcher.
- 5% speedup on ARMDisassembler.cpp Release builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145810 91177308-0d34-0410-b5e6-96231b3b80d8
The callee is usually smaller than the caller, too. This reduces the compile
time of ARMDisassembler.cpp by 32% (Release build). It still takes ages to
compile though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145690 91177308-0d34-0410-b5e6-96231b3b80d8
weak variable are compiled by different compilers, such as GCC and LLVM, while
LLVM may increase the alignment to the preferred alignment there is no reason to
think that GCC will use anything more than the ABI alignment. Since it is the
GCC version that might end up in the final program (as the linkage is weak), it
is wrong to increase the alignment of loads from the global up to the preferred
alignment as the alignment might only be the ABI alignment.
Increasing alignment up to the ABI alignment might be OK, but I'm not totally
convinced that it is. It seems better to just leave the alignment of weak
globals alone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145413 91177308-0d34-0410-b5e6-96231b3b80d8
gcc, though I thought it was older (my gcc 4.4 has it as a local patch. Whoops!)
This fixes PR10589.
Also add some debugging statements.
Remove GcnoFiles, the mapping from CompilationUnit to raw_ostream. Now that we
start by iterating over each CU and descending into them, there's no need to
maintain a mapping.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145208 91177308-0d34-0410-b5e6-96231b3b80d8
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.
Fixes PR11350: runaway iteration assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144935 91177308-0d34-0410-b5e6-96231b3b80d8