specified in the same file that the library itself is created. This is
more idiomatic for CMake builds, and also allows us to correctly specify
dependencies that are missed due to bugs in the GenLibDeps perl script,
or change from compiler to compiler. On Linux, this returns CMake to
a place where it can relably rebuild several targets of LLVM.
I have tried not to change the dependencies from the ones in the current
auto-generated file. The only places I've really diverged are in places
where I was seeing link failures, and added a dependency. The goal of
this patch is not to start changing the dependencies, merely to move
them into the correct location, and an explicit form that we can control
and change when necessary.
This also removes a serialization point in the build because we don't
have to scan all the libraries before we begin building various tools.
We no longer have a step of the build that regenerates a file inside the
source tree. A few other associated cleanups fall out of this.
This isn't really finished yet though. After talking to dgregor he urged
switching to a single CMake macro to construct libraries with both
sources and dependencies in the arguments. Migrating from the two macros
to that style will be a follow-up patch.
Also, llvm-config is still generated with GenLibDeps.pl, which means it
still has slightly buggy dependencies. The internal CMake
'llvm-config-like' macro uses the correct explicitly specified
dependencies however. A future patch will switch llvm-config generation
(when using CMake) to be based on these deps as well.
This may well break Windows. I'm getting a machine set up now to dig
into any failures there. If anyone can chime in with problems they see
or ideas of how to solve them for Windows, much appreciated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136433 91177308-0d34-0410-b5e6-96231b3b80d8
an assert on Darwin llvm-gcc builds.
Assertion failed: (castIsValid(op, S, Ty) && "Invalid cast!"), function Create, file /Users/buildslave/zorg/buildbot/smooshlab/slave-0.8/build.llvm-gcc-i386-darwin9-RA/llvm.src/lib/VMCore/Instructions.cpp, li\
ne 2067.
etc.
http://smooshlab.apple.com:8013/builders/llvm-gcc-i386-darwin9-RA/builds/2354
--- Reverse-merging r134893 into '.':
U include/llvm/Target/TargetData.h
U include/llvm/DerivedTypes.h
U tools/bugpoint/ExtractFunction.cpp
U unittests/Support/TypeBuilderTest.cpp
U lib/Target/ARM/ARMGlobalMerge.cpp
U lib/Target/TargetData.cpp
U lib/VMCore/Constants.cpp
U lib/VMCore/Type.cpp
U lib/VMCore/Core.cpp
U lib/Transforms/Utils/CodeExtractor.cpp
U lib/Transforms/Instrumentation/ProfilingUtils.cpp
U lib/Transforms/IPO/DeadArgumentElimination.cpp
U lib/CodeGen/SjLjEHPrepare.cpp
--- Reverse-merging r134888 into '.':
G include/llvm/DerivedTypes.h
U include/llvm/Support/TypeBuilder.h
U include/llvm/Intrinsics.h
U unittests/Analysis/ScalarEvolutionTest.cpp
U unittests/ExecutionEngine/JIT/JITTest.cpp
U unittests/ExecutionEngine/JIT/JITMemoryManagerTest.cpp
U unittests/VMCore/PassManagerTest.cpp
G unittests/Support/TypeBuilderTest.cpp
U lib/Target/MBlaze/MBlazeIntrinsicInfo.cpp
U lib/Target/Blackfin/BlackfinIntrinsicInfo.cpp
U lib/VMCore/IRBuilder.cpp
G lib/VMCore/Type.cpp
U lib/VMCore/Function.cpp
G lib/VMCore/Core.cpp
U lib/VMCore/Module.cpp
U lib/AsmParser/LLParser.cpp
U lib/Transforms/Utils/CloneFunction.cpp
G lib/Transforms/Utils/CodeExtractor.cpp
U lib/Transforms/Utils/InlineFunction.cpp
U lib/Transforms/Instrumentation/GCOVProfiling.cpp
U lib/Transforms/Scalar/ObjCARC.cpp
U lib/Transforms/Scalar/SimplifyLibCalls.cpp
U lib/Transforms/Scalar/MemCpyOptimizer.cpp
G lib/Transforms/IPO/DeadArgumentElimination.cpp
U lib/Transforms/IPO/ArgumentPromotion.cpp
U lib/Transforms/InstCombine/InstCombineCompares.cpp
U lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
U lib/Transforms/InstCombine/InstCombineCalls.cpp
U lib/CodeGen/DwarfEHPrepare.cpp
U lib/CodeGen/IntrinsicLowering.cpp
U lib/Bitcode/Reader/BitcodeReader.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134949 91177308-0d34-0410-b5e6-96231b3b80d8
This tightens up checking for overflow in alloca sizes, based on feedback
from Duncan and John about the change in r132926.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134749 91177308-0d34-0410-b5e6-96231b3b80d8
all over the place in different styles and variants. Standardize on two
preferred entrypoints: one that takes a StructType and ArrayRef, and one that
takes StructType and varargs.
In cases where there isn't a struct type convenient, we now add a
ConstantStruct::getAnon method (whose name will make more sense after a few
more patches land).
It would be "really really nice" if the ConstantStruct::get and
ConstantVector::get methods didn't make temporary std::vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133412 91177308-0d34-0410-b5e6-96231b3b80d8
might overflow. Re-typing the alloca to a larger type (e.g. double)
hoists a shift into the alloca, potentially exposing overflow in the
expression. rdar://problem/9265821
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132926 91177308-0d34-0410-b5e6-96231b3b80d8
crc32.[8|16|32] have been renamed to .crc32.32.[8|16|32] and
crc64.[8|16|32] have been renamed to .crc32.64.[8|64].
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132163 91177308-0d34-0410-b5e6-96231b3b80d8
It's better to do this in codegen, mul.with.overflow(X, 2) is more canonical because it has only one use on "X".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131798 91177308-0d34-0410-b5e6-96231b3b80d8
As an example, the change to InstCombineCalls catches a common case where a call to a bitcast of a function is rewritten.
Chris, does this approach look reasonable?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131516 91177308-0d34-0410-b5e6-96231b3b80d8
This automagically provides a transform noticed by my super-optimizer
as occurring quite often: "rem x, (select cond, x, 1)" -> 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130694 91177308-0d34-0410-b5e6-96231b3b80d8
This obviously helps a lot if the division would be turned into a libcall
(think i64 udiv on i386), but div is also one of the few remaining instructions
on modern CPUs that become more expensive when the bitwidth gets bigger.
This also helps register pressure on i386 when dividing chars, divb needs
two 8-bit parts of a 16 bit register as input where divl uses two registers.
int foo(unsigned char a) { return a/10; }
int bar(unsigned char a, unsigned char b) { return a/b; }
compiles into (x86_64)
_foo:
imull $205, %edi, %eax
shrl $11, %eax
ret
_bar:
movzbl %dil, %eax
divb %sil, %al
movzbl %al, %eax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130615 91177308-0d34-0410-b5e6-96231b3b80d8
This shouldn't happen in practice because the icmp would be a constant.
Add a check so we don't miscompile code if something goes wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130446 91177308-0d34-0410-b5e6-96231b3b80d8
effective in avoiding recomputation of LCSSA form; the widespread
use of instsimplify (which looks through phi nodes) means it was
not preserving LCSSA form anyway; and instcombine is no longer
scheduled in the middle of the loop passes so this doesn't matter
anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130301 91177308-0d34-0410-b5e6-96231b3b80d8
when X has multiple uses. This is useful for exposing secondary optimizations,
but the X86 backend isn't ready for this when X has a single use. For example,
this can disable load folding.
This is inching towards resolving PR6627.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130238 91177308-0d34-0410-b5e6-96231b3b80d8
canonical, and generally leads to better code. Found while looking at
an article about saturating arithmetic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129545 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.
First part of <rdar://problem/8460511>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129401 91177308-0d34-0410-b5e6-96231b3b80d8
space info. We crash with an assert in this case. This change checks that the
address space of the bitcasted pointer is the same as the gep ptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128884 91177308-0d34-0410-b5e6-96231b3b80d8
It's possible to craft an input that hits the recursion limits in a way
that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits
can infer which bits are zero.
No test case as it depends on too many other things. Fixes PR9609.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128777 91177308-0d34-0410-b5e6-96231b3b80d8
- Localize the check if an icmp has one use to a place where we know we're
introducing something that's likely more expensive than a sext from i1.
- Add an assert to make sure a case that would lead to a miscompilation is
folded away earlier.
- Fix a typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128744 91177308-0d34-0410-b5e6-96231b3b80d8
removes one use of X which helps it pass the many hasOneUse() checks.
In my analysis, this turns up very often where X = A >>exact B and that can't be
simplified unless X has one use (except by increasing the lifetime of A which is
generally a performance loss).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128373 91177308-0d34-0410-b5e6-96231b3b80d8
load and store reference same memory location, the memory location
is represented by getelementptr with two uses (load and store) and
the getelementptr's base is alloca with single use. At this point,
instructions from alloca to store can be removed.
(this pattern is generated when bitfield is accessed.)
For example,
%u = alloca %struct.test, align 4 ; [#uses=1]
%0 = getelementptr inbounds %struct.test* %u, i32 0, i32 0;[#uses=2]
%1 = load i8* %0, align 4 ; [#uses=1]
%2 = and i8 %1, -16 ; [#uses=1]
%3 = or i8 %2, 5 ; [#uses=1]
store i8 %3, i8* %0, align 4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127565 91177308-0d34-0410-b5e6-96231b3b80d8
This happens a lot in clang-compiled C++ code because it adds overflow checks to operator new[]:
unsigned *foo(unsigned n) { return new unsigned[n]; }
We can optimize away the overflow check on 64 bit targets because (uint64_t)n*4 cannot overflow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127418 91177308-0d34-0410-b5e6-96231b3b80d8
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127116 91177308-0d34-0410-b5e6-96231b3b80d8
possible. This goes into instcombine and instsimplify because instsimplify
doesn't need to check hasOneUse since it returns (almost exclusively) constants.
This fixes PR9343 #4#5 and #8!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127064 91177308-0d34-0410-b5e6-96231b3b80d8
intersection of the LHS and RHS ConstantRanges and return "false" when
the range is empty.
This simplifies some code and catches some extra cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126744 91177308-0d34-0410-b5e6-96231b3b80d8
function prototype into a call to a varargs prototype. We do
allow the xform if we have a definition, but otherwise we don't
want to risk that we're changing the abi in a subtle way. On
X86-64, for example, varargs require passing stuff in %al.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126363 91177308-0d34-0410-b5e6-96231b3b80d8
We usually catch this kind of optimization through InstSimplify's distributive
magic, but or doesn't distribute over xor in general.
"A | ~(A | B) -> A | ~B" hits 24 times on gcc.c.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126081 91177308-0d34-0410-b5e6-96231b3b80d8
variations (some of these were already present so I unified the code). Spotted by my
auto-simplifier as occurring a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125734 91177308-0d34-0410-b5e6-96231b3b80d8
unsigned overflow (e.g. due to a negative array index), but
the scales on array size multiplications are known to not
sign wrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125409 91177308-0d34-0410-b5e6-96231b3b80d8
gep to explicit addressing, we know that none of the intermediate
computation overflows.
This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a
negative index?
Previously the testcase would instcombine to:
define i1 @test(i64 %i) {
%p1.idx.mask = and i64 %i, 4611686018427387903
%cmp = icmp eq i64 %p1.idx.mask, 1000
ret i1 %cmp
}
now we get:
define i1 @test(i64 %i) {
%cmp = icmp eq i64 %i, 1000
ret i1 %cmp
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125271 91177308-0d34-0410-b5e6-96231b3b80d8
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.
Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck. I believe that this takes care of a bunch of related
code quality issues attached to PR8862.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125267 91177308-0d34-0410-b5e6-96231b3b80d8
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts. For example, these (which
are the same except the first is 'exact' sdiv:
define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
%A = sdiv exact i64 %X, -5 ; X/-5 == 0 --> x == 0
%B = icmp eq i64 %A, 0
ret i1 %B
}
define i1 @sdiv_icmp4(i64 %X) nounwind {
%A = sdiv i64 %X, -5 ; X/-5 == 0 --> x == 0
%B = icmp eq i64 %A, 0
ret i1 %B
}
compile down to:
define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
%1 = icmp eq i64 %X, 0
ret i1 %1
}
define i1 @sdiv_icmp4(i64 %X) nounwind {
%X.off = add i64 %X, 4
%1 = icmp ult i64 %X.off, 9
ret i1 %1
}
This happens when you do something like:
(ptr1-ptr2) == 42
where the pointers are pointers to non-unit types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125266 91177308-0d34-0410-b5e6-96231b3b80d8
and generally tidying things up. Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.
InstCombineAddSub.cpp | 296 +++++++++++++++++++++-----------------------------
1 file changed, 126 insertions(+), 170 deletions(-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125264 91177308-0d34-0410-b5e6-96231b3b80d8
versions of creation functions. Eventually, the "insertion point" versions
of these should just be removed, we do have IRBuilder afterall.
Do a massive rewrite of much of pattern match. It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.
Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRbuilder by having these methods chain to
each other more instead of duplicating code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125194 91177308-0d34-0410-b5e6-96231b3b80d8
reassociation. No testcase, because I wasn't able to create a testcase
which actually demonstrates a problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124713 91177308-0d34-0410-b5e6-96231b3b80d8
benchmarks, and that it can be simplified to X/Y. (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case). This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too. This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too. It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124487 91177308-0d34-0410-b5e6-96231b3b80d8
clang's -Wuninitialized-experimental warning.
While these don't look like real bugs, clang's
-Wuninitialized-experimental analysis is stricter
than GCC's, and these fixes have the benefit
of being general nice cleanups.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124073 91177308-0d34-0410-b5e6-96231b3b80d8
A == B, and A > B, does not mean we can fold it to true. We still need to
check for A ? B (A unordered B).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123993 91177308-0d34-0410-b5e6-96231b3b80d8
a select. A vector select is pairwise on each element so we'd need a new
condition with the right number of elements to select on. Fixes PR8994.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123963 91177308-0d34-0410-b5e6-96231b3b80d8
auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0.
This patch adds this transform and some related logic to InstructionSimplify
and removes some of the logic from instcombine (unfortunately not all because
there are several situations in which instcombine can improve things by making
new instructions, whereas instsimplify is not allowed to do this). At -O2 this
often results in more than 15% more simplifications by early-cse, and results in
hundreds of lines of bitcode being eliminated from the testsuite. I did see some
small negative effects in the testsuite, for example a few additional instructions
in three programs. One program, 483.xalancbmk, got an additional 35 instructions,
which seems to be due to a function getting an additional instruction and then
being inlined all over the place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123911 91177308-0d34-0410-b5e6-96231b3b80d8
multiple uses. In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi. In the testcase
this pushes an add out of the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123568 91177308-0d34-0410-b5e6-96231b3b80d8
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result can
not be anything. I fixed this in the constant folder as well. Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero. This is in accordance with the LangRef, but I must
admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123417 91177308-0d34-0410-b5e6-96231b3b80d8
X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X
Instead of calculating this with mixed types promote all to the
larger type. This enables scalar evolution to analyze this
expression. PR8866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123034 91177308-0d34-0410-b5e6-96231b3b80d8
if both A op B and A op C simplify. This fires fairly often but doesn't
make that much difference. On gcc-as-one-file it removes two "and"s and
turns one branch into a select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122399 91177308-0d34-0410-b5e6-96231b3b80d8