There are still places which treat the Attribute object as a collection of
attributes. I'm systematically removing them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173990 91177308-0d34-0410-b5e6-96231b3b80d8
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173602 91177308-0d34-0410-b5e6-96231b3b80d8
The 'getSlot' function and its ilk allow introspection into the AttributeSet
class. However, that class should be opaque. Allow access through accessor
methods instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173522 91177308-0d34-0410-b5e6-96231b3b80d8
This does the right thing unless the multiplication overflows, but the old code
didn't handle that case either.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173276 91177308-0d34-0410-b5e6-96231b3b80d8
Collections of attributes are handled via the AttributeSet class now. This
finally frees us up to make significant changes to how attributes are structured.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173228 91177308-0d34-0410-b5e6-96231b3b80d8
This is more code to isolate the use of the Attribute class to that of just
holding one attribute instead of a collection of attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173094 91177308-0d34-0410-b5e6-96231b3b80d8
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172853 91177308-0d34-0410-b5e6-96231b3b80d8
Because the Attribute class is going to stop representing a collection of
attributes, limit the use of it as an aggregate in favor of using AttributeSet.
This replaces some of the uses for querying the function attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172844 91177308-0d34-0410-b5e6-96231b3b80d8
some optimization opportunities (in the enclosing supper-expressions).
rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
if expression "-0.0 - X" has only one reference.
rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
if expression "0.0 - X" has only one reference, and
the instruction is marked "noSignedZero".
2. Eliminate negation (The compiler was already able to handle these
opt if the 0.0s are replaced with -0.0.)
rule 3: (0.0 - X) * (0.0 - Y) => X * Y
rule 4: (0.0 - X) * C => X * -C
if the expr is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
if X!=Y and the expression is flagged with "UnsafeAlgebra".
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code about simplifiying "X * select".
The reasons are following:
a) The "select" is somewhat architecture-dependent, therefore the
higher level optimizers are not able to precisely predict if
the simplification really yields any performance improvement
or not.
b) The "select" operator is bit complicate, and tends to obscure
optimization opportunities. It is btter to keep it as low as
possible in expr tree, and let CodeGen to tackle the optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172551 91177308-0d34-0410-b5e6-96231b3b80d8
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
rule1: (X/C1) / C2 => X / (C2*C1) (if C_A)
=> X * (1/(C2*C1)) (if C_A && C_R)
rule 2: X*C1 / C2 => X * (C1/C2) if C_A
rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value)
rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3)
rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172488 91177308-0d34-0410-b5e6-96231b3b80d8
- this expression is explicitly marked no-signed-zero, or
- no-signed-zero of this expression can be derived from some context.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171922 91177308-0d34-0410-b5e6-96231b3b80d8
o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal)
o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp)
Let MDC denote multiplication or dividion with one & only one operand being a constant
o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
(so long as the constant-folding doesn't yield any denormal or special value)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171793 91177308-0d34-0410-b5e6-96231b3b80d8
turning a code like this:
if (foo)
free(foo)
into that:
free(foo)
Move a call to free from basic block FB into FB's predecessor, P,
when the path from P to FB is taken only if the argument of free is
not equal to NULL.
Some restrictions apply on P and FB to be sure that this code motion
is profitable. Namely:
1. FB must have only one predecessor P.
2. FB must contain only the call to free plus an unconditional
branch to S.
3. P's successors are FB and S.
Because of 1., we will not increase the code size when moving the call
to free from FB to P.
Because of 2., FB will be empty after the move.
Because of 2. and 3., P's branch instruction becomes useless, so as FB
(simplifycfg will do the job).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171762 91177308-0d34-0410-b5e6-96231b3b80d8
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171366 91177308-0d34-0410-b5e6-96231b3b80d8
The later API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares).
There are a few more places left with duplicated code, which I'll remove soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171259 91177308-0d34-0410-b5e6-96231b3b80d8
such as by a compiler warning, a check in clang -fsanitizer=undefined, being
optimized to unreachable, or a combination of the above. PR14722.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171119 91177308-0d34-0410-b5e6-96231b3b80d8
When the backend is used from clang, it should produce proper diagnostics
instead of just printing messages to errs(). Other clients may also want to
register their own error handlers with the LLVMContext, and the same handler
should work for warnings in the same way as the existing emitError methods.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171041 91177308-0d34-0410-b5e6-96231b3b80d8
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.
Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.
Patch by Kevin Schoedel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170576 91177308-0d34-0410-b5e6-96231b3b80d8
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.
This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170128 91177308-0d34-0410-b5e6-96231b3b80d8
In a previous thread it was pointed out that isPowerOfTwo is not a very precise
name since it can return false for powers of two if it is unable to show that
they are powers of two.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170093 91177308-0d34-0410-b5e6-96231b3b80d8
Provides m_Argument that allows matching against a CallSite's specified argument. Provides m_Intrinsic pattern that can be templatized over the intrinsic id and bind/match arguments similarly to other pattern matchers. Implementations provided for 0 to 4 arguments, though it's very simple to extend for more. Also provides example template specialization for bswap (m_BSwap) and example of code cleanup for its use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170091 91177308-0d34-0410-b5e6-96231b3b80d8
been used in the first place. It simply was passed to the function and to the
recursive invocations. Simply drop the parameter and update the callers for the
new signature.
Patch by Saleem Abdulrasool!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169988 91177308-0d34-0410-b5e6-96231b3b80d8
This change attempts to simplify (X^Y) -> X or Y in the user's context if we know that
only bits from X or Y are demanded.
A minimized case is provided bellow. This change will simplify "t>>16" into "var1 >>16".
=============================================================
unsigned foo (unsigned val1, unsigned val2) {
unsigned t = val1 ^ 1234;
return (t >> 16) | t; // NOTE: t is used more than once.
}
=============================================================
Note that if the "t" were used only once, the expression would be finally optimized as well.
However, with with this change, the optimization will take place earlier.
Reviewed by Nadav, Thanks a lot!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169317 91177308-0d34-0410-b5e6-96231b3b80d8
missed in the first pass because the script didn't yet handle include
guards.
Note that the script is now able to handle all of these headers without
manual edits. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169224 91177308-0d34-0410-b5e6-96231b3b80d8
The type of shirt-right (logical or arithemetic) should remain unchanged
when transforming "X << C1 >> C2" into "X << (C1-C2)"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169209 91177308-0d34-0410-b5e6-96231b3b80d8
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8
The simplify-libcalls pass maintained a statistic to count the number
of library calls that have been simplified. Now that library call
simplification is being carried out in instcombine the statistic should
be moved to there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168975 91177308-0d34-0410-b5e6-96231b3b80d8
depends on the IR infrastructure, there is no sense in it being off in
Support land.
This is in preparation to start working to expand InstVisitor into more
special-purpose visitors that are still generic and can be re-used
across different passes. The expansion will go into the Analylis tree
though as nothing in VMCore needs it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168972 91177308-0d34-0410-b5e6-96231b3b80d8
My commit to migrate the printf simplifiers from the simplify-libcalls
in r168604 introduced a regression reported by Duncan [1]. The problem
is that in some cases the library call simplifier can return a new value
that has no uses and the new value's type is different than the old value's
type (which is fine because there are no uses). The specific case that
triggered the bug looked something like:
declare void @printf(i8*, ...)
...
call void (i8*, ...)* @printf(i8* %fmt)
Which we want to optimized into:
call i32 @putchar(i32 104)
However, the code was attempting to replace all uses of the printf with
the putchar and the types differ, hence a crash. This is fixed by *just*
deleting the original instruction when there are no uses. The old
simplify-libcalls pass is already doing something similar.
[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-November/056338.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168716 91177308-0d34-0410-b5e6-96231b3b80d8
InstCombineLoadStoreAlloca.cpp, which had many issues.
(At least two bugs were noted on llvm-commits, and it was overly conservative.)
Instead, use getOrEnforceKnownAlignment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168629 91177308-0d34-0410-b5e6-96231b3b80d8
Enhancement to InstCombine. Try to catch this opportunity:
---------------------------------------------------------------
((X^C1) >> C2) ^ C3 => (X>>C2) ^ ((C1>>C2)^C3)
where the subexpression "X ^ C1" has more than one uses, and
"(X^C1) >> C2" has single use.
----------------------------------------------------------------
Reviewed by Nadav (with minor change per his request).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168615 91177308-0d34-0410-b5e6-96231b3b80d8
When code deletes the context, the AttributeImpls that the AttrListPtr points to
are now invalid. Therefore, instead of keeping a separate managed static for the
AttrListPtrs that's reference counted, move it into the LLVMContext and delete
it when deleting the AttributeImpls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168354 91177308-0d34-0410-b5e6-96231b3b80d8
replaced by this patch is equivalent to the new logic, but you'd be wrong, and
that's exactly where the bug was. There's a similar bug in instsimplify which
manifests itself as instsimplify failing to simplify this, rather than doing it
wrong, see next commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168181 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the math library call simplifications from the
simplify-libcalls pass into the instcombine library call simplifier.
I have typically migrated just one simplifier at a time, but the math
simplifiers are interdependent because:
1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt.
2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on
the option -enable-double-float-shrink.
These two factors made migrating each of these simplifiers individually
more of a pain than it would be worth. So, I migrated them all together.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167815 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases the library call simplifier may need to replace instructions
other than the library call being simplified. In those cases it may be
necessary for clients of the simplifier to override how the replacements
are actually done. As such, a new overrideable method for replacing
instructions was added to LibCallSimplifier.
A new subclass of LibCallSimplifier is also defined which overrides
the instruction replacement method. This is because the instruction
combiner defines its own replacement method which updates the worklist
when instructions are replaced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167681 91177308-0d34-0410-b5e6-96231b3b80d8
r165941: Resubmit the changes to llvm core to update the functions to
support different pointer sizes on a per address space basis.
Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.
However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.
In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.
In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.
This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167222 91177308-0d34-0410-b5e6-96231b3b80d8
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.
These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.
Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)
After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.
Summary of reverted revisions:
r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
on the address space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167221 91177308-0d34-0410-b5e6-96231b3b80d8
%V = mul i64 %N, 4
%t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
%t1 = getelementptr i32* %arr, i32 %N
%t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
int foo(int *A, int N) {
return A[N];
}
because gcc turns this into byte pointer arithmetic before it hits the plugin:
D.1590_2 = (long unsigned int) N_1(D);
D.1591_3 = D.1590_2 * 4;
D.1592_5 = A_4(D) + D.1591_3;
D.1589_6 = *D.1592_5;
return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.
An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166474 91177308-0d34-0410-b5e6-96231b3b80d8
An obfuscated splat is where the frontend poorly generates code for a splat
using several different shuffles to create the splat, i.e.,
%A = load <4 x float>* %in_ptr, align 16
%B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef>
%C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef>
%D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166061 91177308-0d34-0410-b5e6-96231b3b80d8
Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165917 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements the new LibCallSimplifier class as outlined in [1].
In addition to providing the new base library simplification infrastructure,
all the fortified library call simplifications were moved over to the new
infrastructure. The rest of the library simplification optimizations will
be moved over with follow up patches.
NOTE: The original fortified library call simplifier located in the
SimplifyFortifiedLibCalls class was not removed because it is still
used by CodeGenPrepare. This class will eventually go away too.
[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052283.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165873 91177308-0d34-0410-b5e6-96231b3b80d8
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165488 91177308-0d34-0410-b5e6-96231b3b80d8
This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.
Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.
Fixes PR13694 and probably others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162841 91177308-0d34-0410-b5e6-96231b3b80d8
No test case, undefined shifts get folded early, but can occur when other
transforms generate a constant. Thanks to Duncan for bringing this up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162755 91177308-0d34-0410-b5e6-96231b3b80d8
This optimization is really just replacing allocas wholesale with
globals, there is no scalarization.
The underlying motivation for this patch is to simplify the SROA pass
and focus it on splitting and promoting allocas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162271 91177308-0d34-0410-b5e6-96231b3b80d8
- memcpy size is wrongly truncated into 32-bit and treat 8GB memcpy is
0-sized memcpy
- as 0-sized memcpy/memset is already removed before SimplifyMemTransfer
and SimplifyMemSet in visitCallInst, replace 0 checking with
assertions.
- replace getZExtValue() with getLimitedValue() according to
Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161923 91177308-0d34-0410-b5e6-96231b3b80d8
An unsigned value converted to floating-point will always be greater than
a negative constant. Unfortunately InstCombine reversed the check so that
unsigned values were being optimized to always be greater than all positive
floating-point constants. <rdar://problem/12029145>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161452 91177308-0d34-0410-b5e6-96231b3b80d8
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160874 91177308-0d34-0410-b5e6-96231b3b80d8
%shr = lshr i64 %key, 3
%0 = load i64* %val, align 8
%sub = add i64 %0, -1
%and = and i64 %sub, %shr
ret i64 %and
to:
%shr = lshr i64 %key, 3
%0 = load i64* %val, align 8
%sub = add i64 %0, 2305843009213693951
%and = and i64 %sub, %shr
ret i64 %and
The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for negated constant to make sure it is actually reducing the width
of the constant.
rdar://11793464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160101 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :)
In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159952 91177308-0d34-0410-b5e6-96231b3b80d8
This means we can do cheap DSE for heap memory.
Nothing is done if the pointer excapes or has a load.
The churn in the tests is mostly due to objectsize, since we want to make sure we
don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159876 91177308-0d34-0410-b5e6-96231b3b80d8
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.
I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.
I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.
Thanks to Bill and Eric for giving the green light for this bit of cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159421 91177308-0d34-0410-b5e6-96231b3b80d8
// C - zext(bool) -> bool ? C - 1 : C
if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
if (ZI->getSrcTy()->isIntegerTy(1))
return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);
This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
return (x - y) + 1;
}
=>
movzbl %dil, %eax
movzbl %sil, %ecx
shll $31, %ecx
sarl $31, %ecx
leal 1(%rax,%rcx), %eax
ret
Without the rule, llvm now generates:
movzbl %sil, %ecx
movzbl %dil, %eax
incl %eax
subl %ecx, %eax
ret
It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).
The transformation was done as part of Eli's r75531. He has given the ok to
remove it.
rdar://11748024
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159230 91177308-0d34-0410-b5e6-96231b3b80d8
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite. What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size". It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly. Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159202 91177308-0d34-0410-b5e6-96231b3b80d8
- simplifycfg: invoke undef/null -> unreachable
- instcombine: invoke new -> invoke expect(0, 0) (an arbitrary NOOP intrinsic; only done if the allocated memory is unused, of course)
- verifier: allow invoke of intrinsics (to make the previous step work)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159146 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes PR5997.
These transforms were disabled because codegen couldn't deal with other
uses of trunc(x). This is now handled by the peephole pass.
This causes no regressions on x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159003 91177308-0d34-0410-b5e6-96231b3b80d8
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
- provide an API to compute the size and offset of an object pointed by
Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.
Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158919 91177308-0d34-0410-b5e6-96231b3b80d8
With this change, we avoid relying on the IR Builder to constant fold the operations.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158829 91177308-0d34-0410-b5e6-96231b3b80d8
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.
stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us
where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158297 91177308-0d34-0410-b5e6-96231b3b80d8
-%a + 42
into
42 - %a
previously we were emitting:
-(%a + 42)
This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158237 91177308-0d34-0410-b5e6-96231b3b80d8
The test case feeds the following into InstCombine's visitSelect:
%tobool8 = icmp ne i32 0, 0
%phitmp = select i1 %tobool8, i32 3, i32 0
Then instcombine replaces the right side of the switch with 0, doesn't notice
that nothing changes and tries again indefinitely.
This fixes PR12897.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157587 91177308-0d34-0410-b5e6-96231b3b80d8
move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h
(a draft of this) patch reviewed by Andrew, thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157261 91177308-0d34-0410-b5e6-96231b3b80d8
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156585 91177308-0d34-0410-b5e6-96231b3b80d8
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156515 91177308-0d34-0410-b5e6-96231b3b80d8
<rdar://problem/11291436>.
This is a second attempt at a fix for this, the first was r155468. Thanks
to Chandler, Bob and others for the feedback that helped me improve this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155866 91177308-0d34-0410-b5e6-96231b3b80d8
Original commit message:
Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.
Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.
These transformations are deferred:
(X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses)
(X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1)
(X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2)
The corresponding exact transformations are preserved, just like
div-exact + mul:
(X >>?,exact C) << C --> X
(X >>?,exact C1) << C2 --> X << (C2-C1)
(X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)
The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155362 91177308-0d34-0410-b5e6-96231b3b80d8
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.
I'll put the patch back in after fixing the SelectionDAG problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155181 91177308-0d34-0410-b5e6-96231b3b80d8
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.
Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.
These transformations are deferred:
(X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses)
(X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1)
(X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2)
The corresponding exact transformations are preserved, just like
div-exact + mul:
(X >>?,exact C) << C --> X
(X >>?,exact C1) << C2 --> X << (C2-C1)
(X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)
The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155136 91177308-0d34-0410-b5e6-96231b3b80d8
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154285 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154011 91177308-0d34-0410-b5e6-96231b3b80d8
alignment. If that's the case, then we want to make sure that we don't increase
the alignment of the store instruction. Because if we increase it to be "more
aligned" than the pointer, code-gen may use instructions which require a greater
alignment than the pointer guarantees.
<rdar://problem/11043589>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152907 91177308-0d34-0410-b5e6-96231b3b80d8
Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default.
Added some notes relative to case iterators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152532 91177308-0d34-0410-b5e6-96231b3b80d8
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html
Implemented CaseIterator and it solves almost all described issues: we don't need to mix operand/case/successor indexing anymore. Base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".
ConstCaseIt is just a read-only iterator.
CaseIt is read-write iterator; it allows to change case successor and case value.
Usage of iterator allows totally remove resolveXXXX methods. All indexing convertions done automatically inside the iterator's getters.
Main way of iterator usage looks like this:
SwitchInst *SI = ... // intialize it somehow
for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
BasicBlock *BB = i.getCaseSuccessor();
ConstantInt *V = i.getCaseValue();
// Do something.
}
If you want to convert case number to TerminatorInst successor index, just use getSuccessorIndex iterator's method.
If you want initialize iterator from TerminatorInst successor index, use CaseIt::fromSuccessorIndex(...) method.
There are also related changes in llvm-clients: klee and clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152297 91177308-0d34-0410-b5e6-96231b3b80d8
This transformation is not safe in some pathological cases (signed icmp of pointers should be an
extremely rare thing, but it's valid IR!). Add an explanatory comment.
Kudos to Duncan for pointing out this edge case (and not giving up explaining it until I finally got it).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151055 91177308-0d34-0410-b5e6-96231b3b80d8
- Ignore pointer casts.
- Also expand GEPs that aren't constantexprs when they have one use or only constant indices.
- We now compile "&foo[i] - &foo[j]" into "i - j".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150961 91177308-0d34-0410-b5e6-96231b3b80d8