This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.
I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.
I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.
Thanks to Bill and Eric for giving the green light for this bit of cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159421 91177308-0d34-0410-b5e6-96231b3b80d8
The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156258 91177308-0d34-0410-b5e6-96231b3b80d8
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:
ucomisd (%rdi), %xmm0
cmovbel %edx, %esi
cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.
This patch uses a dumb conservative heuristic, it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-Order architectures are
unlikely to benefit as well, those are included in the
"predictableSelectIsExpensive" flag.
It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.
Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.
Thanks to Chandler for the initial test case to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156234 91177308-0d34-0410-b5e6-96231b3b80d8
bit simpler by handling a common case explicitly.
Also, refactor the implementation to use a worklist based walk of the
recursive users, rather than trying to use value handles to detect and
recover from RAUWs during the recursive descent. This fixes a very
subtle bug in the previous implementation where degenerate control flow
structures could cause mutually recursive instructions (PHI nodes) to
collapse in just such a way that From became equal to To after some
amount of recursion. At that point, we hit the inf-loop that the assert
at the top attempted to guard against. This problem is defined away when
not using value handles in this manner. There are lots of comments
claiming that the WeakVH will protect against just this sort of error,
but they're not accurate about the actual implementation of WeakVHs,
which do still track RAUWs.
I don't have any test case for the bug this fixes because it requires
running the recursive simplification on unreachable phi nodes. I've no
way to either run this or easily write an input that triggers it. It was
found when using instruction simplification inside the inliner when
running over the nightly test-suite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153393 91177308-0d34-0410-b5e6-96231b3b80d8
Some BBs can become dead after codegen preparation. If we delete them here, it
could help enable tail-call optimizations later on.
<rdar://problem/10256573>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152002 91177308-0d34-0410-b5e6-96231b3b80d8
Problem: LLVM needs more function attributes than currently available (32 bits).
One such proposed attribute is "address_safety", which shows that a function is being checked for address safety (by AddressSanitizer, SAFECode, etc).
Solution:
- extend the Attributes from 32 bits to 64-bits
- wrap the object into a class so that unsigned is never erroneously used instead
- change "unsigned" to "Attributes" throughout the code, including one place in clang.
- the class has no "operator uint64 ()", but it has "uint64_t Raw() " to support packing/unpacking.
- the class has "safe operator bool()" to support the common idiom: if (Attributes attr = getAttrs()) useAttrs(attr);
- The CTOR from uint64_t is marked explicit, so I had to add a few explicit CTOR calls
- Add the new attribute "address_safety". Doing it in the same commit to check that attributes beyond first 32 bits actually work.
- Some of the functions from the Attribute namespace are worth moving inside the class, but I'd prefer to have it as a separate commit.
Tested:
"make check" on Linux (32-bit and 64-bit) and Mac (10.6)
built/run spec CPU 2006 on Linux with clang -O2.
This change will break clang build in lib/CodeGen/CGCall.cpp.
The following patch will fix it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148553 91177308-0d34-0410-b5e6-96231b3b80d8
This looks like it flagged an actual bug. Devang, please review. I added
the parentheses that change behavior, but make the behavior more closely
match commit log's intent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132165 91177308-0d34-0410-b5e6-96231b3b80d8
I also changed -simplifycfg, -jump-threading and -codegenprepare to use this to produce slightly better code without any extra cleanup passes (AFAICT this was the only place in -simplifycfg where now-dead conditions of replaced terminators weren't being cleaned up). The only other user of this function is -sccp, but I didn't read that thoroughly enough to figure out whether it might be holding pointers to instructions that could be deleted by this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131855 91177308-0d34-0410-b5e6-96231b3b80d8
Original log entry:
Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131536 91177308-0d34-0410-b5e6-96231b3b80d8
delete the instruction pointed to by CGP's current instruction
iterator, leading to a crash on the testcase. This fixes PR9578.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129200 91177308-0d34-0410-b5e6-96231b3b80d8
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
switch(x) {
case 1: return f1();
case 2: return f2();
case 3: return f3();
case 4: return f4();
case 5: return f5();
case 6: return f6();
}
}
=>
LBB0_2: ## %sw.bb
callq _f1
popq %rbp
ret
LBB0_3: ## %sw.bb1
callq _f2
popq %rbp
ret
LBB0_4: ## %sw.bb3
callq _f3
popq %rbp
ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
br label %return
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
br label %return
return:
%retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
ret i32 %retval.0
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
jmp _f3 ## TAILCALL
rdar://9147433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize trivial branches in CodeGenPrepare, which often get created from the
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.
This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127498 91177308-0d34-0410-b5e6-96231b3b80d8
lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
on llc not optimizing trivial branches, so I had to add an option to allow them
to continue to test what they originally tested.
This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127459 91177308-0d34-0410-b5e6-96231b3b80d8
the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to
1.8% of total llc time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127069 91177308-0d34-0410-b5e6-96231b3b80d8
addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces
total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function
in llc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126782 91177308-0d34-0410-b5e6-96231b3b80d8
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch dead computation feeding
conditional branches isn't such a good idea. Just fold branches on constants
into unconditional branches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123526 91177308-0d34-0410-b5e6-96231b3b80d8
have objectsize folding recursively simplify away their result when it
folds. It is important to catch this here, because otherwise we won't
eliminate the cross-block values at isel and other times.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123524 91177308-0d34-0410-b5e6-96231b3b80d8
potentially invalidate it (like inline asm lowering) to be sunk into
their proper place, cleaning up a ton of code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123523 91177308-0d34-0410-b5e6-96231b3b80d8
they all ready do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.
The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123064 91177308-0d34-0410-b5e6-96231b3b80d8
into a separate function, so that it can be called from a loop using a worklist
rather than a loop traversing a whole basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122943 91177308-0d34-0410-b5e6-96231b3b80d8
capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of
CodeGenPrepare time (which itself is 10% of time spent in the backend).
This is progress towards PR8889.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122741 91177308-0d34-0410-b5e6-96231b3b80d8
pipeline to be caught by instcombine, and it's not feasible to catch them in SimplifyCFG because the
use-lists are in an inconsistent state at the point where it could know that it need to simplify them.
Instead, have CodeGenPrepare look for trivially redundant PHIs as part of its general cleanup effort.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122516 91177308-0d34-0410-b5e6-96231b3b80d8
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evalutated
on new paths if the input edges are critical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122164 91177308-0d34-0410-b5e6-96231b3b80d8
by my recent GVN improvement. Looking through a single layer of
PHI nodes when attempting to sink GEPs, we need to iteratively
look through arbitrary PHI nests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120202 91177308-0d34-0410-b5e6-96231b3b80d8
if all the operands of the PHI are equivalent. This allows CodeGenPrepare to undo
unprofitable PRE transforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119853 91177308-0d34-0410-b5e6-96231b3b80d8
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116820 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost.
It seems there is a downstream bug that is exposed by
-cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back
in.
Note that the changes to tailcallfp2.ll are not reverted. They were good are
required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114859 91177308-0d34-0410-b5e6-96231b3b80d8
truncates are free only in the case where the extended type is legal but the
load type is not. If both types are illegal, such as when they are too big,
the load may not be legalized into an extended load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114568 91177308-0d34-0410-b5e6-96231b3b80d8
load when the type of the load is not legal, even if truncates are not free.
The load is going to be legalized to an extending load anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114488 91177308-0d34-0410-b5e6-96231b3b80d8
walking the asm arguments once and stashing their Values. This is
wrong because the same memory location can be in the list twice, and
if the first one has a sunkaddr substituted, the stashed value for the
second one will be wrong (use-after-free). PR 8154.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114104 91177308-0d34-0410-b5e6-96231b3b80d8
for an "i" constraint should get lowered; PR 6309. While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106893 91177308-0d34-0410-b5e6-96231b3b80d8
Probably the best way to know that all getOperand() calls have been handled
is to replace that API instead of updating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101579 91177308-0d34-0410-b5e6-96231b3b80d8
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101465 91177308-0d34-0410-b5e6-96231b3b80d8
generate wrong code pretty much anywhere AFAICT.
A case that hits the bug reproducibly is impossible,
but the situation was like this:
Addr = ...
Store -> Addr
Addr2 = GEP , 0, 0
Store -> Addr2
Handling the first store, the code changed replaced Addr
with a sunkaddr and deleted Addr, but not its table
entry. Code in OptimizedBlock replaced Addr2 with a
bitcast; if that happened to reuse the memory of Addr,
the old table entry was erroneously found when handling
the second store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100044 91177308-0d34-0410-b5e6-96231b3b80d8
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96344 91177308-0d34-0410-b5e6-96231b3b80d8