This reverts commit 963cdbccf6e5578822836fd9b2ebece0ba9a60b7 (ie r236514)
This is to get the bots green while i investigate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236518 91177308-0d34-0410-b5e6-96231b3b80d8
The code was basically the same here already. Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs.
Will be used in the next commit to also handle regmasks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236514 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r236360.
This change exposed a bug in WinEHPrepare by opting win32 code into EH
preparation. We already knew that WinEHPrepare has bugs, and is the
status quo for x64, so I don't think that's a reason to hold off on this
change. I disabled exceptions in the sanitizer tests in r236505 and an
earlier revision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236508 91177308-0d34-0410-b5e6-96231b3b80d8
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.
As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.
** Context **
Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.
** Motivating example **
Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b) {
%tmp = alloca i32, align 4
%tmp2 = icmp slt i32 %a, %b
br i1 %tmp2, label %true, label %false
true:
store i32 %a, i32* %tmp, align 4
%tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
br label %false
false:
%tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
ret i32 %tmp.0
}
On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f: ; @f
; BB#0:
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16 ; =16
cmp w0, w1
b.ge LBB0_2
; BB#1: ; %true
stur w0, [x29, #-4]
sub x1, x29, #4 ; =4
mov w0, wzr
bl _doSomething
LBB0_2: ; %false
mov sp, x29
ldp x29, x30, [sp], #16
ret
With shrink-wrapping we could generate:
_f: ; @f
; BB#0:
cmp w0, w1
b.ge LBB0_2
; BB#1: ; %true
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16 ; =16
stur w0, [x29, #-4]
sub x1, x29, #4 ; =4
mov w0, wzr
bl _doSomething
add sp, x29, #16 ; =16
ldp x29, x30, [sp], #16
LBB0_2: ; %false
ret
Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.
** Proposed Solution **
This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.
Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.
The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.
Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.
** Design Decisions **
1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.
Differential Revision: http://reviews.llvm.org/D9210
<rdar://problem/3201744>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236507 91177308-0d34-0410-b5e6-96231b3b80d8
and avoid cloning unused decls into every partition.
Module partitioning showed up as a source of significant overhead when I
profiled some trivial test cases. Avoiding the overhead of partitionging
for uncalled functions helps to mitigate this.
This change also means that it is no longer necessary to have a
LazyEmittingLayer underneath the CompileOnDemand layer, since the
CompileOnDemandLayer will not extract or emit function bodies until they are
called.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236465 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds the --load-address command line option to
llvm-pdbdump, which dumps all addresses assuming the module has
loaded at the specified address.
Additionally, this patch adds an option to llvm-pdbdump to support
dumping of public symbols (i.e. symbols with external linkage).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236342 91177308-0d34-0410-b5e6-96231b3b80d8
This pass is responsible for constructing the EH registration object
that gets linked into fs:00, which is all it does in this change. In the
future, it will also insert stores to update the EH state number.
I considered keeping this functionality in WinEHPrepare, but it's pretty
separable and X86 specific. It has conceptually very little to do with
the task of WinEHPrepare, which is currently outlining. WinEHPrepare is
also in theory useful on ARM, but this logic is pretty x86 specific.
Reviewers: andrew.w.kaylor, majnemer
Differential Revision: http://reviews.llvm.org/D9422
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236339 91177308-0d34-0410-b5e6-96231b3b80d8
Patch from dexonsmith. The call to toInt() was calling compareTo() which
in some cases would call back to toInt(), creating an infinite loop.
Fixed by simplifying the logic in compareTo() to avoid the co-recursion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236326 91177308-0d34-0410-b5e6-96231b3b80d8
Found by -Wpessimizing-move, no functional change. The APFloat and
PassManager change doesn't affect codegen as returning a by-value
argument will always result in a move.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236316 91177308-0d34-0410-b5e6-96231b3b80d8
This change is the second of 3 patches to add support for specifying
the profile output from the command line via -fprofile-instr-generate=<path>,
where the specified output path/file will be overridden by the
LLVM_PROFILE_FILE environment variable.
This patch adds the necessary support to the llvm instrumenter, specifically
a new member of GCOVOptions for clang to save the specified filename, and
support for calling the new compiler-rt interface from __llvm_profile_init.
Patch by Teresa Johnson. Thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236288 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The private constructor for ScaledNumber was using uint64_t instead of
DigitsT. This was preventing instantiations of ScaledNumber with
anything other than uint64_t types.
In implementing the tests, I ran into another issue. Operators >>= and
<<= did not have variants for accepting other ScaledNumber as the shift
argument. This is expected by the SCALED_NUMBER_BOP.
It makes no sense to allow shifting a ScaledNumber by another
ScaledNumber, so the patch includes two new templates for shifting
ScaledNumbers.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9350
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236232 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This pathc add convenient overloads for CreateInsertElement and CreateExtractElement methods in IRBuilder
where vector index can be uint64_t instead of Value*.
Test Plan: Unit test included.
Reviewers: majnemer
Reviewed By: majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9347
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236214 91177308-0d34-0410-b5e6-96231b3b80d8
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.
This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236160 91177308-0d34-0410-b5e6-96231b3b80d8
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`. The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.
Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one. It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs. YMMV of
course.
Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py. I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three. It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).
Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900).
In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment,
we should eventually share this data between the IR passes and the backend.
We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to
use the new struct.
The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a
data member to the subclass. This means we don't have to repeat all of the get/set methods per flag,
but we're potentially adding size to all nodes of this subclassi type.
In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an
SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at
least one free byte to play with due to struct alignment.
Differential Revision: http://reviews.llvm.org/D9325
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235997 91177308-0d34-0410-b5e6-96231b3b80d8
[DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235989 91177308-0d34-0410-b5e6-96231b3b80d8
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235977 91177308-0d34-0410-b5e6-96231b3b80d8