The normal dataflow sequence in the ARC optimizer consists of the following
states:
Retain -> CanRelease -> Use -> Release
The optimizer before this patch stored the uses that determine the lifetime of
the retainable object pointer when it bottom up hits a retain or when top down
it hits a release. This is correct for an imprecise lifetime scenario since what
we are trying to do is remove retains/releases while making sure that no
``CanRelease'' (which is usually a call) deallocates the given pointer before we
get to the ``Use'' (since that would cause a segfault).
If we are considering the precise lifetime scenario though, this is not
correct. In such a situation, we *DO* care about the previous sequence, but
additionally, we wish to track the uses resulting from the following incomplete
sequences:
Retain -> CanRelease -> Release (TopDown)
Retain <- Use <- Release (BottomUp)
*NOTE* This patch looks large but the most of it consists of updating
test cases. Additionally this fix exposed an additional bug. I removed
the test case that expressed said bug and will recommit it with the fix
in a little bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178921 91177308-0d34-0410-b5e6-96231b3b80d8
This optimization is unstable at this moment; it
1) block us on a very important application
2) PR15200
3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
(the CHECK command compare the output against wrong result)
I personally believe this optimization should not have any impact on the
autovectorized code, as auto-vectorizer is supposed to put gather/scatter
in a "right" way. Although in theory downstream optimizaters might reveal
some gather/scatter optimization opportunities, the chance is quite slim.
For the hand-crafted vectorizing code, in term of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result,
but in much simpler way. On the other hand, these optimizers are able to
improve the code in a incremental way; in contrast, SROA is sort of all-or-none
approach. However, SROA might slighly win in stack size, as it tries to figure
out a stretch of memory tightenly cover the area accessed by the dynamic index.
rdar://13174884
PR15200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178912 91177308-0d34-0410-b5e6-96231b3b80d8
Pass down the fact that an operand is going to be a vector of constants.
This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86
back. It had degraded to scalar performance due to my pervious shift cost change
that made all shifts expensive on x86.
radar://13576547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178809 91177308-0d34-0410-b5e6-96231b3b80d8
OpndPtrs stored pointers into the Opnd vector that became invalid when the
vector grows. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178793 91177308-0d34-0410-b5e6-96231b3b80d8
Cleaned up trailing whitespace and added extra slashes in front of a
function level comment so that it follow the convention of having 3
slashes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178712 91177308-0d34-0410-b5e6-96231b3b80d8
The semantics of ARC implies that a pointer passed into an objc_autorelease
must live until some point (potentially down the stack) where an
autorelease pool is popped. On the other hand, an
objc_autoreleaseReturnValue just signifies that the object must live
until the end of the given function at least.
Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in
terms of the semantics of ARC* implying that performing the given
strength reduction without any knowledge of how this relates to
the autorelease pool pop that is further up the stack violates the
semantics of ARC.
*Even though objc_autoreleaseReturnValue if you know that no RV
optimization will occur is more computationally expensive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178612 91177308-0d34-0410-b5e6-96231b3b80d8
The iterator could be invalidated when it's recursively deleting a whole bunch
of constant expressions in a constant initializer.
Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was
run on a `.ll' file, it wouldn't crash. This is why the test first pushes the
`.ll' file through `llvm-as' before feeding it to `opt'.
PR15440
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178531 91177308-0d34-0410-b5e6-96231b3b80d8
clang.arc.used is an interesting call for ARC since ObjCARCContract
needs to run to remove said intrinsic to avoid a linker error (since the
call does not exist).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178369 91177308-0d34-0410-b5e6-96231b3b80d8
Since we handle optimizable objc_retainBlocks through strength reduction
in OptimizableIndividualCalls, we know that all code after that point
will only see non-optimizable objc_retainBlock calls. IsForwarding is
only called by functions after that point, so it is ok to just classify
objc_retainBlock as non-forwarding.
<rdar://problem/13249661>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178285 91177308-0d34-0410-b5e6-96231b3b80d8
If an objc_retainBlock has the copy_on_escape metadata attached to it
AND if the block pointer argument only escapes down the stack, we are
allowed to strength reduce the objc_retainBlock to to an objc_retain and
thus optimize it.
Current there is logic in the ARC data flow analysis to handle
this case which is complicated and involved making distinctions in
between objc_retainBlock and objc_retain in certain places and
considering them the same in others.
This patch simplifies said code by:
1. Performing the strength reduction in the initial ARC peephole
analysis (ObjCARCOpts::OptimizeIndividualCalls).
2. Changes the ARC dataflow analysis (which runs after the peephole
analysis) to consider all objc_retainBlock calls to not be optimizable
(since if the call was optimizable, we would have strength reduced it
already).
This patch leaves in the infrastructure in the ARC dataflow analysis to
handle this case, which due to 2 will just be dead code. I am doing this
on purpose to separate the removal of the old code from the testing of
the new code.
<rdar://problem/13249661>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178284 91177308-0d34-0410-b5e6-96231b3b80d8
If we compile a single source program, the `.gcda' file will be generated where
the program was executed. This isn't desirable, because that place may be at an
unpredictable place (the program could call `chdir' for instance).
Instead, we will output the `.gcda' file in the same place we output the `.gcno'
file. I.e., the directory where the executable was generated. This matches GCC's
behavior.
<rdar://problem/13061072> & PR11809
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178084 91177308-0d34-0410-b5e6-96231b3b80d8
The OptimizeIntToFloatBitCast converts shift-truncate sequences
into extractelement operations. The computation of the element
index to be used in the resulting operation is currently only
correct for little-endian targets.
This commit fixes the element index computation to be correct
for big-endian targets as well. If the target byte order is
unknown, the optimization cannot be performed at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178031 91177308-0d34-0410-b5e6-96231b3b80d8
This will allow for verification and analysis of the merge function of
the data flow analyses in the ARC optimizer.
The actual implementation of this feature is by introducing calls to
the functions llvm.arc.annotation.{bottomup,topdown}.{bbstart,bbend}
which are only declared. Each such call takes in a pointer to a global
with the same name as the pointer whose provenance is being tracked and
a pointer whose name is one of our Sequence states and points to a
string that contains the same name.
To ensure that the optimizer does not consider these annotations in any
way, I made it so that the annotations are considered to be of IC_None
type.
A test case is included for this commit and the previous
ObjCARCAnnotation commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177952 91177308-0d34-0410-b5e6-96231b3b80d8
Previously the inner works of the data flow analysis in ObjCARCOpts was hard to
get out of the optimizer for analysis of bugs or testing. All of the current ARC
unit tests are based off of testing the effect of the data flow
analysis (i.e. what statements are removed or moved, etc.). This creates
weakness in the current unit testing regimem since we are not actually testing
what effects various instructions have on the modeled pointer state.
Additionally in order to analyze a bug in the optimizer, one would need to track
by hand what the optimizer was actually doing either through use of DEBUG
statements or through the usage of a debugger, both yielding large loses in
developer productivity.
This patch deals with these two issues by providing ARC annotation
metadata that annotates instructions with the state changes that they cause in
various pointers as well as provides metadata to annotate provenance sources.
Specifically, we introduce the following metadata types:
1. llvm.arc.annotation.bottomup.
2. llvm.arc.annotation.topdown.
3. llvm.arc.annotation.provenancesource.
llvm.arc.annotation.{bottomup,topdown}: These annotations describes a state
change in a pointer when we are visiting instructions bottomup/topdown
respectively. The output format for both is the same:
!1 = metadata !{metadata !"(test,%x)", metadata !"S_Release", metadata !"S_Use"}
The first element is a string tuple with the following format:
(function,variable name)
The second two elements of the metadata show the previous state of the
pointer (in this case S_Release) and the new state of the pointer (S_Use). We
write the metadata in such a manner to ensure that it is easy for outside tools
to parse. This is important since I am currently working on a tool for taking
this information and pretty printing it besides the IR and that can be used for
LIT style testing via the generation of an index.
llvm.arc.annotation.provenancesource: This metadata is used to annotate
instructions which act as provenance sources, i.e. ones that introduce a
new (from the optimizer's perspective) non-argument pointer to track. This
enables cross-referencing in between provenance sources and the state changes
that occur to them.
This is still a work in progress. Additionally I plan on committing
later today additions to the annotations that annotate at the top/bottom
of basic blocks the state of the various pointers being tracked.
*NOTE* The metadata support is conditionally compiled into libObjCARCOpts only
when we are producing a debug build of llvm/clang and even so are
disabled by default. To enable the annotation metadata, pass in
-enable-objc-arc-annotations to opt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177951 91177308-0d34-0410-b5e6-96231b3b80d8
The problem is that the code mistakenly took for granted that following constructor
is able to create an APFloat from a *SIGNED* integer:
APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value)
rdar://13486998
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177906 91177308-0d34-0410-b5e6-96231b3b80d8
This simplification happens at 2 places :
- using the nsw attribute when the shl / mul is used by a sign test
- when the shl / mul is compared for (in)equality to zero
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177856 91177308-0d34-0410-b5e6-96231b3b80d8
Now said method matches namewise every other method which refers to
the member KnownPositiveRefCount of the class PtrState.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177816 91177308-0d34-0410-b5e6-96231b3b80d8
Before: the function name was stored by the compiler as a constant string
and the run-time was printing it.
Now: the PC is stored instead and the run-time prints the full symbolized frame.
This adds a couple of instructions into every function with non-empty stack frame,
but also reduces the binary size because we store less strings (I saw 2% size reduction).
This change bumps the asan ABI version to v3.
llvm part.
Example of report (now):
==31711==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffa77cf1c5 at pc 0x41feb0 bp 0x7fffa77cefb0 sp 0x7fffa77cefa8
READ of size 1 at 0x7fffa77cf1c5 thread T0
#0 0x41feaf in Frame0(int, char*, char*, char*) stack-oob-frames.cc:20
#1 0x41f7ff in Frame1(int, char*, char*) stack-oob-frames.cc:24
#2 0x41f477 in Frame2(int, char*) stack-oob-frames.cc:28
#3 0x41f194 in Frame3(int) stack-oob-frames.cc:32
#4 0x41eee0 in main stack-oob-frames.cc:38
#5 0x7f0c5566f76c (/lib/x86_64-linux-gnu/libc.so.6+0x2176c)
#6 0x41eb1c (/usr/local/google/kcc/llvm_cmake/a.out+0x41eb1c)
Address 0x7fffa77cf1c5 is located in stack of thread T0 at offset 293 in frame
#0 0x41f87f in Frame0(int, char*, char*, char*) stack-oob-frames.cc:12 <<<<<<<<<<<<<< this is new
This frame has 6 object(s):
[32, 36) 'frame.addr'
[96, 104) 'a.addr'
[160, 168) 'b.addr'
[224, 232) 'c.addr'
[288, 292) 's'
[352, 360) 'd'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177724 91177308-0d34-0410-b5e6-96231b3b80d8
The original code used i32, and i64 if legal. This introduced unneeded
casts when they aren't legal, or when the index variable i has another
type. In order of preference: try to use i's type; use the smallest
fitting legal type (using an added DataLayout method); default to i32.
A testcase checks that this works when the index gep operand is i16.
Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com>
Reviewed by : Duncan
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177712 91177308-0d34-0410-b5e6-96231b3b80d8
How did this ever work?
Basically, if you have a function that's inlined into the caller, it may not
have any 'call' instructions, but any 'resume' instructions it may have should
still be forwarded to the outer (caller's) landing pad. This requires that all
of the 'landingpad' instructions in the callee have their clauses merged with
the caller's outer 'landingpad' instruction (hence the bit of ugly code in the
`forwardResume' method).
Testcase in a follow commit to the test-suite repository.
<rdar://problem/13360379> & PR15555
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177680 91177308-0d34-0410-b5e6-96231b3b80d8
The key part of this is ensuring that name prefixes remain in a Twine
form until we get to a point where we can nuke them under NDEBUG. This
is tricky using the old APIs as they played fast and loose with Twine,
which is prone to serious error. The inserter is much cleaner as it is
actually in the call stack leading to the setName call, and so has
a good opportunity to prepend the prefix.
This matters more than you might imagine because most runs over an
alloca find a single partition, and rewrite 3 or 4 instructions
referring to it. As a consequence doing this lazily and exclusively with
Twine allows the optimizer to delete more of it and shaves another 2% to
3% off of the release build's SROA run time for PR15412. I also think
the APIs are cleaner, and the use of Twine is more reliable, so
I consider it a win-win despite the churn required to reach this state.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177631 91177308-0d34-0410-b5e6-96231b3b80d8
The 'Modified' variable should have been removed from SimplifyLibCalls
in r177619, but was missed. This commit removes it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177622 91177308-0d34-0410-b5e6-96231b3b80d8