12543 Commits

Author SHA1 Message Date
Craig Topper
f781855732 [X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229641 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 06:24:49 +00:00
Craig Topper
ed42dcef75 [X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 06:24:44 +00:00
Adam Nemet
5934863461 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229634 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:44:33 +00:00
Adam Nemet
c548c640bc [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:44:25 +00:00
Adam Nemet
69c9697fa7 [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229631 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:44:20 +00:00
Adam Nemet
ce5c5c0301 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229628 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:43:37 +00:00
Adam Nemet
718b023033 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229626 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:43:24 +00:00
Adam Nemet
0d1e8dd84e [LoopAccesses] Make blockNeedsPredication static
blockNeedsPredication is in LoopAccess in order to share it with the
vectorizer.  It's a utility needed by LoopAccess not strictly provided
by it but it's a good place to share it.  This makes the function static
so that it no longer required to create an LoopAccessInfo instance in
order to access it from LV.

This was actually causing problems because it would have required
creating LAI much earlier that LV::canVectorizeMemory().

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229625 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:43:19 +00:00
Adam Nemet
14cc2e25c5 [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229624 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:42:57 +00:00
Adam Nemet
8b0647f26b [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229623 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:42:50 +00:00
Adam Nemet
eefec589e8 [LoopAccesses] Make VectorizerParams global
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229622 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:42:43 +00:00
Adam Nemet
38a9ebb065 [LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo
LoopAccessAnalysis will be used as the name of the pass.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229621 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:42:35 +00:00
Akira Hatanaka
5506e22865 [InstCombine] Do not insert a GEP instruction before a landingpad instruction.
InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the
insertion point, which caused the verifier to complain when a GEP was inserted
before a landingpad instruction. This commit fixes it to use getFirstInsertionPt
instead.

rdar://problem/19394964


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:30:11 +00:00
Hal Finkel
8a85dee989 [BDCE] Don't forget uses of root instructions seen before the instruction itself
When visiting the initial list of "root" instructions (those which must always
be alive), for those that are integer-valued (such as invokes returning an
integer), we mark their bits as (initially) all dead (we might, obviously, find
uses of those bits later, but all bits are assumed dead until proven
otherwise). Don't do so, however, if we're already seen a use of those bits by
another root instruction (such as a store).

Fixes a miscompile of the sanitizer unit tests on x86_64.

Also, add a debug line for visiting the root instructions, and remove a debug
line which tried to print instructions being removed (printing dead
instructions is dangerous, and can sometimes crash).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229618 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 03:12:28 +00:00
Benjamin Kramer
1a50a12b43 Prefer SmallVector::append/insert over push_back loops.
Same functionality, but hoists the vector growth out of the loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-17 15:29:18 +00:00
Elena Demikhovsky
b70bdd9034 Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-17 13:10:05 +00:00
Hal Finkel
5b43c8551e [BDCE] Add a bit-tracking DCE pass
BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the
"aggressive DCE" pass), with the added capability to track dead bits of integer
valued instructions and remove those instructions when all of the bits are
dead.

Currently, it does not actually do this all-bits-dead removal, but rather
replaces the instruction's uses with a constant zero, and lets instcombine (and
the later run of ADCE) do the rest. Because we essentially get a run of ADCE
"for free" while tracking the dead bits, we also do what ADCE does and removes
actually-dead instructions as well (this includes instructions newly trivially
dead because all bits were dead, but not all such instructions can be removed).

The motivation for this is a case like:

int __attribute__((const)) foo(int i);
int bar(int x) {
  x |= (4 & foo(5));
  x |= (8 & foo(3));
  x |= (16 & foo(2));
  x |= (32 & foo(1));
  x |= (64 & foo(0));
  x |= (128& foo(4));
  return x >> 4;
}

As it turns out, if you order the bit-field insertions so that all of the dead
ones come last, then instcombine will remove them. However, if you pick some
other order (such as the one above), the fact that some of the calls to foo()
are useless is not locally obvious, and we don't remove them (without this
pass).

I did a quick compile-time overhead check using sqlite from the test suite
(Release+Asserts). BDCE took ~0.4% of the compilation time (making it about
twice as expensive as ADCE).

I've not looked at why yet, but we eliminate instructions due to having
all-dead bits in:
External/SPEC/CFP2006/447.dealII/447.dealII
External/SPEC/CINT2006/400.perlbench/400.perlbench
External/SPEC/CINT2006/403.gcc/403.gcc
MultiSource/Applications/ClamAV/clamscan
MultiSource/Benchmarks/7zip/7zip-benchmark

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229462 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-17 01:36:59 +00:00
Mehdi Amini
e97c675022 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.

From: Fiona Glaser <fglaser@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229437 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 21:47:54 +00:00
James Molloy
25a35b5639 Run LICM as part of the cleanup phase from the scalar optimizer.
Things like LoopUnrolling can produce loop invariant values - make sure
we pick them up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229419 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 18:59:54 +00:00
Hal Finkel
793a52967b [ADCE] Don't indent inside an anonymous namespace
To be consistent with what clang-format does, don't add extra indentation
inside an anonymous namespace. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229412 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 18:08:00 +00:00
James Molloy
2a7fbb1927 [LoopReroll] Relax some assumptions a little.
We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229406 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 17:02:00 +00:00
James Molloy
4b739069e4 [LoopReroll] Don't crash on dead code
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229405 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 17:01:52 +00:00
Evgeniy Stepanov
989552ca7f [asan] Reuse a common function.
Do not reimplement RoundUpToAlignment.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229397 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 14:49:37 +00:00
Aaron Ballman
66981fe208 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229340 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 22:54:22 +00:00
Hal Finkel
962ebd4f23 [ADCE] Convert another loop for a range-based for
We can use a range-based for for the operands loop too; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229319 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 15:51:25 +00:00
Hal Finkel
6f0a9df3e3 [ADCE] Use inst_range and range-based fors
Convert a few loops to range-based fors; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229318 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 15:51:23 +00:00
Hal Finkel
4090dfd7ba [ADCE] Fix formatting of pointer types
We prefer to put the * with the variable, not with the type; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229317 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 15:47:52 +00:00
Hal Finkel
13f4dc0217 [ADCE] Fix capitalization of another local variable
Bring another local variable in compliance with our naming conventions, NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229316 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 15:45:30 +00:00
Hal Finkel
07e2323e71 [ADCE] Fix capitalization of some local variables
Bring some local variables in compliance with our naming conventions, NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229315 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 15:45:28 +00:00
Elena Demikhovsky
cc64d95696 Enabled cost calculation for masked memory operations.
We already have implementation for cost calculation for
masked memory operations. I just call it from the loop vectorizer.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229290 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 08:08:48 +00:00
Ramkumar Ramachandra
0608cec657 InstCombine: propagate deref via new addDereferenceableAttr
The "dereferenceable" attribute cannot be added via .addAttribute(),
since it also expects a size in bytes. AttrBuilder#addAttribute or
AttributeSet#addAttribute is wrapped by classes Function, InvokeInst,
and CallInst. Add corresponding wrappers to
AttrBuilder#addDereferenceableAttr.

Having done this, propagate the dereferenceable attribute via
gc.relocate, adding a test to exercise it. Note that -datalayout is
required during execution over and above -instcombine, because
InstCombine only optionally requires DataLayoutPass.

Differential Revision: http://reviews.llvm.org/D7510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229265 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 19:37:54 +00:00
Andrea Di Biagio
47cd120a18 [optnone] Skip pass Constant Hoisting on optnone functions.
Added test CodeGen/X86/constant-hoisting-optnone.ll to verify that
pass Constant Hoisting is not run on optnone functions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 15:11:48 +00:00
Duncan P. N. Exon Smith
7520a90c75 Transforms: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229202 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 01:11:29 +00:00
Philip Reames
d777c2c0c0 [InstCombine] When canonicalizing gep indices, prefer zext when possible
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead.  This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?).  We already apply a similar transform in DAGCombine, this just extends that to the IR level.

This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. 

Differential Revision: http://reviews.llvm.org/D7255



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229189 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 00:05:36 +00:00
Andrea Di Biagio
d25126faae [InstCombine] Fix regression introduced at r227197.
This patch fixes a problem I accidentally introduced in an instruction combine
on select instructions added at r227197. That revision taught the instruction
combiner how to fold a cttz/ctlz followed by a icmp plus select into a single
cttz/ctlz with flag 'is_zero_undef' cleared.

However, the new rule added at r227197 would have produced wrong results in the
case where a cttz/ctlz with flag 'is_zero_undef' cleared was follwed by a
zero-extend or truncate. In that case, the folded instruction would have
been inserted in a wrong location thus leaving the CFG in an inconsistent
state.

This patch fixes the problem and add two reproducible test cases to
existing test 'InstCombine/select-cmp-cttz-ctlz.ll'.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229124 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 16:33:34 +00:00
James Molloy
c76bff0715 [SimplifyCFG] Be more aggressive
Up the phi node folding threshold from a cheap "1" to a meagre "2".

Update tests for extra added selects and slight code churn.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229099 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 10:48:30 +00:00
Chandler Carruth
417c5c172c [PM] Remove the old 'PassManager.h' header file at the top level of
LLVM's include tree and the use of using declarations to hide the
'legacy' namespace for the old pass manager.

This undoes the primary modules-hostile change I made to keep
out-of-tree targets building. I sent an email inquiring about whether
this would be reasonable to do at this phase and people seemed fine with
it, so making it a reality. This should allow us to start bootstrapping
with modules to a certain extent along with making it easier to mix and
match headers in general.

The updates to any code for users of LLVM are very mechanical. Switch
from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
Qualify the types which now produce compile errors with "legacy::". The
most common ones are "PassManager", "PassManagerBase", and
"FunctionPassManager".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229094 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 10:01:29 +00:00
Chandler Carruth
25fa343bd8 [unroll] Concede defeat and disable the unroll analyzer for now.
The issues with the new unroll analyzer are more fundamental than code
cleanup, algorithm, or data structure changes. I've sent an email to the
original commit thread with details and a proposal for how to redesign
things. I'm disabling this for now so that we don't spend time
debugging issues with it in its current state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229064 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 05:31:46 +00:00
Michael Liao
4235574ce3 [InstCombine] Fix a bug when combining icmp from ptrtoint
- First, there's a crash when we try to combine that pointers into `icmp`
  directly by creating a `bitcast`, which is invalid if that two pointers are
  from different address spaces.

- It's not always appropriate to cast one pointer to another if they are from
  different address spaces as that is not no-op cast. Instead, we only combine
  `icmp` from `ptrtoint` if that two pointers are of the same address space.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229063 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:51:26 +00:00
Chandler Carruth
e95568c0a0 [unroll] Merge the simplification and DCE estimation methods on the
UnrollAnalyzer.

Now they share a single worklist and have less implicit state between
them. There was no real benefit to separating these two things out.

I'm going to subsequently refactor things to share even more code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229062 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:39:05 +00:00
Chandler Carruth
17cc3c80ee [unroll] Remove pointless dyn_cast<>s to Instruction - the users of an
instruction must by definition be instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229061 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:33:21 +00:00
Chandler Carruth
2d38576d56 [unroll] Don't check the loop set for whether an instruction is
contained in it each time we try to add it to the worklist, just check
this when pulling it off the worklist. That way we do it at most once
per instruction with the cost of the worklist set we would need to pay
anyways.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229060 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:30:44 +00:00
Chandler Carruth
b9cb5b19d9 [unroll] Change the other worklist in the unroll analyzer to be a set
vector.

In addition to dramatically reducing the work required for contrived
example loops, this also has to correct some serious latent bugs in the
cost computation. Previously, we might add an instruction onto the
worklist once for every load which it used and was simplified. Then we
would visit it many times and accumulate "savings" each time.

I mean, fortunately this couldn't matter for things like calls with 100s
of operands, but even for binary operators this code seems like it must
be double counting the savings.

I just noticed this by inspection and due to the runtime problems it can
introduce, I don't have any test cases for cases where the cost produced
by this routine is unacceptable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229059 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:27:50 +00:00
Chandler Carruth
361ac0df65 [unroll] Replace a boolean, for loop, condition, and break with
std::all_of and a lambda. Much cleaner, no functionality
changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229058 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:18:14 +00:00
Chandler Carruth
758256535c [unroll] Directly query for dead instructions.
In the unroll analyzer, it is checking each user to see if that user
will become dead. However, it first checked if that user was missing
from the simplified values map, and then if was also missing from the
dead instructions set. We add everything from the simplified values map
to the dead instructions set, so the first step is completely subsumed
by the second. Moreover, the first step requires *inserting* something
into the simplified value map which isn't what we want at all.

This also replaces a dyn_cast with a cast as an instruction cannot be
used by a non-instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229057 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:14:05 +00:00
Chandler Carruth
2640fd5bae [unroll] Replace a linear time check for no uses with a constant time
check.

Also hoist this into the enqueue process as it is faster even than
testing the worklist set, we should just directly filter these out much
like we filter out constants and such.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229056 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 04:06:08 +00:00
Chandler Carruth
6a12276573 [unroll] Rather than an operand set, use a setvector for the worklist.
We don't just want to handle duplicate operands within an instruction,
but also duplicates across operands of different instructions. I should
have gone straight to this, but I had convinced myself that it wasn't
going to be necessary briefly. I've come to my senses after chatting
more with Nick, and am now happier here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229054 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 03:57:40 +00:00
Chandler Carruth
29e00cf519 [unroll] Extract the code to enqueue operansd for the worklist in the
unroll analysis into a lambda and call it. That's much simpler than
duplicating all the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229053 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 03:49:41 +00:00
Chandler Carruth
76908cba94 [unroll] Use a small set to de-duplicate operands prior to putting them
into the worklist. This avoids allocating lots of worklist memory for
them when there are large numbers of repeated operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229052 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 03:48:38 +00:00
Chandler Carruth
595a99573a [unroll] Make the unroll cost analysis terminate deterministically and
reasonably quickly.

I don't have a reduced test case, but for a version of FFMPEG, this
makes the loop unroller start finishing at all (after over 15 minutes of
running, it hadn't terminated for me, no idea if it was a true infloop
or just exponential work).

The key thing here is to check the DeadInstructions set when pulling
things off the worklist. Without this, we would re-walk the user list of
already dead instructions again and again and again. Consider phi nodes
with many, many operands and other patterns.

The other important aspect of this is that because we would keep
re-visiting instructions that were already known dead, we kept adding
their cost savings to this! This would cause our cost savings to be
*insanely* inflated from this.

While I was here, I also rotated the operand walk out of the worklist
loop to make the code easier to read. There is still work to be done to
minimize worklist traffic because we don't de-duplicate operands. This
means we may add the same instruction onto the worklist 1000s of times
if it shows up in 1000s of operansd to a PHI node for example.

Still, with this patch, the ffmpeg testcase I have finishes quickly and
I can't measure the runtime impact of the unroll analysis any more. I'll
probably try to do a few more cleanups to this code, but not sure how
much cleanup I can justify right now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229038 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 03:40:58 +00:00